CHAPTER 5:
ASSOCIATIVE PROCESSES
Introduction ◊ Classical Conditioning ◊ Operant Conditioning ◊ Experimental
Paradigms ◊ Evolved Dispositions ◊ Mechanisms ◊ Extinction ◊ Theories ◊
Neuroscience ◊ Chapter Summary
1. INTRODUCTION
Imagine what would happen in each of the following scenarios: 1) One hot
summer afternoon, you eat mounds of juicy watermelon that tastes delicious. Soon
afterwards, you enjoy a fine dinner of scallops and wine but either the scallops, the wine
or both make you ill so you spend most of the night vomiting. While you are recovering
the next day in the hot sun, a friend brings you a big bowl of watermelon pieces to
quench your thirst. How would you feel? 2) One evening as you are driving home from
work, a small animal darts in front of your car just as you are rounding a corner and
approaching a small bridge. To avoid the animal, you swerve to one side, knocking a
guard rail from the bridge and nearly descending into the riverbed. You pull over to calm
down and ensure that nothing is damaged, then continue on your way. The next evening
you are making the same trip and suddenly realize that you are at the same location. How
would you feel? 3) Every Friday afternoon, you and your friends go for a long, hard
bike ride. After the ride, you always meet at a local restaurant for a barbecue meal that
is well-deserved after so much exercise. One evening, the restaurant is particularly busy
so your meal is delayed. While sitting in the restaurant, you can smell the barbecue and
even see people at the next table enjoying their meal! How would you feel? Most people
can relate to these examples because they have some experience with each scenario, even
if the specific details differ. In general, people report feeling sick, anxious and
exceptionally hungry in these examples. From what we can surmise, other animals
experience the same sensations under similar circumstances; like humans, they have
formed associations between particular stimuli and particular events in their environment.
1.1 Background
The idea that associative processes have a role in learning dates back at least to
Aristotle (384-322 BCE), who formulated a set of principles describing how ideas or
thoughts are connected. These principles were developed further between the 17th and
19th centuries by the British Associationists, a group of philosophers expounding the view
that all knowledge is acquired through the senses and that these experiences are held
together by 'associations'. Uncovering the properties of these associations would explain
how knowledge is acquired. The German psychologist Hermann Ebbinghaus (1850-1909)
formally tested the laws put forth by the British Associationists, using himself as a
subject. He generated long lists of nonsense syllables and examined the factors that
improved his ability to remember these lists. To do so, Ebbinghaus manipulated a
number of variables including the length of the lists, the number of times he rehearsed
each list, the time between list exposure and the memory test, as well as the proximity of
a particular item to other items in the list. Ebbinghaus' conclusions, published in
Memory (1885), presented a series of formal laws to explain how memories are formed
through associative connections.
Combined with the influence of Pavlov's work and the rise of behaviorism, the
study of associative mechanisms flourished in North America during the first half of the
20th century. Much of this work was conducted by experimental psychologists examining
classical or operant conditioning in the lab. These tightly-controlled experiments allowed
researchers to identify causal factors of associative learning, many of which apply to both
vertebrates and invertebrates. During the same period, behavioral ecologists, based
largely in Europe, were studying animals in their natural environment as a means to
understand the evolutionary function of associative mechanisms. By the 1960s, there was
a growing realization that these two approaches are complementary (Hollis, 1997). Most
contemporary researchers, therefore, consider both proximate and ultimate explanations
of associative learning, even if they continue to work only in one tradition.
1.2 Chapter Plan
This chapter focuses on the two most commonly studied forms of associative
learning: classical and operant conditioning. A brief overview of the historical
precedents for studying these two processes, both in the lab and in the natural
environment is presented. Historically, research into these two processes was conducted
primarily in the lab, although there are compelling and important field studies of both
classical and operant conditioning. In order to understand how the majority of this work
is conducted, the primary experimental paradigms in associative learning are described.
This is followed by a discussion of how an animal's evolutionary history impacts its
ability to form associations. The mechanisms of classical and operant conditioning, as
well as the phenomenon of extinction, are outlined as a means to understand how
associative learning operates. This leads to an overview of the theories of associative
learning. Finally, as with all cognitive processes, associative learning is represented by
changes in the central nervous system; there is extensive evidence pointing to specific
brain systems that control both classical and operant conditioning. Although the
evidence is not complete, these provide working models for future investigations of
associative mechanisms.
2. CLASSICAL CONDITIONING
Every organism lives in an environment surrounded by hundreds, or even
thousands, of stimuli. Those with motivational significance to an animal will elicit a
behavioral response or, more accurately, a combination of responses. Motivationally-significant stimuli may be classified as positive (sometimes called appetitive) or aversive:
Appetitive stimuli are those that an animal will work to obtain such as food, water,
access to a sexual partner, etc.; aversive stimuli are those that an animal avoids such as
predators, nausea, painful stimuli, etc.
Classical conditioning is the process whereby stimuli that do not elicit a response
initially, acquire this ability through association with a motivationally-significant
stimulus. The systematic study of this phenomenon began with Pavlov's work on the
digestive physiology of dogs. Indeed, the two are so closely connected that the terms
classical and Pavlovian conditioning are often used interchangeably. Pavlov identified
four components in a classical conditioning paradigm: the unconditioned stimulus (US),
the unconditioned response (UR), the conditioned stimulus (CS) and the conditioned
response (CR). Prior to conditioning, the US elicits the UR; this is sometimes called the
unconditioned reflex. Following conditioning trials in which the CS is paired with the
US, the CR is elicited by the CS. This CS-CR connection is sometimes called the
conditioned reflex. In the case of Pavlov's experiments, a dog salivating to the sound of
a bell would be a conditioned reflex.
It should not be surprising that classical conditioning has been documented in
almost every species studied. If one stimulus reliably precedes the presentation of
another, organisms can use this information to predict changes in their environment.
The ability to learn these associations would significantly increase the survival advantage
of any animal. The obvious example is being able to predict where and when food will
be available, or where and when a predator will arrive. But even more complex
behaviors can be influenced by classical conditioning. For example, Hollis and
colleagues demonstrated that classical conditioning improves territorial aggression in
male gourami fish (Hollis et al., 1984). In this experiment, one group of fish learned that
a red light predicted the entry of a rival male to the area. In contrast to fish that did not
learn this association, the conditioned animals exhibited more bites and tailbeating,
almost always winning the fight and retaining their territory. At least in this instance, the
opportunity to predict the impending presence of a rival gave these males a significant
survival advantage.
Figure 5.1 Classical conditioning increases territorial aggression in male fish. One group of fish
(PAV) experienced conditioning trials in which a light predicted the entry of a rival male into the
territory. Another group experienced the same number of light presentations and rival entries, but
these were not explicitly paired (UNP). On a subsequent test, the PAV group showed more
aggressive displays to the light. Reprinted from Hollis (1984).
BOX 5.1 Psychoneuroimmunology
As with many scientific 'breakthroughs', the demonstration that immune responses can be
classically conditioned was a chance discovery. In the early 1970s, Ader and Cohen (1975) were
working with the immunosuppressant, cyclophosphamide. This drug has a wide range of clinical
applications (organ transplants, autoimmune disorders), but it produces a list of terrible side
effects including nausea. Ader wanted to find a solution to this problem so he allowed mice to
drink a flavored saccharin solution prior to an injection of cyclophosphamide. As expected, the
mice developed a conditioned taste aversion to the solution. In order to test the effects of various
anti-nausea agents, Ader force-fed the saccharin solution to these mice. The problem? The
animals kept dying. At the time, the explanation was not obvious but it is now clear that the
saccharin solution was suppressing immune responses in animals that had experienced the
saccharin-cyclophosphamide association. Ader demonstrated this experimentally by showing that
mice exposed to the flavored saccharin water (i.e., the CS) prior to an injection of foreign cells
developed fewer antibodies or a weaker immune response than did mice exposed to water (Ader
et al., 1990).
Subsequent work showed that classical conditioning can also increase immune system activity.
In these experiments, an odor is paired with an injection of the drug, interferon. Interferon
increases the activity of natural killer cells in the bloodstream, which help the body to fight off
infections, viruses and foreign cells. In both mice (Alvarez-Borda et al., 1995) and humans
(Buske-Kirschbaum et al., 1994), an odor CS paired with an interferon injection increased natural
killer cell activity, even in the absence of the drug.
From a rather serendipitous finding that classical conditioning affects immune responses,
psychoneuroimmunology developed and is now a thriving field.
3. OPERANT CONDITIONING
In classical conditioning, the association between the CS and the US is
independent of the animal's behavior. That is, a US is presented following the CS,
regardless of how the animal responds to the CS; Pavlov's dogs did not need to do
anything in order to receive the food after a bell was sounded. In contrast, the US is
presented in operant conditioning only if the animal performs a response. The
prototypical example of operant conditioning is a rat pressing a lever to receive a food
reward. But any behavior that produces a consequence, either in the lab or the natural
environment, can be described as an operant. This includes a cat pulling a rope to escape
from a puzzle box, a rodent swimming to a hidden platform in a water maze, a bird
pecking at a distinctly colored tree bark that provided a rich food source in the past, or an
animal avoiding a location in which a predator was encountered.
Thorndike, who pioneered the study of operant conditioning in the lab, noted that
animals tend to repeat behaviors that produce satisfying outcomes and refrain from
repeating those that lead to unsatisfying outcomes. This simple idea, called the Law of
Effect, asserts that behavior is controlled by its consequences. This causal relationship
between a response and its consequence can be expressed in one of four ways: 1. Positive
reinforcement describes a positive contingency between a response and a positive
outcome. Giving your dog a treat when he comes to your call, paying workers for
overtime hours, praising children for doing their chores or presenting a sugar pellet to a
rat when it presses a lever are all examples of positive reinforcement. 2. Punishment is a
positive contingency between a response and an aversive event. It includes shocking a
rat for stepping from one compartment to another, slapping your dog for chewing your
slippers, scolding a child for making a mess, or fining a driver for parking in a reserved
spot. 3. A negative contingency between a response and a negative outcome is called
negative reinforcement. If a response is emitted, an aversive event does not occur.
Thus, behaviors governed by negative reinforcement include not touching a hot element
that burned you in the past or avoiding an area where you previously encountered a
predator. 4. Omission training, sometimes called negative punishment, involves a
negative contingency between a response and a positive outcome. Grounding a child for
staying out too late or presenting a food pellet if a rat does not press a lever are both
examples of omission training.
Response             Outcome: Positive/Appetitive    Outcome: Negative/Aversive
Produces Outcome     Positive Reinforcement          Punishment
Prevents Outcome     Omission                        Negative Reinforcement
Table 5.1: The relationship between a response and its outcome in operant conditioning. An
operant response may produce or prevent an outcome that is either aversive or appetitive, leading
to four different reinforcement relationships.
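The four cells of Table 5.1 can be captured in a short sketch; the function name and boolean parameters below are illustrative, not terminology from the chapter.

```python
# Hypothetical sketch (names are illustrative, not from the chapter):
# classifying the four response-outcome relationships in Table 5.1.

def contingency_type(response_produces_outcome: bool,
                     outcome_is_appetitive: bool) -> str:
    """Return the operant term for a response-outcome relationship."""
    if response_produces_outcome:
        # Positive contingency: the response makes the outcome happen.
        return "positive reinforcement" if outcome_is_appetitive else "punishment"
    # Negative contingency: the response prevents the outcome.
    return "omission training" if outcome_is_appetitive else "negative reinforcement"

print(contingency_type(True, True))    # a lever press produces food
print(contingency_type(False, False))  # staying on the platform prevents shock
```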
Operant conditioning is an easy concept to grasp because most people believe that
their actions are governed by consequences. For example, you may study every evening
because you believe that there is an association between working hard and obtaining
good grades. Behaviors that are repeated frequently, however, often become automatic
or habitual. These so-called 'habits' are responses that are elicited by environmental
stimuli and are relatively insensitive to feedback. Perhaps if you study diligently each
evening, this behavior will become automatic, regardless of how well you perform on
tests and assignments.
4. EXPERIMENTAL PARADIGMS
In order to understand how scientists study associative processes, it is important
to be familiar with the paradigms that are used to measure classical and operant
conditioning. Historically, most laboratory studies used rodents or birds as subjects but
this has changed in the last two decades with the proliferation of molecular biology tools.
These advances have allowed researchers to uncover the genetics of associative learning
in invertebrates, a topic that will be discussed later in this chapter (see Section 9).
contrast, behavioral ecologists throughout the century actively pursued studies of
associative processes in a variety of species. The majority of these focused on classical
conditioning although operant conditioning clearly occurs at high rates in the natural
environment. Regardless of whether scientists are conducting laboratory or field-based
research, almost all studies of associative processes use some variation of the paradigms
described below.
4.1 Classical Conditioning
Conditioned Approach
Because Pavlov was interested in digestive physiology, his dogs had cannulae
attached to their salivary ducts that conducted drops of saliva to a data-recording device.
This allowed Pavlov to quantify the CR (i.e., drops of saliva), but it is an impractical and
unnecessary procedure for most classical conditioning experiments. A much simpler
way to measure appetitive conditioning is to examine an animal‟s tendency to approach
and contact a stimulus associated with reward (e.g., food). Brown and Jenkins (1968)
documented one of the first examples of this effect in pigeons that pecked a key light
predicting food presentation. Animals received the food regardless of their behavior so
the pecking was a classically conditioned response. A strikingly similar phenomenon
was observed only two years later by the ethologist, Harvey Croze (1970). He
demonstrated that carrion crows approached and explored empty mussel shells on the
beach (previously signaling no food) when a small piece of beef was placed under the
shell. Even when Croze made the reward more difficult to obtain by burying it in the
sand, the birds rapidly adjusted their foraging behavior to uncover the beef. This ability
to associate previously-neutral stimuli with food has obvious evolutionary significance
and probably explains the ease with which a conditioned approach response is acquired in
both lab pigeons and foraging crows.
Figure 5.2 Search patterns of wild carrion crows. During training, birds learned to associate a
food reward with mussel shells in the sand. During testing, an equal number of mussel and
cockle shells were placed along the beach but there were no food rewards under either. Mussel
and cockle shells differ in color, size and shape so birds can easily distinguish between them.
Top: Crows quickly located mussel shells and when they failed to uncover a food reward, they
spent a considerable amount of time digging in the sand around the shell. Bottom: Crows
approached a small number of cockle shells, turned over a few of these but did not dig in the sand
around the cockle shell. Reprinted from Croze (1970).
Experimental psychologists capitalize on this facility by measuring approach
responses to stimuli predicting natural rewards such as food, water, or sexual partners as
well as non-natural rewards such as drugs or artificial sweeteners. The latency to
approach the CS, the number of CS approaches within a given time period, and the time
spent in close proximity to the CS are all indices of classical conditioning. Conditioned
place and conditioned taste preference paradigms are common modifications of this
methodology. These tests measure consumption of a flavored food or time spent in a
distinctive environment previously associated with reward. Preference conditioning has
been documented in fish and invertebrate species such as marine snails (Aplysia), worms
(C. elegans) and fruit flies (Drosophila), making it a powerful tool to examine the role of
genetic factors in classical conditioning.
Conditioned Fear
Unlike conditioned approach paradigms, tests of conditioned fear measure learned
responses to aversive stimuli. The particular response that is produced (i.e., the CR) will
depend on the species being tested, and on the US that is used in conditioning. For
example, some animals emit warning signals, such as alarm calls or plumage displays,
which act to deter an attack or warn other animals of danger. If these responses are later
elicited by stimuli that were paired with the danger, we can infer that animals have
formed an association between the CS and US. Other animals will escape or run for
cover when they encounter a predator. Stimuli that signal the sight, sound or smell of
these predators will, over time, elicit conditioned escape responses that can be quantified
by an experimenter. Rodents and other small animals often bury stimuli that were
associated with pain or nausea. So, if they mistakenly touch a sharp object or eat
something that makes them ill, they will respond to these stimuli at a later time by
vigorously digging the dirt or bedding around them to cover the object. Importantly,
burying occurs at a later time when the US is no longer present (i.e., they do not touch the
object or taste the food again). This confirms that the burying is a response to the CS, not
an automatic response to the US itself.
Laboratory tests of conditioned fear often capitalize on the fact that frightened
animals tend to freeze. If a neutral stimulus is paired with a fearful stimulus such as a
loud noise or a mild shock that does not cause any injury, animals will freeze in response
to the neutral stimulus (now a CS). The strength of the CR is quantified by measuring the
period of immobility following the CS presentation. The conditioned suppression
paradigm is a variation on this measure in which rodents are initially trained to press a
lever for food. Once lever pressing is stable, classical conditioning trials are instituted in
which a CS (usually a tone or light) is paired with a shock. When animals freeze, they
stop lever pressing so the number of lever presses that occur during the CS, versus the
number of lever presses that occur during non-CS periods, is a measure of conditioned
fear. This suppression ratio is typically calculated as (lever presses during the CS) /
(lever presses during the CS + lever presses during an equal period of time preceding
the CS). Thus, a suppression ratio of 0.5 would indicate no reduction in responding and
no conditioning, whereas a suppression ratio of 0 would reflect complete freezing during
the CS and maximal conditioning.
Figure 5.3 Acquisition of a conditioned suppression response. The conditioned suppression ratio
is calculated as LP during CS / (LP during CS + LP during pre-CS). During the first two trials,
the CS does not suppress lever pressing; as trials progress, lever presses during the CS decline
and the suppression ratio is reduced. Assuming a steady state of responding during non-CS
periods, the data that would generate this curve could be as follows:
Trial 1: LP CS = 20; LP pre-CS=20; Ratio = 20/(20+20) = .5
Trial 2: LP CS = 19; LP pre-CS=20; Ratio = 19/(19+20) = .48
Trial 3: LP CS = 17; LP pre-CS=20; Ratio = 17/(17+20) = .45
Trial 4: LP CS = 11; LP pre-CS=20; Ratio = 11/(11+20) = .35
Trial 5: LP CS = 6; LP pre-CS=20; Ratio = 6/(6+20) = .23
Trial 6: LP CS = 2; LP pre-CS=20; Ratio = 2/(2+20) = .09
LP = lever presses
CS = conditioned stimulus
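The ratio arithmetic shown for Figure 5.3 can be reproduced in a few lines. This is a minimal sketch using the hypothetical trial values above; the function name is illustrative, and exact decimals depend on rounding.

```python
# Minimal sketch of the suppression-ratio calculation described in the text,
# using the hypothetical trial data from Figure 5.3.

def suppression_ratio(lp_cs: int, lp_pre_cs: int) -> float:
    """LP during CS / (LP during CS + LP during an equal pre-CS period)."""
    return lp_cs / (lp_cs + lp_pre_cs)

trials = [(20, 20), (19, 20), (17, 20), (11, 20), (6, 20), (2, 20)]
ratios = [suppression_ratio(cs, pre) for cs, pre in trials]
# Ratios fall from 0.5 (no suppression) toward 0 (complete freezing) as
# conditioning proceeds.
print([round(r, 2) for r in ratios])
```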
Conditioned Taste Aversion
Many of you will be able to name a particular food or drink that makes you
nauseous. This conditioned taste aversion (CTA) likely developed because you
consumed the food prior to experiencing some gastric illness, usually vomiting. It does
not matter whether the food actually caused your sickness; what matters is that you
formed an association between the stimulus properties of the food (smell and taste) and
your nausea. CTAs are
easily observed in many animals, both in the lab and in the natural environment. After
consuming a flavored substance, animals are made sick by injecting them with a nausea-producing agent, such as lithium chloride, or exposing them to low-level gamma
radiation. After they recover from the illness, animals are presented with the flavored
food. If animals eat less of this food than another food that was not associated with
illness, we conclude that they have developed a CTA. In some cases, the control
comparison is the amount of flavored food consumed by animals that were not made ill.
CTAs can be very powerful; they often develop with a single CS-US pairing and
are sustained for long periods of time. This should not be surprising as poison-avoidance
learning is critical for survival. Animals can become very sick or die if they consume
inedible items, so they must learn to avoid poisons and other toxins by associating the
smell and taste of the food with illness. This is one reason that it is so difficult to develop
effective poisons for rodents. After sampling a very small amount of the novel food, rats
and mice feel ill and avoid this food in the future.
Note that the measure of classical conditioning (the CR) varies across these three
types of paradigms. Given that there are dozens, or more accurately hundreds, of other
classical conditioning tests, the number of ways in which scientists can measure this
process is almost endless. In addition, for each CR, a researcher may choose to measure
how quickly it is acquired, how large the response is once conditioning has occurred, and
how long it lasts when the US is no longer present. This sometimes makes it tricky to
compare the magnitude of conditioning across studies. Researchers (and students reading
these studies) must pay close attention to the dependent measure in each study because
discrepancies in research findings can sometimes be explained by differences in how the
CR is assessed.
4.2 Operant Conditioning
Discrete Trials
The cats in Thorndike's experiments were required to pull a rope or move a stick
or push a board to escape from a Puzzle Box. Operant conditioning (or trial and error
learning as Thorndike called it) was evidenced by a decrease in the escape time over
trials. This discrete trials setup, in which subjects have the opportunity to make one
correct response for each time they are placed in a testing apparatus, is still common in
lab-based research. Maze experiments, such as the water maze, T-maze, or straight
alleyway, are all operant conditioning paradigms, although most of these tests are used to
assess other cognitive processes such as spatial learning or decision making. One of the
simplest discrete trials measures of operant conditioning is the conditioned avoidance
paradigm. A rat or other small animal is shocked for stepping off a platform to a grid
floor; longer latencies to step from the platform on subsequent trials indicate better
conditioning. Many people will note that this test sounds remarkably similar to the
conditioned escape or conditioned freezing paradigms described above. Certainly both
require animals to form associations between aversive events and the stimuli that predict
them. The main difference is that the presentation of the shock in the conditioned
avoidance paradigm depends on the animal's behavior: if they do not step off the
platform, they will not receive a shock. In classical conditioning, the US follows the CS
regardless of what the animal does. Obviously we cannot control how animals behave in
the natural environment, which is one reason that there are so few ecological studies of
operant conditioning.
Free Operant
When most students and researchers think about operant conditioning, they
imagine a rat pressing a lever for food in a small chamber. This setup was designed by
B.F. Skinner in the 1940s as a means to evaluate the on-going responses of his subjects
(usually pigeons or rats). Animals are placed in a chamber containing some
manipulandum, usually a lever, which can deliver a reward (also called a US). In
contrast to the discrete trials paradigms, free operant methods allow animals to respond
repeatedly (freely) once they are placed in the experimental chamber. One advantage of
this method is that the experimenter can observe variations in responding across time.
This information is represented on a cumulative response record. The vertical distance
(y-axis) on the graph represents the total number of responses in the session and the
distance along the x-axis indicates time. Thus, the cumulative record provides a visual
representation of when and how frequently the animal responds during a session.
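A cumulative record like those in Figure 5.4 is simple to construct from a list of response times; the data below are invented for illustration.

```python
# Sketch of how a cumulative response record is built: the y-value at any
# moment is the total number of responses made so far in the session.
# The response times are hypothetical, not data from the chapter.

response_times = [2.0, 3.5, 4.0, 9.1, 9.4, 9.8, 15.2]  # seconds into session

# Each point pairs a response time (x) with the running response count (y).
cumulative = [(t, i + 1) for i, t in enumerate(response_times)]
print(cumulative[-1])  # (15.2, 7): seven responses by 15.2 s
```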
Figure 5.4 Cumulative response records for rats responding for methamphetamine (c and d) or
food (e and f). Food reinforcement produced a steady rate of responding over time whereas
methamphetamine produced a step-like pattern in which a drug infusion (i.e., presentation of the
reinforcer) was followed by a pause in responding (c). When animals are responding for drug,
they load up at the beginning of the session, presumably to attain an optimal level of drug in their
system (c). This effect is not observed when animals are responding for food (e). Administration
of the glutamate antagonist MTEP (d and f) led to a near-complete cessation of
methamphetamine-reinforced responding (d) but had no effect on responding for food (f). The
rapid lever presses that occur at the beginning of the methamphetamine session (d) indicate that
animals are 'expecting' a drug infusion and may be frustrated by the lack of drug effect. Animals
stopped responding for over an hour, tried the lever a few times, and stopped responding for
another 40 minutes. This pattern is consistent with the idea that MTEP is blocking the
reinforcing effect of methamphetamine. Reprinted from Gass et al. (2009).
In free operant paradigms, the relationship between responding and reinforcement
is described by the reinforcement schedule. This is a rule (set by the experimenter) that
determines how and when a response will be followed by a reinforcer. When every
response produces a reinforcer, the schedule is called continuous reinforcement or CRF.
More commonly, responding is reinforced on a partial or intermittent schedule. One of
the simplest ways to produce partial reinforcement is to require animals to make a certain
number of responses for each reinforcer. If the number is set, the schedule is called fixed
ratio (FR); if the number of required responses varies about a mean value, the schedule
is called variable ratio (VR). Thus, FR5, FR10 and FR50 schedules require animals to
make exactly 5, 10 and 50 responses for the reinforcer whereas VR5, VR10 and VR50
schedules require animals to make an average of 5, 10 and 50 responses. A progressive
ratio (PR) schedule is a variation of an FR schedule in which animals must make an
increasing number of responses for successive presentations of the reinforcer (Hodos,
1961). Typically, the schedule is set up so that animals make one, then two, then four,
then eight responses for the first four reinforcers. The PR values continue to increase
until animals stop responding altogether. This break point is a measure of motivation or
how hard animals will work for a single presentation of the reinforcer.
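As a sketch of how PR requirements grow: the text specifies 1, 2, 4, and 8 responses for the first four reinforcers; continuing to double after that is an assumption made for this example, since labs use various progressions.

```python
# Illustrative progressive-ratio (PR) sequence. The 1, 2, 4, 8 start comes
# from the text; doubling beyond the fourth reinforcer is an assumption.

def pr_requirements(n_reinforcers: int) -> list[int]:
    """Responses required for each successive reinforcer under this PR rule."""
    return [2 ** i for i in range(n_reinforcers)]

print(pr_requirements(6))  # [1, 2, 4, 8, 16, 32]
# The last ratio an animal completes before it stops responding is its
# break point, the measure of motivation described above.
```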
In contrast to ratio schedules, interval schedules provide reinforcement if a
response occurs after a certain period of time. Under fixed interval (FI) schedules, the
time from the presentation of one reinforcer to the possibility of receiving the next is
constant. Under variable interval (VI) schedules, responding is reinforced after an
average time interval has passed. For example, animals responding under an FI-15s
schedule would be reinforced for the first response they make 15 seconds after the last
reinforcer. Under a VI-15s schedule, reinforcement would be available on average 15
seconds after the delivery of the last reinforcer. Note that animals must still respond
under interval schedules, but they are only reinforced for responses that occur after the
interval has elapsed. In Chapter 9, you will read more about how these techniques have
been used in very clever examinations of the timing and counting abilities of animals.
Situations in which a reinforcer is presented at specified intervals, regardless of the
animals' behavior, are called time schedules.
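The ratio and interval rules above can be sketched as simple schedule objects; the class names and interface are illustrative, not standard laboratory software.

```python
# Hedged sketch contrasting the two schedule families described in the text.

class FixedRatio:
    """FR-n: every nth response is reinforced."""
    def __init__(self, n: int):
        self.n = n
        self.count = 0

    def respond(self) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True   # this response earns the reinforcer
        return False

class FixedInterval:
    """FI-t: the first response made t seconds after the last reinforcer is reinforced."""
    def __init__(self, interval_s: float):
        self.interval = interval_s
        self.last_reinforcer = 0.0

    def respond(self, t: float) -> bool:
        if t - self.last_reinforcer >= self.interval:
            self.last_reinforcer = t
            return True
        return False      # responses during the interval go unreinforced

fr5 = FixedRatio(5)
print([fr5.respond() for _ in range(10)])  # reinforcers on the 5th and 10th responses
```

Note that the interval schedule still requires a response: time elapsing alone delivers nothing, which is what distinguishes it from the time schedules mentioned above.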
Schedules of reinforcement induce different patterns of responding, suggesting
that animals have some knowledge of the payoffs provided by each schedule. For
example, acquisition is usually more rapid under CRF than partial reinforcement
schedules, and responding declines more quickly when the reinforcer is removed. This
should not be surprising. If an organism is always reinforced for a particular behavior,
it will be easier to associate the behavior with the outcome, and it will stop responding
when the reinforcer is removed. If the behavior is reinforced only part of the time, it may
take longer to learn the association and the animal could develop a strategy to 'keep trying'
to receive the reinforcer.
Even under partial reinforcement schedules, different patterns of responding
emerge. FR schedules are characterized by high and steady rates of responding up to the
presentation of the reinforcer. Responding ceases after reinforcement is delivered and
this post-reinforcement pause increases with the size of the ratio. With very high FR
ratios, animals may stop responding altogether. FI schedules also induce post-reinforcement
pauses that vary directly with the length of the interval, presumably
because animals learn that the reinforcer will not be available for a certain period of time.
Responding recovers close to the completion of the interval with the rate increasing
exponentially until the next reinforcer is obtained. This produces a 'scallop' in
responding, as opposed to the relatively steady rate of responding that occurs under FR
schedules. Not surprisingly, variable schedules of reinforcement produce less predictable
patterns of responding than do fixed schedules. In general, response rates are steady
under variable schedules, reflecting the fact that animals do not know when the next
reinforcer will be delivered. Rapid bursts of responding often occur because animals
have experienced occasions in which a few fast responses produce several reinforcers in a
row. When the reinforcement payoff is low (i.e., high VR or VI schedules), response
rates are reduced, responding becomes more sporadic and pauses occur at irregular
intervals. This makes sense in terms of what the animal has learned: reinforcement is
infrequent and unpredictable.
Figure 5.5 Cumulative response records under different schedules of reinforcement. The delivery
of a reinforcer is marked by a vertical slash. Different schedules of reinforcement elicit different
patterns of responding and, not surprisingly, FI responding in humans is very accurate if they can
use clocks to time the intervals (bottom line to right of figure). Redrawn from Baldwin and
Baldwin (2001).
VR = variable ratio; VI = variable interval; FR = fixed ratio; FI = fixed interval
Schedules of reinforcement are a convenient tool to elicit different patterns of
responding in the lab, but they have an important application to the real world. That is,
ratio and interval schedules are analogues of how different commodities are depleted and
replenished in the natural environment. Some food sources, such as prey for predator
animals, will disappear on an item-by-item basis if they are consumed, whereas organic
food sources replenish with time. These represent ratio and interval schedules
respectively. Animals that understand the payoffs provided by these sources will be
better equipped to survive in their environment because they are less likely to deplete
resources beyond the point that they can be replenished. Ironically, this appears to be
exactly what humans have not learned about the environment.
BOX 5.2 Superstitious Behavior
The defining feature of operant conditioning is that the presentation of the reinforcer depends on
the animal's response. But what if an animal thinks that its behavior is causing an outcome?
Will responding change accordingly? Skinner (1948) tested this idea by presenting non-contingent
food to pigeons at regular intervals. Over trials, the birds developed responses that
preceded the food delivery, but these differed across birds: one bird turned in counterclockwise
circles, one made pecking motions at the floor, two swung their heads back and forth, one raised
its head to one corner of the cage and another bobbed its head up and down. Skinner surmised
that each of these animals had been engaged in a particular behavior (i.e., head bobbing) when the
food became available. This signaled an operant contingency to the animal that was reinforced
on subsequent trials when the bird performed the response and the food was then delivered.
Skinner described these responses as superstitious behavior and noted that they are maintained
because animals form a response-outcome association, even though there is no causal relationship
between the two. Similar effects were observed in other animals (including humans) that were
tested in the lab but, of course, the story is not so simple in the real world. Not all coincidental
occurrences of a response and a reinforcer lead to superstitious behavior, so researchers began to
investigate the factors that produce this false belief (Gmelch & Felson, 1980; Vyse 1997). These
studies concluded that superstitious behaviors are maximized under the following conditions: 1)
an unusual and novel behavior precedes the presentation of a highly-valued reward; 2) the cost of
engaging in the behavior is very low but the payoff is potentially very high; and 3) the attainment
of the reward is not completely under the individual's control.
In humans, superstitious behaviors may be maintained because they create an illusion of control,
particularly in situations that induce high levels of stress and anxiety. This describes high-level
sports or artistic endeavors and both athletes and musicians often engage in specific rituals prior
to a competition or performance. These could include putting on a particular piece of clothing,
eating a particular food, or entering the theatre or arena in a particular way. Even if the individual
acknowledges that these rituals may not cause their good performance, they are reluctant to
abandon them. After all, there is no downside to wearing a particular piece of clothing. And
what if it works? Regardless of what the performers or athletes may think, these are superstitious
behaviors that are maintained for exactly the same reason that Skinner's pigeons continued to
bob their heads or peck at the floor before the food arrived.
5. EVOLVED DISPOSITIONS
Many psychologists in the first half of the 20th century adhered to the principle of
equipotentiality. This position assumed that associations between different stimuli,
responses and reinforcers could be formed with equal ease. For example, Pavlov noted
that any stimulus could be used as a CS in his experiments, and Skinner claimed that
animals could learn any operant response (as long as it was physically possible) for any
reinforcer. According to these scientists, the results of their experiments could be
generalized to all instances of associative learning (and many believed to all cases of
learning). We now recognize that this is not true: some associations are easy to learn and
others are not. For example, a CTA often develops following a single CS-US pairing
and, as many people are aware, may last for months or even years. In contrast, some
stimuli never elicit a response, even if they are paired with a US on hundreds or
thousands of occasions. The relative ease with which animals acquire certain
associations is referred to as evolved dispositions (Shettleworth, 1998), reflecting the
idea that learning an association between two stimuli has conferred some evolutionary
advantage on the species.
One of the most elegant examples of evolved dispositions comes from an
experiment by Garcia and Koelling (1966). In the first part of the experiment, rats were
presented with a drinking tube containing flavored water. Every time the rats licked the
tube, a brief audiovisual stimulus was presented that consisted of a clicking sound and a
flash of light. Animals were then either shocked or made ill by x-ray treatment. Thus, all
rats experienced the same CS (flavor plus light-sound combination) with half receiving a
shock US and half receiving the x-ray treatment that made them ill. In the subsequent
test, animals were presented with two licking tubes: one contained flavored water and
one contained plain water but triggered the light-sound cue when licked.
Figure 5.6 Experimental design for Garcia and Koelling's experiment showing evolved
dispositions. The CS was a compound stimulus consisting of a flavor and an audiovisual
stimulus. The US was either a shock or illness. During testing, responses to the two CSs (flavor
or audiovisual stimulus) were tested separately. Reprinted from Domjan (2003).
The group that was made ill avoided the flavored water and drank from the plain
water tube that activated the audiovisual cue. In contrast, the shocked group avoided the
tube linked to the audiovisual cue and drank the flavored water. The interpretation of
these results is that rats have a tendency to associate a flavor with illness and a
light-sound stimulus with shock. The same effect is observed in 1-day-old rat pups
(Gemberling & Domjan, 1982), suggesting that the disposition is innate, rather than
acquired through experience.
Figure 5.7 Results of Garcia and Koelling's experiment demonstrating evolved dispositions. The
bars indicate how much rats licked a drinking tube that contained a flavored solution (taste) or
produced an audiovisual stimulus. Rats that experienced the illness US licked the tube containing
the flavored solution less frequently, whereas those that experienced the shock US showed fewer
licks of the tube that produced the audiovisual stimulus. Reprinted from Domjan (2003).
In the natural environment, gustatory cues are a good predictor of dangerous food,
whereas audio-visual cues are a good predictor of environmental danger such as
predation. Animals who learned the predictive significance of these stimuli would be
more likely to survive and reproduce, explaining why contemporary animals acquire
these associations so easily. Evolved dispositions also account for cross-species
differences in CTA learning. Rodents easily acquire flavour-illness, but not colour-illness,
associations, whereas quail develop both. Unlike rats and mice, quail rely heavily
on vision when searching for food. This also explains the relative ease with which birds
associate visual, but not auditory, cues with a food reward.
Thorndike was ahead of his time in recognizing the importance of evolved
dispositions in operant conditioning. He tested whether cats could be trained to yawn or
scratch themselves in order to escape from a Puzzle Box; when they failed to learn, he
concluded that these associations did not 'belong' together in the animal's evolutionary
history. This
foreshadowed many unsuccessful attempts to train animals in operant paradigms. Such
examples include Herschberger's attempt to train chicks to run away from a food bowl to
receive a reward, Bolles' attempts to train rats to stand on their hind legs to avoid a
shock, and the Brelands' difficulty in training animals to perform circus tricks if the
trained response was incompatible with the animal's natural behavior. All of these
researchers discussed their negative findings in the context of evolution: operant
responses that are contrary to adaptive behaviors will be difficult to acquire. Given that
approaching food-related stimuli and escaping from shock would confer some
evolutionary advantage, trying to condition animals against these dispositions is
incredibly difficult.
The corollary is that response-reinforcer associations that enhance a species'
survival are easily acquired. Sevenster (1973) demonstrated this principle in male
stickleback fish that were trained to bite a rod or swim through a ring to gain access to
another fish. When males could gain access to another male, the biting increased but
swimming through a ring did not. The opposite occurred with access to a female:
swimming through a ring increased, but biting the rod did not. The finding that access to
another male fish is an effective reinforcer for biting, and that access to a female is an
effective reinforcer for ring swimming, fits with the animal's evolutionary history. Biting
is a component of the aggressive behavior that occurs when a resident male encounters an
intruder, whereas swimming through a ring is more characteristic of the swimming
patterns during fish courting behavior.
Evolved dispositions may also explain some irrational fears in humans. People
readily acquire exaggerated fear responses to stimuli associated with threats in the natural
environment, and many phobias likely stem from an evolved disposition to fear objects or
situations that were dangerous to our ancestors. This explains why humans and monkeys
associate pictures of snakes and spiders with shock more readily than pictures of flowers
and houses (Ohman, Dimberg & Ost, 1985). Conversely, although cars pose a very real and
immediate danger, children have great difficulty learning not to run in front of them. The
same children will often refuse to enter a dark room on their own, even if the room is
familiar and they receive repeated assurances that no monsters are hiding in the dark.
By the end of the 20th century, most scientists accepted that evolved dispositions
affect associative learning. Many researchers still focus exclusively on proximate causes
of classical or operant conditioning, but their research questions are often framed within
an evolutionary perspective. For example, one may investigate which brain regions
mediate associative learning, and then ask why these neural systems are preserved across
evolution. If nothing else, researchers can capitalize on the fact that it is much easier to
train animals in classical or operant tasks when one works with, rather than against,
evolved dispositions.
6. MECHANISMS
Evolved dispositions provide a functional explanation of associative learning in
that being able to predict when one stimulus follows another or which outcome will
follow a response should help animals to survive and reproduce. In contrast, proximate
explanations describe associative learning in terms of the causal factors that produce
optimal conditioning. These include physiological mechanisms underlying associative
learning, a topic that will be discussed in Section 9 of this chapter. Many other proximate
factors of associative learning have been identified. Of these, predictiveness, temporal
contiguity, and stimulus salience are the most important.
6.1 Predictiveness
It seems intuitive that the more frequently two stimuli are paired, the stronger will
be the association between them. Contiguity alone, however, does not produce
associative learning. The CS must reliably predict the US in classical conditioning, and
the response must reliably produce the reinforcer in operant conditioning, or conditioning
will not occur. The realization that predictiveness is an important determinant in classical
conditioning came about following a classic experiment by Leon Kamin (1969). In this
study, one group of rats underwent a classical conditioning protocol in which a CS (tone)
preceded a shock (US). As expected, a freezing response (CR) developed to the tone.
Following these trials, a light was presented at the same time as the tone and both were
followed by the US. Stimuli presented simultaneously are called compound stimuli and are
labeled individually as CS1 (tone), CS2 (light), etc. A separate group of control rats
experienced CS-US pairings with the compound stimulus (tone and light combined) but
had no prior exposure to either stimulus alone.
Phase                    Blocking Group     Control Group
simple conditioning      CS1 – US           no treatment
compound conditioning    CS2+CS1 – US       CS2+CS1 – US
test                     CS2?               CS2?
Figure 5.8 Experimental design for Kamin's blocking experiment. The blocking group
experienced simple classical conditioning trials followed by compound conditioning trials. The
control group experienced the compound conditioning trials only.
CS=conditioned stimulus
US=unconditioned stimulus.
In the test session, Kamin compared CRs to the light in the experimental and
control groups. Note that all animals experienced exactly the same number of trials in
which the light preceded the shock. Thus, if the frequency of CS-US pairings alone
determines classical conditioning, the CR should be the same in control and experimental
groups. Of course, this is not what happened. Whereas the control group exhibited
robust freezing to the light, the experimental group did not. In Kamin's terms, the initial
tone-shock pairings blocked subsequent conditioning to the light. He discussed his
findings in terms of informativeness, noting that the animals' previous experience with
the tone made the light irrelevant as a predictor of the US. Thus, in a blocking
experiment, CS2 conveys no new information about the occurrence of the US so
conditioning does not occur.
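Kamin's informativeness account was later formalized in the Rescorla–Wagner model (a standard result in the learning literature, though not developed in this chapter), in which all cues present on a trial share a fixed amount of associative strength. The sketch below is a minimal Python illustration; the learning-rate and asymptote values are arbitrary choices made for this example:

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Rescorla-Wagner rule: on each reinforced trial, every cue
    present gains dV = alpha * (lam - summed prediction of all cues)."""
    V = {}
    for cues in trials:
        total = sum(V.get(c, 0.0) for c in cues)   # combined prediction
        for c in cues:
            V[c] = V.get(c, 0.0) + alpha * (lam - total)
    return V

# Blocking group: tone-alone trials first, then tone+light compound trials.
blocking = rescorla_wagner([("tone",)] * 10 + [("tone", "light")] * 10)
# Control group: compound trials only.
control = rescorla_wagner([("tone", "light")] * 10)

print(round(blocking["light"], 2))  # 0.01 -- conditioning to the light is blocked
print(round(control["light"], 2))   # 0.5  -- the light acquires a strong association
```

Because the tone already predicts the US almost perfectly by the time the compound trials begin, the prediction error is near zero and the light gains almost no associative strength, reproducing Kamin's result.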
Figure 5.9 Results of Kamin's (1969) blocking experiment. Conditioning to CS2 was measured
in a conditioned suppression experiment, with the blocking group showing almost no suppression
of responding (i.e., no freezing to the light). In contrast, the control group showed marked
conditioned suppression indicating that, unlike the blocking group, they had formed a CS2-US
association.
CER = conditioned emotional response
Kamin's experiment has been replicated hundreds (perhaps thousands) of times
with different stimuli and different organisms, confirming that the frequency of CS-US
pairings cannot, in itself, explain classical conditioning. Interestingly, the phenomenon
may be limited to vertebrates as insects (at least bees) do not exhibit blocking (Bitterman,
1996) despite the fact that classical conditioning is acquired easily by these animals.
Another way to reduce the predictive value of a CS is to present it alone, prior to
any CS-US pairings. This describes the phenomenon of latent inhibition in which
previous exposure to the CS, in the absence of the US, retards subsequent conditioning to
the CS. One can think of latent inhibition as habituation to a novel stimulus (see Section
X, Chapter 2). Organisms first learn that the CS has no motivational significance; they
must then inhibit or overcome this information when the CS is presented with the US at a
later time. Both blocking and latent inhibition fulfill an important biological function in
that they limit cognitive processing of stimuli that are meaningless to the organism.
6.2 Temporal Contiguity
It seems intuitive that it would be easier to form an association between two
stimuli if they occur close together in time. Many laboratory experiments confirmed this
assumption: a US that immediately follows a CS and a response that immediately
produces a reinforcer induce robust conditioning. Researchers went on to demonstrate
that classical conditioning is reduced when the US is delayed because stimuli present
during the intervening interval become better predictors of the US. The same appears to
be true in operant conditioning. Animals can learn an operant response for a delayed
reinforcer but the longer the interval to the reinforcer, the more likely it is that animals
will form associations between other stimuli and the reinforcer (Dickinson, 1980). This
serves to weaken the response-reinforcer association by making the response less
predictive of the outcome.
The problem with this general principle of temporal contiguity is that it does not
apply to all cases of associative learning. CTA learning is the notable exception. CTAs
develop with very long CS-US (taste-nausea) intervals, up to a few hours in rodents and
even longer in humans. Any ill effects related to eating would occur after the food is
digested, absorbed in the bloodstream and distributed to bodily tissues. Organisms must
be able to retain information about the stimulus properties of food over a long interval if
they are to later avoid food that made them sick. This is a prime example of how
ultimate explanations (evolved dispositions) influence proximate explanations (temporal
contiguity) of associative learning.
Even within the same paradigm, changing the CS-US interval alters how animals
respond. Timberlake (1984) demonstrated this effect in a classical conditioning
experiment with rats. When a light CS predicted a food reward at very short intervals
(less than 2 seconds), rats developed a CR of handling and gnawing the CS. When the
US occurred more than 5 seconds after the CS, rats developed a conditioned foraging
response, as if they were searching for the food. The fact that a different
CR developed under these two conditions is evidence that rats were learning the
predictive temporal relationship between the CS and the US.
The relationship between temporal contiguity and the development of conditioned
responding provides further evidence for the biological relevance of associative learning.
If animals use associations to make predictive inferences about their world, then the
temporal intervals that conform to CS-US or response-reinforcer relationships in the
natural environment should produce the best conditioning.
6.3 Stimulus Salience
Even if one controls for temporal contiguity and predictiveness, the rate and
magnitude of conditioning to different stimuli may vary. This is true in both the lab and
the natural environment, but is illustrated most effectively by thinking about a typical
classical conditioning experiment. The arrival of the experimenter (often at the same
time each day), the placement of the animal in a testing apparatus, the sound of
automated equipment starting an experiment, as well as other extraneous cues may all
become effective CSs because they ultimately predict the presentation of the US.
Researchers attempt to control for these confounds but, even in tightly-controlled lab
studies, it is difficult to eliminate cues that are not explicitly part of the experimental
design. The issue is even more complicated in the natural environment where almost
every US is preceded by a cluster of CSs. The technical term for this phenomenon is
overshadowing: one stimulus acquires better conditioning than other stimuli in the
environment, even if they are equal predictors of the US. In blocking, a stronger
association with one CS develops because it was presented first, whereas in
overshadowing a stronger association develops to a CS because it is more salient.
The most obvious explanation for overshadowing is that animals notice or pay
attention to one stimulus at the expense of the others. This is often described as stimulus
salience, which is the likelihood that a stimulus will be attended to. Salience is often
equated with the conspicuousness of a stimulus, or how well it stands out from the other
background stimuli. In general, salience increases with the intensity of the stimulus:
brighter lights, louder noises or stronger smells attract more attention. It is important to
remember, however, that salience is not a fixed property of the stimulus. As we learned
in Chapter 3, the ability of an organism to attend to a particular stimulus depends on the
organism's sensory system and on perceptual processing. The salience of a stimulus can
also be increased by altering the motivational state of the animal. Food-related cues, such
as cooking aromas, are far more salient when you are hungry than when you are sated!
Not surprisingly, one of the best ways to increase the salience of a stimulus is to make it
similar to cues that animals encounter in their natural environment. Male quail will
develop a conditioned sexual response to a CS, such as a light or terrycloth object, that
predicts access to a female quail (US). If this arbitrary cue is made more realistic by
having it resemble a female quail, more vigorous responding develops to the CS (Cusato
& Domjan, 1998).
Figure 5.10 Naturalistic and artificial stimuli in sexual conditioning experiments. The stimulus
on the left is made of terrycloth and only resembles the general shape of a female quail. The
stimulus on the right was prepared with head and neck feathers from a taxidermically prepared
bird. Reprinted from Cusato & Domjan (1998).
Stimulus salience impacts operant conditioning in the same way that it affects
classical conditioning. Animals are much quicker to acquire responses to stimuli that are
salient, and responding is increased when the stimuli have biological significance to the
animal.
7. EXTINCTION
In both classical and operant conditioning, responses will decline if the US is no
longer presented. For example, birds will eventually stop approaching shells that covered
a food reward if the shells are now empty, and rats will stop lever pressing for food if the
food is withheld. This gradual reduction in responding, with removal of the US, is called
extinction. It may seem intuitive that animals forget or erase the association that was
acquired during conditioning, but we know this is not how extinction occurs. For one
thing, a CR will reappear if a delay period follows extinction, even when the organism
has no further experience with the US. This spontaneous recovery indicates that the
original association is still available to the animal. Second, if a novel stimulus is
presented during extinction, the response rapidly recovers. For example, a dog will
develop a conditioned salivation response to a bell that predicts food; when the bell is
presented without the food, salivation declines. If a new stimulus, such as a light, is then
presented with the bell, the dog will salivate again. The same phenomenon occurs in
operant conditioning when rats re-initiate lever pressing in the presence of a loud noise.
In some cases, the renewed response may be as robust as the pre-extinction response.
This phenomenon is called disinhibition to reflect the idea that the novel stimulus is
disrupting a process that actively inhibits the original association (CS-US in classical
conditioning and response-US in operant conditioning). Third, extinction is context
specific in that responding is inhibited only in the environment in which extinction trials
occurred. In the laboratory, the environmental context can be altered by changing the
flooring, the wall patterns, adding odors, etc. If extinction trials are conducted in a new
context, the response declines but then re-emerges when the CS is presented in the
original context. This response renewal (sometimes called reacquisition) is another
piece of evidence that CS-US associations are inhibited, not eliminated, during
extinction. In other words, extinction is not simply forgetting.
[Figure: response strength plotted across trials, with successive phases labeled
acquisition, extinction, spontaneous recovery, a second extinction, and reacquisition.]
Figure 5.11 Hypothetical data showing changes in responding under reinforcement (acquisition)
and extinction conditions. Note that these changes could equally apply to operant or classical
conditioning.
8. THEORIES
Few people would disagree that classical and operant conditioning involve
associative learning. Nonetheless, there was considerable debate in the first half of the
20th century over what associations underlie the changes in behavior. Some researchers
argued that animals form stimulus-stimulus (S-S) associations during conditioning, others
that they learn stimulus-response (S-R) associations, and still others that responseoutcome (R-O) associations control responding. A number of very clever experiments
were designed to tease apart these hypotheses but, as frequently occurs with such issues,
the answer lies somewhere in between.
8.1 What Associations are Formed in Classical Conditioning?
Pavlov was the first to argue that animals learn stimulus-stimulus associations, an
effect he called stimulus substitution. According to this position, a connection is formed
between the CS and the US such that the CS becomes a substitute for the US.
Conditioned responding develops because the CS elicits a representation of the US, to
which the animal responds. One of the best pieces of evidence that animals can form
stimulus-stimulus associations comes from sensory preconditioning experiments. In
this experimental set-up, two neutral stimuli (CS1 and CS2) are presented together with
no US. At this point, both stimuli are motivationally neutral so no observable response
(CR or UR) is elicited. Then, CS2 is followed by a US in standard classical conditioning
trials. After a CR is established, CS1 is presented alone. If CS1 elicits a CR, animals
must have formed a stimulus-stimulus association (CS1-CS2) prior to conditioning trials.
Phase               Protocol     Stimuli         Response
pre-conditioning    CS1 – CS2    light - tone    nothing
conditioning        CS2 – US     tone - food     salivation
test                CS1?         light?          reduced salivation
Figure 5.12 Experimental design for a sensory preconditioning study. If the light elicits a
salivation response during the test, it indicates that an association was formed between the light
and the tone during pre-conditioning trials.
Devaluation experiments also support S-S accounts of classical conditioning.
The logic behind a devaluation experiment is as follows: if the CS is associated with the
US, changing the value of the US should change responding to the CS. Devaluation is
examined empirically using a 3-stage experiment. First, the animal experiences CS-US
pairings (e.g., tone followed by food) until a CR develops (e.g., salivation); second, the
US is devalued (e.g., food followed by sickness); and third, the CS is presented in the
absence of the US (e.g., tone with no food). Note that the animals have never
experienced the tone and sickness together. Thus, if the CR (e.g., salivation) is reduced,
compared to a group that did not experience the devaluation, one can conclude that they
learned an S-S association during conditioning trials.
Sensory preconditioning and devaluation effects have been demonstrated in
hundreds of experiments using dozens of different species. Despite this evidence, S-S
theories of classical conditioning were not accepted by all researchers. Many noted that
these theories make no prediction about what CR will develop across conditioning. Each
US may elicit a variety of hormonal, emotional and behavioral responses, so a theory that
does not anticipate which of these will become conditioned responses has limited utility.
S-R theories deal with this uncertainty by proposing that classical conditioning reflects
the development of CS-UR associations. This implies that the CS should elicit a response
that mimics the UR. Compelling evidence for this position was provided by Jenkins and
Moore (1973) who trained pigeons to associate a light with the presentation of either food
or water. Pigeons developed a classically conditioned response of pecking the light but
the characteristics of the peck varied with the US. A water US produced a CR that was a
slow pecking, with the beak closed and often accompanied by swallowing. In contrast, a
food US produced a CR that was a sharp, vigorous peck with the beak open at the
moment of contact, as if the pigeons were pecking at grains of food. It seemed that the
animals were attempting to 'drink' or 'eat' the CS, seemingly confirming the idea that the
CR is a reduced version of the UR.
Figure 5.13 Pigeons pecking a key that predicted a reward. The pigeon on the left was trained
with a water US and the pigeon on the right with a food US. The water-trained pigeon pecks at a
slow rate with its beak closed and swallows frequently, responses that mimic drinking. The
food-trained pigeon pecks with an open beak at a more rapid rate, as if it is attempting to eat the light.
Reprinted from Jenkins and Moore (1973).
Although other data seemingly confirmed that S-R associations are important in
classical conditioning, once again, contradictions arose. The most problematic was
evidence that the CR does not always mimic the UR. Indeed, in some cases, the CR is
opposite to the UR. For example, a mild shock increases heart rate, whereas a CS that
predicts the shock decreases heart rate (Hilgard, 1936). These data also cause problems
for S-S theories of classical conditioning because they are not consistent with the idea
that the CS becomes a substitute for the US.
Because of these problems, a third group of theories was developed that
explained classical conditioning as adaptive responses to the upcoming US. These
preparatory theories argued that the CR mimics the UR when this is the best preparation
for the US (e.g., eyeblink to a puff of air or salivation to food). When the more adaptive
response is to counter the US, the CR and UR will be in opposite directions (as in the
heart rate example above). Preparatory theories provide a functional explanation of
associative learning by suggesting that classical conditioning evolved to help organisms
prepare for the appearance of motivationally-significant events.
Preparatory theories provide an adequate explanation for many classically-conditioned responses, suggesting the existence of CR-US associations. But this does not
mean that animals don’t learn CS-US or CS-UR associations during conditioning. In all
likelihood, they learn all three. Each of these associations provides different information
to the animal about the relationships in their environment and animals that can use this
information adaptively are more likely to survive and reproduce. One way to
conceptualize the relationship between these three associations is to suggest that
cognitive representations of CS-US associations develop during conditioning, and that
this knowledge is translated into behavior through CS-UR and CR-US associations.
BOX 5.3 Preparatory Responses and Drug Tolerance
Drug tolerance occurs when higher and higher doses of a drug are required to get the same effect.
Tolerance develops rapidly to many opioid effects so some drug addicts regularly inject a dose of
heroin that is 30 or 40 times higher than a dose that would kill most people. Drug tolerance cannot
be explained entirely by the pharmacological properties of the drug, because animals and
humans with the same level of drug exposure exhibit very different levels of tolerance. One
suggestion is that cues associated with the drug act as a CS that elicits preparatory responses to
the drug (Siegel, 1983). Tolerance develops because the association between these stimuli and
the injection becomes stronger and the compensatory mechanisms become better at countering the
effects of the drug. One prediction of this theory is that tolerance should be stronger when the
drug injection always occurs in the same environment (i.e., the same cues reliably predict the
injection).
Siegel and colleagues tested this hypothesis in rats that were repeatedly injected with heroin in
one of two distinct environments. A control group of rats received sucrose/water injections. The
following day, the dose of heroin was doubled for all animals. One group was injected in the
same environment in which they had received the original injections and one group in a different
environment. The dependent measure was the number of overdoses in each group.
Fig. 5.14. Results of Siegel's experiment on heroin tolerance. Only 32% of the animals died
when they were injected with the higher dose in the same room as they received the original
injections; twice as many animals died when they were injected in a different room. Almost all of
the control animals were killed by this larger dose (Siegel et al., 1982). The explanation for this
phenomenon is that the CSs in the 'same room' environment induced compensatory responses
that opposed the drug effects. These findings help to explain why drug tolerance is context
dependent and why drug addicts often overdose in new environments, even when they administer
a dose that is similar to the amount that they take regularly. If the cues that signal the injection
are not present, preparatory responses will not be set in motion to reduce the lethal effects of the
drug.
8.2 What Associations are Formed in Operant Conditioning?
There are three fundamental elements in operant conditioning: the response (R),
the reinforcer or outcome (O) and the stimuli (S) which signal when the outcome will be
available. (Reinforcer and outcome may be used interchangeably; to avoid confusion
between R for response and R for reinforcer, we use O for outcome.) As with classical
conditioning, researchers in the 20th century argued about how these elements may be
associated and, more importantly, which are responsible for the development of operant
conditioning.
Thorndike was the first S-R theorist, arguing that cats in his puzzle boxes formed
associations between stimuli in the box and the operant response that led to the escape.
The outcome (in this case escape) increased subsequent responding because it
strengthened the S-R association. Thorndike formulated this principle into a Law of
Effect which stated that “if a response in the presence of a stimulus is followed by a
satisfying event, the association between the stimulus and the response is strengthened”
(Thorndike, 1911). Later researchers, most notably Hull (1930), used the term habit
learning to describe the tendency to perform a particular response in the presence of a
particular stimulus. According to him, the strength of the habit was a function of the
number of times that the S-R sequence was followed by a reinforcer. As habit strength
increased, the probability that a subject would perform the given response in the presence
of the appropriate stimulus also increased.
Tolman was one of the strongest critics of Hull's work, arguing that S-R theories
turned animals (including humans) into automata, with no understanding of how their
behavior changed the environment. His view of operant conditioning was that animals
form associations between their response and the outcome that it produces. This is
nothing more complicated than saying you understand what will happen when you do
something. The problem for scientists during Tolman's time was that R-O theories like
his require animals to have mental representations, both of their response and of the goal
that they wish to achieve, a position that many were reluctant to adopt.
A resolution to the S-R versus R-O debate in operant conditioning is provided by
devaluation experiments. Recall the devaluation protocol in classical conditioning (see
Section 8.1, above). The same procedure is used in operant conditioning with the
outcome being devalued independently of the operant contingency. In the example
below, animals learn to lever press for food, the food is associated with illness and lever
pressing responses are tested at a later time.
Phase          Devaluation Group        Control Group
Training       response → outcome       response → outcome
Devaluation    outcome → aversion       nothing
Test           response?                response?
Figure 5.15 Experimental design for a devaluation study. Animals in both the devaluation and
control groups are trained to make an operant response for a reinforcer (typically food). After the
response is acquired, the outcome is associated with an aversive event (i.e., illness) only for the
devaluation group. In a subsequent test, if the responses of the devaluation group are reduced
compared to the control group, we infer that the animals had formed an association between the
response and the outcome during the initial training sessions.
Dickinson and Adams (1981) were the first to show that animals undergoing this
devaluation procedure, but not their paired controls, show lower rates of lever pressing
during the test. This reduction in operant responding is an indication that animals formed
R-O associations during training. In other words, they had formed an association
between what they did and what happened. An interesting twist to the finding is that
devaluation is ineffective if animals are very well trained. This suggests that extended
operant training produces habit learning that is insensitive to changes in the outcome.
This fits with our current ideas of habits; these are automatic responses to environmental
stimuli, a phenomenon that many people experience when they perform the same action
again and again. Think about how you get to class every day. If you walk or ride your
bike, you probably take the same route and could arrive at your destination before you
realize it. The expression 'on automatic pilot' fits this behavior. If one day you plan to
stop by the bank machine on your way to class, you may be so accustomed to your usual
routine that you forget to detour to the bank machine and arrive at class… without
money.
The transition from R-O to S-R systems in operant conditioning is an example of
how cognitive processing helps animals adjust to changing environments. When animals
(including humans) initially learn a task, they must attend to the consequences of their
action so that they can modify their responses accordingly. If the environment remains
relatively stable and responses consistently produce the same outcome, habitual
responding takes over. In this way, animals can cope with both predicted and
unpredicted events in their environment.
8.3 Rescorla-Wagner Model
Rather than focus on defining which associations are acquired during classical or
operant conditioning, a number of researchers began to ask how animals code the logical
relationship between events in their environment. One of the most influential of these
theories is the Rescorla-Wagner model (Rescorla & Wagner, 1972), formulated to explain classical
conditioning and the phenomenon of blocking. Recall that, in a blocking experiment, the
stimulus added during compound conditioning trials (CS2) does not elicit a CR because it
does not provide any new information about the arrival of the US (see Section 6.1 above).
CS1 already predicts the US, so adding CS2 is redundant. As Kamin noted in his original
experiment, if the US presentation is not surprising, then no new learning will take place.
Rescorla and Wagner formalized this principle in a mathematical equation that presents
classical conditioning as an adjustment between expectations and occurrences of the US:
learning ceases when there is no discrepancy between the two.
In order to understand the details of the Rescorla-Wagner formula, it is important
to be familiar with some basic concepts of the model. First, this is an acquisition-based
model in that it describes changes in conditioning on a trial-by-trial basis. If the strength
of the US is larger than expected on any given trial, all CSs on that trial will be excitatory
meaning that they will increase the CR. The larger the difference between the expected
and actual strength of the US, the larger will be the increase in conditioning. On the
other hand, if the US strength is less than expected (think about extinction), all CSs
associated with the US will be inhibitory. This means that the CS will inhibit or reduce
the CR. Second, increases in the salience of the CS will increase conditioning. This
principle explains overshadowing. When stimuli are presented in a compound,
conditioning to the weakest or least noticeable stimulus is minimized even if that
stimulus can elicit a CR when it is paired with the US on its own. Third, the salience of
the US defines the maximum level of conditioning that may occur; increases in US
salience increase conditioning up to this asymptotic level. Responding at this maximum
level is called a ceiling effect in that further conditioning trials do not produce any
change in behavior. Finally, the strength of the US expectancy in compound
conditioning will be equal to the combined strength of all CSs in the compound. If this
combined value is at the maximum US strength, no new conditioning will occur. This is
what happens in blocking: one CS has already acquired an associative strength that is
equal to the US strength so adding a new CS in conditioning trials does not produce any
conditioning. In other words, the conditioning strength has been used up by CS1 so
there is none left for CS2.
The formal rule specifying the change in associative strength of a CS on a single
conditioning episode is written as follows:
V = ( - SV)
V
V

the associative strength of a CS on a given trial.
the change in associative strength (V) on that trial.
a learning rate parameter determined by the salience of the CS. One can think of
this as the maximum possible associative strength of a CS: bright lights and loud
tones will have a higher  value than will dim lights and soft tones.

a learning rate parameter determined by the salience of the US. Strong shocks
and large amounts of food will have a higher  value than will weak shocks and
small amounts of food. Both  and  are constants with a range between 0 and 1.
Sometimes the two are combined into a single term denoting the combined
salience of an experimental trial.

the maximum amount of conditioning or associative strength that a US can
support. It has a positive value when the US is presented and is 0 when no US is
presented.
SV
the sum of the associative strength of all CSs present on that trial.
 - SV an error term that indicates the discrepancy between what is expected (SV) and
what is experienced (). When ( - SV) is zero, the outcome is fully predicted
and there are no changes in associative strength (V).
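The trial-by-trial update rule can be simulated in a few lines of code (a minimal sketch; the parameter values below are arbitrary choices for illustration, not values from the literature):

```python
def rescorla_wagner(trials, alpha=0.3, beta=0.5, lam_present=1.0):
    """Simulate associative strength V of a single CS across trials.

    trials: sequence of booleans, True when the US is presented.
    alpha, beta: salience (learning rate) parameters for CS and US (0-1).
    lam_present: lambda, the maximum conditioning the US can support.
    """
    V = 0.0
    history = []
    for us in trials:
        lam = lam_present if us else 0.0   # lambda is 0 on extinction trials
        V += alpha * beta * (lam - V)      # delta-V = alpha * beta * (lambda - sum-V)
        history.append(V)
    return history

# Ten acquisition trials followed by ten extinction trials:
curve = rescorla_wagner([True] * 10 + [False] * 10)
```

The increments shrink on every acquisition trial, producing the negatively accelerating learning curve the model predicts, and V decays back toward zero once the US is withheld.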
According to the Rescorla-Wagner equation, the CS acquires associative strength
on each trial that is equal to the maximum associative strength of the US minus the
strength already acquired by that CS. The increase in associative strength declines with
each trial because the amount of associative power that remains decreases as λ is used up.
This describes a typical learning curve in which each trial produces a smaller and
smaller change in behaviour. The model also explains extinction, in that λ is 0, so
error term is negative. Thus, the associative strength of the CS is reduced on that trial
and responding decreases. As noted above, the Rescorla-Wagner model accurately
predicts blocking and overshadowing experiments as well as other classical conditioning
phenomena such as extinction and conditioned inhibition (a stimulus presented during
extinction trials reduces the strength of the CR on subsequent trials).
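The same update rule, applied to compound stimuli, reproduces blocking. A short sketch (the salience values are hypothetical):

```python
def compound_conditioning(trials, alphas, beta=0.5, lam=1.0):
    """Rescorla-Wagner updates for multiple CSs.

    trials: list of (present_css, us) pairs, e.g. (["CS1"], True).
    alphas: salience parameter for each CS, keyed by name.
    All CSs present on a trial share the same error term (lam - sum of V).
    """
    V = {cs: 0.0 for cs in alphas}
    for present, us in trials:
        error = (lam if us else 0.0) - sum(V[cs] for cs in present)
        for cs in present:
            V[cs] += alphas[cs] * beta * error
    return V

# Blocking: CS1 is trained to asymptote alone, then CS1+CS2 in compound.
pretraining = [(["CS1"], True)] * 50
compound = [(["CS1", "CS2"], True)] * 20
V = compound_conditioning(pretraining + compound, {"CS1": 0.3, "CS2": 0.3})
# CS1 has absorbed nearly all of lambda, so CS2 gains almost nothing.
```

Because CS1 already predicts the US, the error term is near zero on every compound trial, and CS2 acquires essentially no associative strength.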
Figure 5.16 Negatively accelerating function of associative learning predicted by the Rescorla-Wagner model. The change in associative strength on each trial (ΔV) depends on the salience of the CS (α), the salience of the US (β) and the discrepancy between what is expected and what occurs (λ − ΣV). Reprinted from Shettleworth (1998).
Despite its general utility and appeal, the Rescorla-Wagner model cannot explain
all aspects of classical conditioning. The most obvious are latent inhibition and sensory
preconditioning. According to the model, a stimulus should not acquire (or lose) any
associative strength when the US is not present. Thus, there is no way to account for a
CR developing to a stimulus that was never paired with a US (sensory preconditioning)
or the reduction in conditioning that follows CS pre-exposure (latent inhibition).
Subsequent modifications to the model were able to deal with these problems, although
inconsistencies between what the model predicted and what happened in the lab were still
evident. Because of this, a number of other theories developed that were purported to be
better explanations of classical conditioning. Some focused on the attention that
organisms direct towards the CS (Mackintosh, 1975), whereas others focused on
comparing the likelihood that the US will occur in the presence and the absence of the CS
(Gibbon & Balsam 1981). Each theory has its strengths, but no single model was able to
account for all aspects of classical conditioning. Nonetheless, if a theory is to be judged
on the research and discussion that it generates, the Rescorla-Wagner model is one of the
most successful in associative learning.
BOX 5.4 Neuroscience of the Rescorla-Wagner Model
An exciting development in classical conditioning is the notion that biological
mechanisms can be mapped on to the parameters of formal learning theories, including the
Rescorla-Wagner model. In one of the most prominent lines of research, Wolfram Schultz and his
colleagues have examined how dopamine neurons, which project from the midbrain to the
striatum and frontal cortex, code error prediction in classical conditioning (Hollerman & Schultz,
1998).
Figure 5.17 Dopamine response firing
patterns during task learning. In these
experiments, monkeys were implanted
with electrodes in the midbrain and the
activity of single dopamine neurons
was recorded. In the panels to the left,
each horizontal row represents one
trial with the chronological sequence
in each panel being from top to
bottom. Dots indicate cell firing and
the vertical line indicates the reward
presentation (a squirt of apple juice
into the monkey's mouth).
No task: When a reward is presented
with no preceding cue, dopamine
neurons fire rapidly.
Learning: Monkeys were presented
with two visual stimuli and had to
touch the correct stimulus to receive a
reward. As performance improved
(top to bottom), cell firing declined.
Familiar: When monkeys were
familiar with the task, dopamine cells
did not fire following the reward
presentation
Error during learning: When new
pictures were presented, monkeys
made errors and no reward was
presented. This expectation of reward
when none was delivered inhibited
dopamine cell firing.
Reprinted from Hollerman and Schultz
(1998).
Schultz interprets his findings in the context of the Rescorla-Wagner model: dopamine
neurons fire when a reward is unpredicted, but not when it is predicted. This "prediction error"
message may constitute a powerful teaching signal for behavior and learning. Thus, dopamine
neurons that project from the midbrain to the forebrain code the information that is critical for
learning to anticipate significant outcomes. Using the same basic techniques, Schultz and his
colleagues, have demonstrated that neurons in other brain structures, such as the striatum,
orbitofrontal cortex, and amygdala, code the quality, quantity, and preference for rewards. To
link these events biologically, Schultz proposes that the dopamine error signal communicates
with reward perception signals to influence learning about motivationally significant stimuli.
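The logic of the prediction-error signal can be illustrated with toy numbers (an illustration of the idea, not Schultz's recorded data):

```python
def prediction_error(expected, received):
    """Reward-prediction error: the teaching signal attributed to dopamine
    neurons. Positive when a reward is unpredicted, near zero when it is
    fully predicted, negative when a predicted reward is omitted."""
    return received - expected

unpredicted = prediction_error(expected=0.0, received=1.0)  # burst of firing
predicted = prediction_error(expected=1.0, received=1.0)    # no change from baseline
omitted = prediction_error(expected=1.0, received=0.0)      # firing inhibited
```

The three cases map onto the "no task", "familiar", and "error during learning" panels of Figure 5.17, respectively.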
8.4 Associative Cybernetic Model
There are far fewer formal theories of operant conditioning than of classical
conditioning for three primary reasons. First, classical conditioning is often easier to
study because animals do not need to be trained in an operant task. Second, the
experimenter controls US (reinforcer) presentations in classical, but not operant,
conditioning, making the connection between learned associations and behavioral
changes more straightforward. Third, and perhaps most important, many scientists in the
past assumed that classical and operant conditioning were mediated by the same
processes, so a theory of one should explain the other. In this respect, the Rescorla-Wagner model can be applied to operant conditioning if we consider that responses with
surprising outcomes should produce the greatest increments in learning.
One exception is the Associative Cybernetic model developed by Dickinson and
Balleine (1993) that applies specifically to operant conditioning (see Figure 5.18).
Cybernetic refers to the fact that an internal representation of the value assigned to an
outcome feeds back to modulate performance. The model consists of four principal
components: a habit memory system that represents S-R learning; an associative memory
system that represents R-O associations; an incentive system that connects the
representation of an outcome with a value (e.g., rewarding or punishing); and a motor
system that controls responding. The motor system can be activated directly by the habit
memory or via associative memory and its connections through the incentive system.
Thus, the Associative Cybernetic Model explains the interaction between S-R and R-O
learning systems, and describes how rewards and punishments modify responding.
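One way to picture the flow of control through the four components is a schematic toy model (this is an invented sketch for illustration, not Dickinson and Balleine's formal implementation; all names and values are hypothetical):

```python
class AssociativeCybernetic:
    """Toy sketch of the model's four components.

    Habit memory maps stimuli directly to responses (S-R); associative
    memory maps responses to expected outcomes (R-O); the incentive
    system assigns each outcome a value; the motor system's tendency to
    emit a response sums the habit route and the goal-directed route.
    """
    def __init__(self, habit_strength, ro_map, outcome_values):
        self.habit_strength = habit_strength    # {response: S-R strength}
        self.ro_map = ro_map                    # {response: outcome}
        self.outcome_values = outcome_values    # {outcome: incentive value}

    def response_tendency(self, response):
        goal_directed = self.outcome_values.get(self.ro_map.get(response), 0.0)
        return self.habit_strength.get(response, 0.0) + goal_directed

model = AssociativeCybernetic(
    habit_strength={"lever_press": 0.2},
    ro_map={"lever_press": "food"},
    outcome_values={"food": 1.0},
)
before = model.response_tendency("lever_press")  # habit + valued outcome
model.outcome_values["food"] = -0.5              # outcome devalued (e.g., illness)
after = model.response_tendency("lever_press")   # habit component survives
```

Devaluing the outcome lowers responding through the incentive route while leaving the S-R habit strength intact, which is one way to read the devaluation results described above.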
Figure 5.18 The Associative Cybernetic model of operant conditioning. The model consists of
four principal components that interactively produce changes in behavioral responding. See text
for details. Redrawn from Dickinson (1984).
Like the Rescorla-Wagner model, some components of the Associative
Cybernetic model have been applied to specific neural substrates (Balleine & Ostlund,
2006). Although the details are not complete, the evidence that neural mechanisms map
onto parameters of both models adds credence to these theoretical accounts of learning.
9. NEUROSCIENCE
The fact that associative learning is observed in all vertebrate, and many
invertebrate, species suggests that this fundamental process is conserved across evolution.
Still, it is not clear whether these similarities reflect common evolutionary descent or
whether different species have converged on the same cognitive solution (i.e., forming
associations) to learn about causal and predictive relationships in their environment. If
the same biological processes mediate classical or operant conditioning in different
species, this would provide stronger support for the conservation hypothesis. The
following sections review the biological underpinnings of associative learning, beginning
with molecular mechanisms, extending to cellular systems and ending with examples of
neural circuits that mediate classical and operant conditioning.
9.1 Molecular Mechanisms
In the early 1970s, Seymour Benzer (1973) developed a technique that has been
used successfully to study the molecular mechanisms of associative learning. Benzer
produced genetic mutations in the fruit fly, Drosophila, by exposing them to radiation or
chemicals. Different behavioral changes were observed with different mutations and
some of these related to associative learning. For example, Drosophila will avoid an
odor that was previously paired with a shock (classical conditioning); mutant dunce flies
fail to learn this association even though they show no deficit in responding to these
stimuli on their own (Dudai & Quinn, 1980). Dunce flies have a mutation in the gene
that codes for the enzyme cyclic AMP phosphodiesterase, which itself breaks down the
intracellular messenger cyclic AMP (cAMP). Defects in the cAMP signaling pathway
were later identified in two other mutants, named rutabaga and amnesiac, both of which
show deficits in classical conditioning.
Research with sea slugs, Aplysia, supports the idea that the cAMP pathway is
critical for associative learning. In this model, a CS sets off action potentials in sensory
neurons that trigger the opening of calcium channels at nerve terminals. The US
produces action potentials in a different set of neurons that synapse on terminals of the
CS neurons. When neurotransmitter is released from the US neurons, it binds to the CS
neuron and activates adenylate cyclase within the cell. Adenylate cyclase generates
cAMP and, in the presence of elevated calcium, adenylate cyclase churns out more
cAMP. Thus, if the US occurs shortly after the CS, intracellular second messengers
systems are amplified within CS neurons. This causes conformational changes in
proteins and enzymes within the cell that lead to enhanced neurotransmitter release from
the CS neuron. The consequence? A bigger behavioral response. The details of this
process are presented in Figure 5.19.
Figure 5.19 Molecular mechanisms of classical conditioning. (a) When the US is presented by
itself, it activates the motor neuron and sensitizes the sensory neuron. This process involves 5-HT release from the pre-synaptic neuron, activation of the post-synaptic neuron and an increase in adenylate cyclase activity. (b) When the CS is presented before the US, it causes an opening of calcium channels in the sensory neuron's terminals. This leads to even higher levels of cAMP following the US presentation, because calcium stimulates adenylate cyclase to produce more cAMP. Reprinted from Bear, Connors and Paradiso (2001).
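The coincidence-detection logic can be caricatured in a few lines (the numbers are arbitrary illustrations; only the supra-additive pattern matters):

```python
def camp_production(ca_elevated, serotonin):
    """Toy coincidence detector: adenylate cyclase output in arbitrary units.

    5-HT from the US pathway activates the cyclase; if the CS has just
    fired, elevated presynaptic calcium amplifies the same enzyme, so a
    CS-then-US pairing yields supra-additive cAMP production.
    """
    if not serotonin:
        return 0.0                    # CS alone: no facilitation signal
    base = 1.0                        # US alone: baseline cAMP output
    amplification = 3.0 if ca_elevated else 1.0   # 3.0 is an arbitrary factor
    return base * amplification
```

The key property is simply that paired presentation (calcium plus serotonin) produces more cAMP than the US alone, which is what makes temporal contiguity matter at the molecular level.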
You should recognize the similarities between this diagram and Figure 4.X
describing the molecular basis of long term potentiation (LTP). Like LTP, long-term
changes in classical conditioning also involve CREB-dependent transcription. It should
not be surprising that the same intracellular mechanisms are identified in LTP and
classical conditioning, as LTP is a cellular model of memory and classical conditioning is
one type of memory.
A recurrent theme in associative learning, and one that we touched on throughout
this chapter, is the question of whether classical and operant conditioning are mediated by
the same process. The issue was never resolved completely at a behavioral level but the
two appear to be dissociable at a molecular level. For example, mutations in Drosophila
that disrupt adenylate cyclase produce deficits in classical but not operant conditioning,
whereas mutations that disrupt protein kinase C (PKC) have the opposite effect (Brembs
& Plendl, 2008). PKC is an enzyme that is activated by cAMP so this step is downstream
to cAMP, as opposed to the upstream effect of adenylate cyclase. PKC also appears to be
important in Aplysia operant conditioning (Lorenzetti et al., 2008), suggesting a
conservation of this process across species (at least in invertebrates). You may wonder
how Drosophila and Aplysia perform an operant conditioning task. The details of these
paradigms are shown in Figure 5.20.
Figure 5.20 Invertebrate operant conditioning paradigms. Top: Drosophila learn to fly towards a
particular visual stimulus to receive a heat reward. The fly is tethered and suspended inside a
drum that acts as a flight-simulator. Four pairs of vertical bars on the outside of the drum change
color when the animal flies towards them. During training, flying towards one color (e.g., blue)
turned on a heat source. Over training, flies approach the blue color more frequently, regardless
of where the vertical lines were located. Thus, even when they had to redirect their flight path or
change directions, the flies approached the heat-associated color more frequently than the other
colors. Reprinted from Brembs and Plendl (2008)
Bottom: Aplysia learn to perform a biting response to receive brief stimulation of the esophageal
nerve. One day prior to training, animals are implanted with a stimulating electrode on the
anterior branch of the left esophageal nerve. During training, the animal moves freely in a small
aquarium and spontaneous behaviour is monitored so that the initiation of a biting response can
be noted. Stimulation is applied immediately following a bite (contingent reinforcement) or at
random intervals unrelated to the biting response (not shown). Over trials, the rate of
spontaneous biting increases, but only in animals that experienced the stimulation following the
bite. Reprinted from Baxter and Byrne (2006).
Identifying the molecular mechanisms of associative learning becomes more
difficult as the nervous systems become more complex. Nonetheless, evidence to date
supports a critical role for cAMP and CREB pathways in rodent associative learning
(Silva et al., 2005). Moreover, as noted in Chapter 4, human disorders involving
alterations in cAMP or CREB function are characterized by severe learning deficits. The
combination of these data argues strongly for conservation across species in the
molecular mechanisms that mediate associative learning.
9.2 Cellular Mechanisms
Now that you understand what happens within a neuron, we can examine how
communication between cells changes during associative learning. This process has been
worked out in great detail using classical conditioning of the gill withdrawal reflex in
Aplysia. A mild touch to an outer part of the animal's body, the mantle shelf, does not
initially elicit a response; when this stimulus precedes a tail shock (US), however, a CR
develops to the mantle touch (CS). A control condition is included in these studies in
which a light touch is applied to the siphon (another part of the Aplysia‟s body) that is not
explicitly paired with the US. The body touches that are and are not associated with the
US are referred to as CS+ and CS- respectively. As you may recall from Chapter 3,
sensitization of the gill withdrawal reflex occurs when cell firing increases in the
facilitating interneurons that synapse on presynaptic terminals of sensory neurons. A
similar process occurs in classical conditioning. Sensory neurons conveying CS+
information fire when the mantle shelf is touched and the US causes facilitatory interneurons to
release serotonin (5-HT) on the presynaptic terminals of these neurons. If the two occur
in close temporal proximity, intra-cellular second messenger systems are amplified, as
described above. This increases the strength of the CS+ sensory-motor neuron connection
making it easier to elicit a CR in the future. On subsequent trials, the incoming CS+
signal produces a greater post-synaptic potential in the motor neuron, causing a gill
withdrawal response in the absence of the US (Hawkins et al., 1983). Because the two
neurons have not been active at the same time, synaptic strength is not altered in the CS- sensory-motor neuron circuit. The specifics of these activity-dependent changes are
presented in Figure 5.21.
Figure 5.21 Cellular pathways of classical conditioning in Aplysia. A shock (US) applied to the
animal's tail excites facilitating interneurons that synapse on presynaptic terminals of sensory
neurons. Sensory neurons connect to motor neurons of the mantle shelf that control the
withdrawal response. When a light touch (CS) is applied to the mantle shelf immediately prior to
the US, it primes the sensory neuron by making it more excitable. Thus, when the facilitating
interneurons fire, they produce a stronger response in the motor neurons. This increased firing is
restricted to the circuits in which the CS was paired with the US (CS+ but not CS-). With training,
the CS connection is strengthened so that it is capable of eliciting a response even when the US is
not presented. Reprinted from Kandel (1995).
It is unlikely that these cellular changes explain all instances of associative
learning, particularly those that occur over long intervals such as CTA learning. And like
molecular mechanisms, there may be differences in how neurons code classical and
operant conditioning at the cellular level (Baxter & Byrne, 2009). On the other hand, this
model provides a partial explanation for evolved dispositions: these may reflect changes
in existing neural connections such as the presynaptic modulation of sensory neurons
described above. In contrast, associations that are difficult to acquire probably involve a
more complicated remapping of neural circuitry, including the formation of new synaptic
connections. In sum, the circuit and mechanisms that underlie classical conditioning of
the gill withdrawal reflex provide a compelling story and a concrete departure for
examining the neuroscience of associative learning.
9.3 Neural Circuits
In the gill withdrawal model described above, information about the CS and the
US come together at the synapse linking facilitatory interneurons with sensory neurons.
The same process mediates classical conditioning in mammals although the brain region
where the CS and US signals converge will vary depending on the sensory properties of
the stimuli, as well as the response that is used to measure conditioning. For example, an
auditory CS will be transmitted through a different circuit than is a visual or tactile CS
and a leg flexion response will be mediated through a different output pathway than is a
salivary response.
Of all the neural circuits underlying classical conditioning, the best characterized
is that which mediates the conditioned eyeblink response. All animals, including
humans, will blink when an air puff hits the eye. This reflexive response is commonly
studied in rabbits because the rate of spontaneous eyeblinking is low in these animals.
Thus, if animals blink in response to a CS that was previously paired with an air puff, it is
likely due to the conditioning process. Moreover, the stimulus parameters that lead to
effective conditioning, and the motor output pathway for the eyeblink response, are
clearly described in these animals.
The majority of this work was conducted by Richard Thompson and his
colleagues who determined that conditioned eyeblink responses in rabbits are mediated
within the cerebellum (Thompson & Krupa, 1994). To begin, an air puff to the eye
produces a rapid blinking response through a reflex circuit that includes the trigeminal
nucleus, reticular formation and cranial motor nuclei. As in all reflex circuits, parallel
signals conveying sensory information are sent to other brain regions. In the case of the
eyeblink response, US information travels via the inferior olive to both the interpositus
nucleus and the cerebellar cortex. The cerebellar cortex, in turn, sends signals to the
interpositus nucleus, which projects to the red nucleus; the red nucleus synapses onto
the cranial motor nuclei that produce an eyeblink
response. The critical question is how the CS accesses this US-UR circuit in order to
produce a CR. When the CS is a tone, auditory information enters the CNS via the
cochlear nucleus, which transmits this information to the pontine nuclei and then to the
interpositus nucleus, where it meets the flow of information from the US. Like the US
signal, a parallel CS signal is sent from the pontine nuclei to the cerebellar cortex.
From the interpositus nucleus, a response is generated through the red nucleus and cranial
motor nuclei. A schematic of these details is presented in Figure 5.22.
Figure 5.22 Neural circuitry of the conditioned eyeblink response in the rabbit. Information
about an auditory CS enters via the ventral cochlear nucleus and converges with signals
transmitted from the air puff US at the level of the interpositus nucleus. Eyeblink responses,
both conditioned and unconditioned, are controlled by the accessory abducens nucleus (motor nuclei).
Similar diagrams have been drawn for the neural circuitry of other classically
conditioned responses, including CTA, conditioned fear, and conditioned approach
responses. Details of these circuits may be found in many neuroscience textbooks. The
common feature of all of these models is that the emergence of a CR coincides with a
change in the CNS. The most likely change, and the one most commonly identified in
these systems, is an alteration in synaptic strength within a circuit that connects CS and
US signals.
The same principle holds true in operant conditioning: behavioral changes are
reflected as alterations in neural connections. Interestingly, the R-O and S-R associations
of operant conditioning appear to be mediated by distinct brain regions. S-R responding,
or habit learning, is often equated with the procedural learning discussed in Chapter 4, a
process that depends on the dorsal striatum. In contrast, R-O contingencies in operant
responding are mediated through a network of brain regions that begins with signals
generated in the medial prefrontal cortex (Tanaka et
al., 2008). This region computes response contingencies (i.e., R-O associations) and
then sends this information to the orbitofrontal cortex. The orbitofrontal cortex codes the
motivational significance of reinforcers (Rolls, 2004) so it is likely that associations
between the outcome and its value are formed in this brain region. Signals from the
orbitofrontal cortex are transmitted to the dorsal striatum, which controls behavioral
responses. As in classical conditioning, the details of the circuit will vary depending on
the response to be emitted and the stimuli that precede it. This hypothesized neural
circuit may not mediate all instances of operant conditioning, but it provides another
direction for studying the biological basis of associative learning.
10. CHAPTER SUMMARY
Organisms ranging from insects to humans are capable of forming associations
about predictive relationships in their environment. The existence of this common trait
across a variety of species suggests that animals are designed to detect and store
information about causal relationships that affect their survival. The two most-studied
forms of associative learning are classical and operant conditioning. Classical
conditioning represents predictive associations between two stimuli, whereas operant
conditioning represents relationships between a response and its consequences.
These two associative structures are mediated by dissociable brain circuits
(Ostlund & Balleine, 2007), providing concrete evidence for a distinction between the
two.
DEFINITIONS
appetitive stimuli: stimuli that an organism will work to obtain, such as food, water, sex,
drugs, etc.
aversive stimuli: stimuli that an organism will work to avoid such as those that produce
nausea, fear, pain, etc.
unconditioned stimulus (US): in classical conditioning, stimuli that have motivational
significance prior to conditioning.
unconditioned response (UR): in classical conditioning, responses that are elicited prior
to conditioning.
conditioned stimulus (CS): in classical conditioning, stimuli that acquire motivational
significance through pairing with the US.
conditioned response (CR): in classical conditioning, responses that are elicited by the
CS following conditioning.
operant conditioning: a form of learning in which the presentation of an outcome
(positive or negative) depends on an organism's response.
positive reinforcement: a positive relationship between a response and an appetitive
stimulus; presentation of the reinforcer increases responding.
punishment: a positive relationship between a response and an aversive stimulus;
presentation of the aversive stimulus decreases responding.
negative reinforcement: a negative relationship between a response and an aversive
stimulus; removal of the aversive stimulus increases responding.
omission: a negative relationship between a response and an appetitive stimulus; removal
of the appetitive stimulus decreases responding.
suppression ratio: the dependent measure in a classical conditioning test of suppression
of lever pressing; calculated as (lever presses during the CS) / (lever presses during the
CS plus lever presses during an equal period of time preceding the CS).
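The calculation in this definition can be sketched in a few lines of Python; the response counts below are hypothetical values chosen only for illustration. A ratio near 0 indicates strong suppression (strong conditioning), while 0.5 indicates no suppression:

```python
def suppression_ratio(cs_presses, pre_cs_presses):
    """Suppression ratio = CS presses / (CS presses + pre-CS presses).

    0.0 = complete suppression of responding (strong conditioning);
    0.5 = equal responding in both periods (no conditioning).
    """
    total = cs_presses + pre_cs_presses
    if total == 0:
        raise ValueError("no responding in either period")
    return cs_presses / total

# Hypothetical rat: 4 presses during the CS, 16 in the equal pre-CS period.
print(suppression_ratio(4, 16))   # 0.2 -> substantial suppression
print(suppression_ratio(10, 10))  # 0.5 -> no conditioning
```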
conditioned taste aversion (CTA): avoidance of a flavor that was previously associated
with nausea.
conditioned avoidance: an operant paradigm in which animals learn to avoid a stimulus
associated with an aversive event.
reinforcement schedule: in operant conditioning, the relationship between responding
and the rate of reinforcement delivery.
fixed ratio (FR): a reinforcement schedule in which a set number of responses produces
the reinforcer.
variable ratio (VR): a reinforcement schedule in which an average number of responses
produces the reinforcer.
fixed interval (FI): a reinforcement schedule in which reinforcement is delivered
following the first response that occurs after a set period of time has elapsed.
variable interval (VI): a reinforcement schedule in which reinforcement is delivered
following the first response that occurs after an average time interval has elapsed.
equipotentiality: the idea that associations between different stimuli, responses and
reinforcers could be formed with equal ease.
evolved dispositions: the relative ease with which animals acquire certain associations,
based on their evolutionary history.
extinction: removal of the US (or reinforcer), which leads to a reduction in responding.
blocking: a phenomenon in which an association between one stimulus and a US disrupts
subsequent conditioning to a second stimulus when the two are presented in compound
conditioning trials.
latent inhibition: a phenomenon in which prior exposure to a stimulus blocks or retards
subsequent conditioning to this stimulus.
overshadowing: a phenomenon in which one stimulus acquires stronger conditioning
than a second stimulus when the two are presented in compound conditioning trials.
spontaneous recovery: the reappearance of a CR following extinction.
disinhibition: the recovery of a CR following extinction when a novel stimulus is
presented.
response renewal (reacquisition): recovery of a CR when extinction is conducted in a
novel environment and the animal is then tested in the original conditioning context.
sensory preconditioning: a 3-stage experiment: 1. Two neutral stimuli (A and B) are
presented together; 2. One stimulus (A) is paired with a US; 3. The other stimulus (B) is
tested for conditioned responses.
devaluation: a 3-stage experiment: 1. CS-US pairings; 2. The US is devalued through
association with an aversive stimulus; 3. The conditioned properties of the CS are tested.
habit learning (S-R learning): responses that are elicited automatically by
environmental stimuli and are relatively insensitive to changes in the value of the
reinforcer.
learning curve: changes in behaviour that occur across conditioning trials (either
classical or operant), characterized by smaller and smaller increments as the trials
progress.
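The shrinking increments that define the learning curve are commonly formalized with the Rescorla-Wagner updating rule, under which associative strength grows in proportion to the remaining distance from its asymptote. The sketch below uses illustrative parameter values (alpha and the asymptote lam are arbitrary choices, not values from the text):

```python
def learning_curve(alpha=0.3, lam=1.0, trials=10):
    """Associative strength V grows by alpha * (lam - V) on each trial,
    so increments shrink as V approaches the asymptote lam."""
    v = 0.0
    history = []
    for _ in range(trials):
        v += alpha * (lam - v)
        history.append(round(v, 3))
    return history

print(learning_curve())
# early increments are large, later ones tiny, as in a typical learning curve
```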
FURTHER READING
Cuny, H. (1962). Ivan Pavlov: The man and his theories. New York: Fawcett.
Dickinson, A. (1980). Contemporary animal learning theory. Cambridge: Cambridge
University Press.
Hollis, K.L. (1997). Contemporary research on Pavlovian conditioning: A 'new'
functional analysis. American Psychologist, 52, 956-965.
Mackintosh, N.J. (Ed.). (1994). Animal learning and cognition. San Diego, CA:
Academic Press.
Macphail, E.M. (1996). Cognitive function in mammals: the evolutionary perspective.
Cognitive Brain Research, 3, 279-290.
RESEARCHER PROFILE: Dr. Karen Hollis
Karen Hollis is a professor in the Interdisciplinary
Program in Neuroscience & Behavior at Mount Holyoke
College in South Hadley, Massachusetts. She was trained
as a psychologist but, very early in her career, incorporated
an evolutionary perspective into her research. Her work
has focused on examining the ways in which animals
learn to predict biologically-relevant events such as
food, aggressors, potential mates or predators. She
examines natural behaviors but incorporates
experimental manipulations in order to understand how
animals optimize their interactions with biologically
relevant events. Hollis's work has helped to identify how
classical conditioning may increase the survival and
reproduction of fish. "The point of my research is to see
how what psychologists say about learning can be
brought to bear on what zoologists study in the field,"
says Hollis.