Revision II
DR DINESH RAMOO
What is Learning?
 Almost all human behaviour is learned. Imagine if you suddenly lost all you
had ever learned. What could you do?
 You would be unable to read, write, or speak. You couldn’t feed yourself, find
your way home, drive a car, play the bassoon, or “party.” Needless to say, you
would be totally incapacitated. (Dull, too!)
 Learning is a relatively permanent change in behaviour due to experience
(Powell, Symbaluk, & Honey, 2009). Notice that this definition excludes both
temporary changes and more permanent changes caused by motivation,
fatigue, maturation, disease, injury, or drugs.
 Each of these can alter behaviour, but none qualifies as learning.
Definitions of Learning
 A change in behaviour as a result of experience or practice.
 The acquisition of knowledge.
 Knowledge gained through study.
 To gain knowledge of, or skill in, something through study, teaching, instruction or
experience.
 The process of gaining knowledge.
 A process by which behaviour is changed, shaped or controlled.
 The individual process of constructing understanding based on experience from a wide
range of sources.
Behaviourism
 Most psychologists who study animal learning and behaviour seek simple
explanations, such as trial-and-error learning, that do not require us to assume
complicated mental processes.
 The behaviourists, who have dominated the study of animal learning, insist that
psychologists should study only observable, measurable behaviours, not mental
processes.
 Behaviourists seek the simplest possible explanation for any behaviour and resist
interpretations in terms of understanding or insight.
 At least, they insist, we should exhaust attempts at simple explanations before we
adopt more complex ones.
Behaviourists
 The term behaviourist applies to theorists and researchers with quite a
range of views (O’Donohue & Kitchener, 1999).
 Two major categories are:
 Methodological behaviourists
 radical behaviourists.
Classical Conditioning
Pavlov’s Experiment

Pavlov used an experimental setup like the one in the figure (Goodwin, 1991).

First, he selected dogs with a moderate degree of arousal. (Highly excitable
dogs would not hold still long enough, and highly inhibited dogs would fall
asleep.)

Then he attached a tube to one of the salivary ducts in the dog’s mouth to
measure salivation. He could have measured stomach secretions, but
measuring salivation was easier.

Pavlov found that, whenever he gave a dog food, the dog salivated. The food
and salivation connection was automatic, requiring no training. Pavlov called
food the unconditioned stimulus, and he called salivation the unconditioned
response.

If a particular stimulus consistently, automatically elicits a particular
response, we call that stimulus the unconditioned stimulus (UCS), and the
response to it is the unconditioned response (UCR).
 Next Pavlov introduced a new stimulus, such as a metronome. Upon hearing the
metronome, the dog lifted its ears and looked around but did not salivate, so the
metronome was a neutral stimulus with regard to salivation.
 Then Pavlov sounded the metronome a couple of seconds before giving food to the
dog. After a few pairings of the metronome with food, the dog began to salivate as
soon as it heard the metronome (Pavlov, 1927/1960). We call the metronome the
conditioned stimulus (CS) because the dog’s response to it depends on the
preceding conditions—that is, the pairing of the CS with the UCS.
 The salivation that follows the metronome is the conditioned response (CR).
The conditioned response is simply whatever response the conditioned stimulus
begins to elicit as a result of the conditioning (training) procedure. At the start of
the conditioning procedure, the conditioned stimulus does not elicit a conditioned
response. After conditioning, it does.
[Figure: the conditioning procedure in three stages: at first, during training, and after some number of repetitions.]
Conditioned and Unconditioned Response
 In Pavlov’s experiment the conditioned response (salivation) closely
resembled the unconditioned response (also salivation).
 However, in some cases it is quite different.
 For example, the unconditioned response to an electric shock includes
shrieking and jumping.
 The conditioned response to a stimulus paired with shock (i.e., a
warning signal for shock) is a tensing of the muscles and lack of activity
(e.g., Pezze, Bast, & Feldon, 2003).
Examples of Classical Conditioning
 Your alarm clock makes a faint clicking sound a couple of seconds
before the alarm goes off. At first the click by itself does not awaken you,
but the alarm does. After a week or so, you awaken as soon as you hear
the click.
Unconditioned stimulus = Alarm
Conditioned stimulus = Click
Unconditioned response = Awakening
Conditioned response = Awakening
Examples of Classical Conditioning
 You hear the sound of a dentist’s drill shortly before the unpleasant
experience of the drill on your teeth. From then on the sound of a
dentist’s drill arouses anxiety.
Unconditioned stimulus = Drilling
Conditioned stimulus = Sound of the drill
Unconditioned response = Tension
Conditioned response = Tension
Examples of Classical Conditioning
 A nursing mother responds to her baby’s cries by putting the baby to her
breast, stimulating the flow of milk. After a few days of repetitions, the
sound of the baby’s cry is enough to start the milk flowing.
Unconditioned stimulus = Baby sucking
Conditioned stimulus = Baby’s cry
Unconditioned response = Milk flow
Conditioned response = Milk flow
Examples of Classical Conditioning
 Note the usefulness of classical conditioning in each case: It prepares an individual
for likely events. In some cases, however, the effects can be unwelcome. For
example, many cancer patients who have had repeated chemotherapy or radiation
become nauseated when they approach or even imagine the building where they
received treatment (Dadds, Bovbjerg, Redd, & Cutmore, 1997).
Unconditioned stimulus = Chemotherapy or radiation
Conditioned stimulus = Approaching the building
Unconditioned response = Nausea
Conditioned response = Nausea
Extinction
 Extinction is not the same as forgetting. Both weaken a
learned response, but they arise in different ways.
 You forget during a long period with no relevant experience
or practice.
 Extinction occurs as the result of a specific experience: perceiving the
conditioned stimulus without the unconditioned stimulus.
Extinction
 Extinction does not erase the original connection between the CS and the UCS. We
can regard acquisition as learning to do a response and extinction as learning to
inhibit it.
 For example, suppose you have gone through original learning in which a tone
regularly predicted a puff of air to your eyes. You learned to blink your eyes at the
tone. Then you went through an extinction process in which you heard the tone
many times but received no air puffs.
 You extinguished, so the tone no longer elicited a blink. Now, without hearing a
tone, you get another puff of air to your eyes. As a result, the next time you hear the
tone, you will blink your eyes. Extinction inhibited your response to the CS (here, the
tone), but a sudden puff of air weakens that inhibition (Bouton, 1994).
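The idea of acquisition as learning to respond and extinction as learning to inhibit can be made concrete with a toy model. The sketch below is only an illustration, not a model taken from these slides: the class name, the learning rate, and the halving of inhibition after a lone air puff are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class ToneBlinkAssociation:
    """Toy two-process model: a learned response plus a learned inhibition."""
    excitation: float = 0.0   # strength of the learned tone -> blink response
    inhibition: float = 0.0   # strength of the learned suppression
    rate: float = 0.3         # hypothetical learning rate

    def response(self) -> float:
        # Net tendency to blink at the tone, never below zero.
        return max(0.0, self.excitation - self.inhibition)

    def pair_cs_with_ucs(self) -> None:
        # Acquisition trial: tone followed by air puff strengthens the response.
        self.excitation += self.rate * (1.0 - self.excitation)

    def cs_alone(self) -> None:
        # Extinction trial: tone without air puff builds inhibition toward the
        # current excitation and leaves the excitation itself untouched.
        self.inhibition += self.rate * (self.excitation - self.inhibition)

    def ucs_alone(self, weakening: float = 0.5) -> None:
        # Air puff with no tone: weakens the inhibition (assumed factor).
        self.inhibition *= 1.0 - weakening


model = ToneBlinkAssociation()
for _ in range(20):
    model.pair_cs_with_ucs()
after_acquisition = model.response()

for _ in range(20):
    model.cs_alone()
after_extinction = model.response()

model.ucs_alone()
after_puff = model.response()

print(f"after acquisition: {after_acquisition:.2f}")
print(f"after extinction:  {after_extinction:.2f}")
print(f"after lone puff:   {after_puff:.2f}")
```

In this sketch the excitation value never drops during extinction, which is why the response can return as soon as the inhibition is weakened.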
Spontaneous Recovery
 Suppose you are in a classical-conditioning experiment.
 At first you repeatedly hear a buzzer sound (CS) that precedes a puff of air to your
eyes (UCS).
 Then the buzzer stops predicting an air puff. After a few trials, your response to the
buzzer extinguishes.
 Now, suppose you sit there for a long time with nothing happening and then
suddenly you hear another buzzer sound.
 What will you do? Chances are, you will blink your eyes at least slightly.
Spontaneous recovery is this temporary return of an extinguished response
after a delay. Spontaneous recovery requires no additional CS–UCS pairings.
Operant Conditioning
Reinforcement
 A reinforcement is an event that increases the future probability of the most
recent response. Thorndike said that it “stamps in,” or strengthens, the response.
 The next time the cat is in the puzzle box, it has a slightly higher probability of
the effective response; after each succeeding reinforcement, the probability goes
up another notch.
According to Skinner, reinforcement occurs when a response is followed by
rewarding consequences and the organism’s tendency to make the response
increases. The two examples diagrammed here illustrate the basic premise of
operant conditioning—that voluntary behaviour is controlled by its
consequences. These examples involve positive reinforcement (for a
comparison of positive and negative reinforcement
Punishment
 In contrast to a reinforcer, which increases the probability of a
response, a punishment decreases the probability of a response.
 A reinforcer can be either the presentation of something (e.g., food) or
the removal of something (e.g., pain).
 A punishment can be either the presentation of something (e.g., pain) or
the removal of something (e.g., food).
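The four combinations above form a two-by-two grid, which a short sketch can spell out. The labels "positive" (something is presented) and "negative" (something is removed) are the standard terms; the slides use them only for reinforcement, so applying them to punishment here is an assumption, and the enum and function names are hypothetical.

```python
from enum import Enum


class Change(Enum):
    PRESENTED = "presented"
    REMOVED = "removed"


class Effect(Enum):
    INCREASES = "increases"   # the response becomes more probable
    DECREASES = "decreases"   # the response becomes less probable


def classify_consequence(change: Change, effect: Effect) -> str:
    """Name the category from what happens to the stimulus and to the response."""
    kind = "reinforcement" if effect is Effect.INCREASES else "punishment"
    sign = "positive" if change is Change.PRESENTED else "negative"
    return f"{sign} {kind}"


# The four examples from the text: food or pain, presented or removed.
examples = [
    ("food presented", Change.PRESENTED, Effect.INCREASES),
    ("pain removed", Change.REMOVED, Effect.INCREASES),
    ("pain presented", Change.PRESENTED, Effect.DECREASES),
    ("food removed", Change.REMOVED, Effect.DECREASES),
]
for label, change, effect in examples:
    print(f"{label}: {classify_consequence(change, effect)}")
```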
Reinforcement and Punishment
 What constitutes reinforcement? From a practical standpoint, a reinforcer is an event that follows a
response and increases the later probability or frequency of that response.
 However, from a theoretical standpoint, we would like to have some way of predicting what would be a
reinforcer and what would not. We might guess that reinforcers are biologically useful to the individual, but
in fact many are not.
 For example, saccharin, a sweet but biologically useless chemical, can be a reinforcer.
 For many people alcohol and tobacco are stronger reinforcers than vitamin-rich vegetables. So biological
usefulness doesn’t define reinforcement.
 In his law of effect, Thorndike described reinforcers as events that brought “satisfaction to the animal.” That
definition won’t work either. How could you know what brings a rat or a cat satisfaction? Furthermore,
people will work hard for a paycheck, a decent grade in a course, and other outcomes that often don’t
produce evidence of pleasure (Berridge & Robinson, 1995).
Classical and Operant Conditioning
 In general the two kinds of conditioning also differ in the behaviours
they affect.
 Classical conditioning applies primarily to visceral responses (i.e.,
responses of the internal organs), such as salivation and digestion,
whereas operant conditioning applies primarily to skeletal responses
(i.e., movements of leg muscles, arm muscles, etc.).
 However, this distinction sometimes breaks down. For example, if a
tone consistently precedes an electric shock (a classical-conditioning
procedure), the tone will make the animal freeze in position (a skeletal
response) as well as increase its heart rate (a visceral response).
Categories of Reinforcement and Punishment
Extinction
 In operant conditioning extinction occurs if responses stop producing
reinforcements.
 For example, you were once in the habit of asking your roommate to join you
for supper.
 The last five times you asked, your roommate said no, so you stop asking.
 In classical conditioning extinction is achieved by presenting the CS without
the UCS; in operant conditioning the procedure is response without
reinforcement.
Generalization
 Someone who receives reinforcement for a response in the presence of
one stimulus will probably make the same response in the presence of a
similar stimulus.
 The more similar a new stimulus is to the original reinforced stimulus,
the more likely the same response. This phenomenon is known as
stimulus generalization.
 For example, you might reach for the turn signal of a rented car in the
same place you would find it in your own car.
Examples of Generalisation
 Many harmless animals have evolved an appearance that resembles a
poisonous animal, because any predator that learns to avoid the
poisonous animal generalizes its learning and avoids the harmless
animal also. Eastern Ecuador has two similar poisonous frog species
and one harmless species that mimics their appearance.
Discrimination
 If reinforcement occurs for responding to one stimulus and not another,
the result is a discrimination between them, yielding a response to
one stimulus and not the other.
 For example, you smile and greet someone you think you know, but
then you realize it is someone else.
 After several such experiences, you learn to recognize the difference
between the two people.
Discriminative Stimuli
 A stimulus that indicates which response is appropriate or inappropriate is
called a discriminative stimulus.
 A great deal of our behaviour is governed by discriminative stimuli. For
example, you learn ordinarily to be quiet in class but to talk when the
professor encourages discussion.
 You learn to drive fast on some streets and slowly on others. Throughout your
day one stimulus after another signals which behaviours will yield
reinforcement, punishment, or neither.
 The ability of a stimulus to encourage some responses and discourage others
is known as stimulus control.
Basic Processes in Classical and Operant Conditioning
Explanations of Classical Conditioning
 What is classical conditioning, really?
 As is often the case, the process
appeared simple at first, but later
investigation found it to be a more
complex and more interesting
phenomenon.
 Pavlov noted that conditioning depended on the timing between CS and UCS.
He inferred that the animal comes to treat the CS as if it were the UCS.
 Later studies contradicted that idea. For example, a shock (UCS) causes rats
to jump and shriek, but a conditioned stimulus paired with shock makes them
freeze in position.
 They react to the conditioned stimulus as a danger signal, not as if they felt a
shock. Also, in trace conditioning, where a delay separates the end of the CS
from the start of the UCS, the animal does not make a conditioned response
immediately after the conditioned stimulus but instead waits until almost the
end of the usual delay between the CS and the UCS.
 Again, it is not treating the CS as if it were the UCS; it is using it as a
predictor, a way to prepare for the UCS (Gallistel & Gibbon, 2000).
 It is true, as Pavlov suggested, that the longer the delay between the CS
and the UCS, the weaker the conditioning, other things being equal.
 However, just having the CS and UCS close together in time is not
enough.
 It is essential that they occur more often together than they occur apart.
That is, there must be some contingency or predictability between them.
 Consider this experiment: For rats in both Group 1 and Group 2,
every presentation of a CS is followed by a UCS, as shown in Figure
6.9. However, for Group 2, the UCS also appears at many other times,
without the CS. In other words, for this group, the UCS happens
every few seconds anyway, and it isn’t much more likely with the CS
than without it. Group 1 learns a strong response to the CS; Group 2
does not (Rescorla, 1968, 1988).
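One simple way to put a number on "more likely with the CS than without it" is to subtract the probability of the UCS without the CS from its probability with the CS. The sketch below does that; the function name and the trial counts are hypothetical and are not data from Rescorla's experiments.

```python
def contingency(ucs_with_cs: int, cs_trials: int,
                ucs_without_cs: int, no_cs_periods: int) -> float:
    """Return P(UCS | CS) - P(UCS | no CS)."""
    p_with = ucs_with_cs / cs_trials
    p_without = ucs_without_cs / no_cs_periods
    return p_with - p_without


# Hypothetical counts. In both groups every CS is followed by the UCS.
# Group 1: the UCS never occurs without the CS.
group1 = contingency(ucs_with_cs=20, cs_trials=20,
                     ucs_without_cs=0, no_cs_periods=80)
# Group 2: the UCS also occurs in most periods that have no CS.
group2 = contingency(ucs_with_cs=20, cs_trials=20,
                     ucs_without_cs=72, no_cs_periods=80)

print(f"Group 1 contingency: {group1:.2f}")
print(f"Group 2 contingency: {group2:.2f}")
```

With these made-up counts the CS and UCS are paired equally often in both groups, yet the CS carries far less information in Group 2.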
 Now consider this experiment: One group of rats receives a light (CS)
followed by shock (UCS) until they respond consistently to the light. (The
response is to freeze in place.)
 Then they get a series of trials with both a light and a tone, again followed by
shock. Do they learn a response to the tone? No. The tone always precedes the
shock, but the light already predicted the shock, and the tone adds nothing
new. The same pattern occurs with the reverse order:
 First rats learn a response to the tone and then they get light–tone
combinations before the shock. They continue responding to the tone, but not
to the light, again because the new stimulus predicted nothing that wasn’t
already predicted (Kamin, 1969).
 These results demonstrate the blocking effect: The previously established
association to one stimulus blocks the formation of an association to the
added stimulus.
 Again, it appears that conditioning depends on more than presenting two
stimuli together in time.
 Learning occurs only when one stimulus predicts another.
 Later research has found that presenting two or more stimuli at a time often
produces complex results that we would not have predicted from the results
of single-stimulus experiments (Urushihara, Stout, & Miller, 2004).
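The blocking result can be reproduced with a prediction-error learning rule. The sketch below uses the Rescorla-Wagner rule, a common formalization that these slides do not mention; the learning rate, trial counts, and function name are assumptions chosen for illustration. On each trial, every stimulus present is updated in proportion to the gap between what happened and what all the stimuli together predicted.

```python
def rescorla_wagner(trials, rate=0.3, max_strength=1.0):
    """Update associative strengths over a list of trials.

    Each trial is a pair (stimuli_present, ucs_present).
    """
    strength = {}
    for stimuli, ucs_present in trials:
        target = max_strength if ucs_present else 0.0
        # Prediction made by all stimuli present on this trial.
        predicted = sum(strength.get(s, 0.0) for s in stimuli)
        error = target - predicted
        # Every stimulus present shares the same prediction error.
        for s in stimuli:
            strength[s] = strength.get(s, 0.0) + rate * error
    return strength


phase1 = [({"light"}, True)] * 30            # light alone, then shock
phase2 = [({"light", "tone"}, True)] * 30    # light plus tone, then shock

blocked = rescorla_wagner(phase1 + phase2)   # light was trained first
control = rescorla_wagner(phase2)            # no earlier training

print(f"tone strength after prior light training: {blocked['tone']:.3f}")
print(f"tone strength with no prior training:     {control['tone']:.3f}")
```

Because the light already predicts the shock almost perfectly by the second phase, the error is near zero and the tone gains almost nothing, whereas in the control run the tone and light share the learning.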
Chaining Behaviour
 Ordinarily, you don’t do just one action and then stop. You do a long sequence
of actions.
 To produce sequences of learned behaviour, psychologists use a procedure
called chaining.
 Assume you want to train an animal, perhaps a guide dog or a show horse, to
go through a sequence of actions in a particular order.
 You could chain the behaviours, reinforcing each one with the opportunity to
engage in the next one. First, the animal learns the final behaviour for a
reinforcement. Then it learns the next-to-last behaviour, which is reinforced
by the opportunity to perform the final behaviour. And so on.
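The training order described above, last behaviour first, can be written out as a small sketch. The function name, the example behaviours, and the final reward are hypothetical.

```python
def backward_chain(sequence, final_reward):
    """Return (behaviour, reinforcer) training steps, last behaviour first."""
    steps = []
    for i in range(len(sequence) - 1, -1, -1):
        if i == len(sequence) - 1:
            # The final behaviour earns the actual reinforcement.
            reinforcer = final_reward
        else:
            # Earlier behaviours are reinforced by the chance to do the next one.
            reinforcer = f"opportunity to {sequence[i + 1]}"
        steps.append((sequence[i], reinforcer))
    return steps


plan = backward_chain(
    ["climb the ladder", "cross the bridge", "ring the bell"],
    final_reward="food",
)
for behaviour, reinforcer in plan:
    print(f"train '{behaviour}', reinforced by: {reinforcer}")
```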
Schedules of Reinforcement

The simplest procedure in operant conditioning is to
provide reinforcement for every correct response, a
procedure known as continuous reinforcement.

However, in the real world, unlike the laboratory,
continuous reinforcement is not common. Reinforcement
for some responses and not for others is known as
intermittent reinforcement.

We behave differently when we learn that only some of our
responses will be reinforced. Psychologists have investigated
the effects of many schedules of reinforcement, which
are rules or procedures for the delivery of reinforcement.

Four schedules for the delivery of intermittent
reinforcement are fixed ratio, fixed interval, variable ratio,
and variable interval. A ratio schedule provides
reinforcements depending on the number of responses. An
interval schedule provides reinforcements depending on the
timing of responses.
Fixed-Ratio Schedule
 A fixed-ratio schedule provides reinforcement only after a certain
(fixed) number of correct responses have been made—after every sixth
response, for example.
 We see similar behaviour among pieceworkers in a factory whose pay
depends on how many pieces they turn out or among fruit pickers who get
paid by the bushel.
 A fixed-ratio schedule tends to produce rapid and steady responding.
Researchers sometimes graph the results with a cumulative record, in
which the line is flat when the animal does not respond, and it moves up
with each response.
 For a fixed-ratio schedule, a typical result would look like the figure.
 However, if the schedule requires a large number of responses for
reinforcement, the individual pauses after each reinforced response. For
example, if you have just completed 10 calculus problems, you may pause
briefly before starting your next assignment. After completing 100
problems, you would pause even longer.
Variable-Ratio Schedule
 A variable-ratio schedule is similar to a fixed-ratio
schedule, except that reinforcement occurs after a variable
number of correct responses.
 For example, reinforcement may come after as few as one or
two responses or after a great many. Variable-ratio
schedules generate steady response rates.
 Variable-ratio schedules, or approximations of them, occur
whenever each response has about an equal probability of
success.
 For example, when you apply for a job, you might or might
not be hired. The more times you apply, the better your
chances, but you cannot predict how many applications you
need to submit before receiving a job offer.
Fixed-Interval Schedule
 A fixed-interval schedule provides reinforcement for the first
response made after a specific time interval.
 For instance, an animal might get food for only the first response it
makes after each 15-second interval.
 Then it would have to wait another 15 seconds before another
response would be effective. Animals (including humans) on such a
schedule learn to pause after each reinforcement and begin to
respond again toward the end of the time interval.
 The cumulative record would look like the figure. Checking your
mailbox is an example of behaviour on a fixed-interval schedule. If
your mail is delivered at about 3 P.M., and you are eagerly awaiting
an important package, you might begin to check around 2:30 and
continue checking every few minutes until it arrives.
Variable-Interval Schedule

With a variable-interval schedule, reinforcement is available after a
variable amount of time has elapsed.

For example, reinforcement may come for the first response after 2 minutes,
then for the first response after the next 7 seconds, then after 3 minutes 20
seconds, and so forth.

You cannot know how much time will pass before your next response is
reinforced.

Consequently, responses on a variable-interval schedule occur slowly but
steadily. Checking your e-mail is an example: A new message could appear at
any time, so you check occasionally but not constantly.

Stargazing is also reinforced on a variable-interval schedule. The
reinforcement for stargazing—finding a comet, for example—appears at
unpredictable intervals. Consequently, both professional and amateur
astronomers scan the skies regularly.
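The four schedules differ only in the rule that decides whether a given response is reinforced, which a short simulation can make explicit. This is a minimal sketch under stated assumptions: the class names are hypothetical, the variable-ratio schedule gives each response an equal chance of success, the variable-interval schedule draws its waiting times uniformly, and the subject simply responds once per second for a minute.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce each response with equal probability 1 / mean_n."""
    def __init__(self, mean_n, rng):
        self.p = 1.0 / mean_n
        self.rng = rng

    def respond(self, t):
        return self.rng.random() < self.p


class FixedInterval:
    """Reinforce the first response after a fixed time has elapsed."""
    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable time has elapsed."""
    def __init__(self, mean_interval, rng):
        self.mean = mean_interval
        self.rng = rng
        self.available_at = self._next_time(0.0)

    def _next_time(self, now):
        # Assumed distribution: uniform between 0 and twice the mean.
        return now + self.rng.uniform(0.0, 2.0 * self.mean)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = self._next_time(t)
            return True
        return False


def count_reinforcements(schedule, response_times):
    """Count how many of the responses the schedule reinforces."""
    return sum(schedule.respond(t) for t in response_times)


rng = random.Random(0)       # seeded so that runs are repeatable
times = range(1, 61)         # one response per second for 60 seconds
results = {
    "fixed_ratio": count_reinforcements(FixedRatio(6), times),
    "variable_ratio": count_reinforcements(VariableRatio(6, rng), times),
    "fixed_interval": count_reinforcements(FixedInterval(15), times),
    "variable_interval": count_reinforcements(VariableInterval(15, rng), times),
}
for name, count in results.items():
    print(f"{name}: {count} reinforcements")
```

On the ratio schedules, responding faster would earn more reinforcements in the same time; on the interval schedules, extra responses before the interval ends earn nothing, which fits the pauses and slower responding described above.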
Cognitive Social Theory
Introduction
 By the 1960s, many researchers and theorists had begun to wonder whether a
psychological science could be built strictly on observable behaviours without
reference to thoughts.
 Most agreed that learning is the basis of much of human behaviour, but some
were not convinced that classical and operant conditioning could explain
everything people do.
 From behaviourist learning principles thus emerged cognitive–social
theory (sometimes called cognitive–social learning or cognitive–
behavioural theory), which incorporates concepts of conditioning but adds
two new features:
 a focus on cognition and
 a focus on social learning.
Learning and Cognition
 According to cognitive–social theory, the way an animal construes the
environment is as important to learning as actual environmental
contingencies.
 That is, humans and other animals are always developing mental
images of, and expectations about, the environment, and these
cognitions influence their behaviour.
Preparedness and Phobias
 According to Martin Seligman (1971) and other theorists (Öhman, 1979;
Öhman, Dimberg, & Öst, 1985), evolution has also programmed
organisms to acquire certain fears more readily than others because of a
phenomenon called preparedness.
 Preparedness involves species-specific predispositions to be
conditioned in certain ways and not others.
Learned Helplessness
 The powerful impact of expectancies on the behaviour of nonhuman animals was dramatically
demonstrated in a series of studies by Martin Seligman (1975).
 Seligman harnessed dogs so that they could not escape electric shocks. At first the dogs howled,
whimpered, and tried to escape the shocks, but eventually they gave up; they would lie on the floor
without struggle, showing physiological stress responses and behaviours resembling human
depression.
 A day later Seligman placed the dogs in a shuttle-box from which they could easily escape the shocks.
 Unlike dogs in a control condition who had not been previously exposed to inescapable shocks, the
dogs in the experimental condition made no effort to escape and generally failed to learn to do so even
when they occasionally did escape.
 The dogs had come to expect that they could not get away; they had learned to be helpless. Learned
helplessness consists of the expectancy that one cannot escape aversive events and the motivational
and learning deficits that result from this belief.
Explanatory Style
 Seligman argued that learned helplessness is central to human
depression as well. In humans, however, learned helplessness is not an
automatic outcome of uncontrollable aversive events.
 Seligman and his colleagues observed that some people have a positive,
active coping attitude in the face of failure or disappointment, whereas
others become depressed and helpless (Peterson, 2000; Peterson &
Seligman, 1984).
 They demonstrated in dozens of studies that explanatory style plays a
crucial role in whether or not people become, and remain, depressed.
Explanatory Style
 Individuals with a depressive or pessimistic explanatory style blame themselves
for the bad things that happen to them.
 In the language of helplessness theory, pessimists believe the causes of their
misfortune are internal rather than external, leading to lowered self-esteem.
 They also tend to see these causes as stable (unlikely to change) and global (broad,
general, and widespread in their impact).
 When a person with a pessimistic style does poorly on a biology exam, he may blame
it on his own stupidity—an explanation that is internal, stable, and global.
 Most people, in contrast, would offer themselves explanations that permit hope and
encourage further effort, such as “I didn’t study hard enough.”
Observational Learning
 Albert Bandura, Dorothea Ross, and Sheila Ross
(1963) studied the role of imitation for learning
aggressive behavior.
 They asked two groups of children to watch
films in which an adult or a cartoon character
violently attacked an inflated “Bobo” doll.
 Another group watched a different film. The researchers
then left the children in a room with a Bobo doll.
 Only the children who had watched films with
attacks on the doll attacked the doll themselves,
using many of the same movements they had
just seen.
 The clear implication is that children copy the
aggressive behavior they have seen in others.
Basic Processes
 Bandura has identified four key processes that are crucial in observational learning. The first two—
attention and retention—highlight the importance of cognition in this type of learning.
 Attention. To learn through observation, you must pay attention to another person’s behaviour and its
consequences.
 Retention. You may not have occasion to use an observed response for weeks, months, or even years.
Thus, you must store a mental representation of what you have witnessed in your memory.
 Reproduction. Enacting a modelled response depends on your ability to reproduce the response by
converting your stored mental images into overt behaviour. This step may not be easy for some
responses. For example, most people cannot execute a breath-taking windmill dunk after watching
Derrick Rose do it in a basketball game.
 Motivation. Finally, you are unlikely to reproduce an observed response unless you are motivated to
do so. Your motivation depends on whether you encounter a situation in which you believe that the
response is likely to pay off for you.
1. Contagion
 Contagion, a phenomenon in which a response by one individual tends
to elicit the same response in others, might be mistaken for
observational learning.
 For example, perhaps you’ve noticed that when one person yawns,
others tend to yawn also. Have they learned to yawn by observing
others? Obviously not.
2. Classical Conditioning
 One might mistake classical conditioning for observational learning.
 Suppose Michelle is in the garage with her mother when a mouse scurries by.
 Her mother screams and jumps away. This might cause Michelle to be afraid
of mice, but not necessarily because she learned her mother’s fear.
 It could be that her mother’s scream scared Michelle (just like the loud noise
scared little Albert), and because the mouse was also present (just like the rat
for little Albert) Michelle might learn to fear mice.
3. Stimulus enhancement
 Behaviors that are due to stimulus enhancement might be mistaken for observational learning.
 Stimulus enhancement, as the name implies, occurs when attention is directed to a stimulus, such
as when an illusionist says, “Keep your eyes on the red ball.” How could this be mistaken for
observational learning?
 Well, suppose one night we discover that a raccoon has learned how to open a garbage can, and, much
to our dismay, the following night many raccoons have opened many garbage cans.
 We might assume that they learned how to do this by watching the first raccoon, but that might not be
what happened.
 It could be that the behavior of the first raccoon caused the other raccoons to realize that garbage cans
might hold some tasty treasures—pizza crusts, fried chicken skins, and half-eaten jellyrolls.
 This might have emboldened the other raccoons to try to open garbage cans, and after a bit of effort
they might have figured out how to do it. Thus, the first raccoon might not have taught them how to
open garbage cans but might simply have directed their attention to the garbage cans.
Self-Reinforcement and Self-Punishment
 We learn by observing others who are doing what we would like to do. If our
sense of self-efficacy is strong enough, we try to imitate their behavior.
 But actually succeeding often requires prolonged efforts. People typically set
a goal for themselves and monitor their progress toward that goal.
 They provide reinforcement or punishment for themselves, just as if they
were training someone else.
 They say to themselves, “If I finish this math assignment on time, I’ll treat
myself to a movie and a new magazine. If I don’t finish on time, I’ll make
myself clean the stove and the sink.” (Nice threat, but people usually forgive
themselves without imposing the punishment.)
Questions?