Conditioning and Learning
Learning: Some Key Terms
• Learning: Relatively permanent change in
behavior due to experience
– Does not include temporary changes due to disease,
injury, maturation, or drugs, since these do
NOT qualify as learning
• Reinforcement: Any event that increases the
probability that a response will recur
• Response: Any identifiable behavior
– Internal: Faster heartbeat
– Observable: Eating, scratching
• Antecedents: Events that precede a response
• Consequences: Effects that follow a response
Classical Conditioning and Ivan Pavlov
• Russian physiologist who initially was
studying digestion
• Used dogs to study salivation when dogs were
presented with meat powder
• Also known as Pavlovian or Respondent
Conditioning
The classical conditioning procedure.
An apparatus for Pavlovian conditioning. A tube carries saliva from the dog’s mouth to a lever
that activates a recording device (far left). During conditioning, various stimuli can be paired
with a dish of food placed in front of the dog. The device pictured here is more elaborate than
the one Pavlov used in his early experiments.
Principles of Classical Conditioning
• Acquisition: Training period when a
response is strengthened
• Expectancy: Anticipation about future
events or relationships
• Extinction: Weakening of a conditioned
response through removal of reinforcement
• Spontaneous Recovery: Reappearance of a
learned response following apparent
extinction
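The acquisition and extinction principles above can be sketched numerically. This is a minimal illustration only: it assumes a Rescorla-Wagner style update rule, which these slides do not name, and the learning rate and trial counts are arbitrary values chosen for the example. It does not attempt to model spontaneous recovery.

```python
def update_strength(strength, us_present, learning_rate=0.3, max_strength=1.0):
    """One conditioning trial: move response strength toward its target.

    Assumed rule (Rescorla-Wagner style, not taken from the slides): the
    target is max_strength when the US follows the CS (reinforcement) and
    0 when the CS is presented alone (reinforcement removed).
    """
    target = max_strength if us_present else 0.0
    return strength + learning_rate * (target - strength)


def run_trials(strength, us_present, n_trials, learning_rate=0.3):
    """Return the response strength after each of n_trials identical trials."""
    history = []
    for _ in range(n_trials):
        strength = update_strength(strength, us_present, learning_rate)
        history.append(strength)
    return history


# Acquisition: CS paired with US, so the conditioned response strengthens.
acquisition = run_trials(0.0, us_present=True, n_trials=10)
# Extinction: CS alone, so the conditioned response weakens.
extinction = run_trials(acquisition[-1], us_present=False, n_trials=10)

print("acquisition:", [round(v, 2) for v in acquisition])
print("extinction: ", [round(v, 2) for v in extinction])
```

Under these assumptions, strength rises quickly at first and then levels off during acquisition, and it decays toward zero once the US is withheld.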
Principles of Classical Conditioning Continued
• Stimulus Generalization: A tendency to
respond to stimuli that are similar, but not
identical, to a conditioned stimulus, e.g.
responding to a buzzer, or a hammer
banging, when the conditioned stimulus
was a bell
• Stimulus Discrimination: The ability to
respond differently to various stimuli.
– E.g. Rudy will respond differently to various
bells (alarms, school, timer)
Generalization gradients. In a study of stimulus generalization, an organism is typically
conditioned to respond to a specific CS, such as a 1200 hertz tone, and then tested with similar
stimuli, such as other tones between 400 and 2000 hertz. Graphs of the organisms’ responding
are called generalization gradients. The graphs normally show, as depicted here, that
generalization declines as the similarity between the original CS and the new stimuli decreases.
When an organism gradually learns to discriminate between a CS and similar stimuli, the
generalization gradient tends to narrow around the original CS.
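A generalization gradient like the one described in this caption can be sketched as a function of the distance between a test tone and the CS. The Gaussian shape and the two width values below are assumptions made for illustration; the caption states only that responding declines as similarity decreases and that the gradient narrows with discrimination learning.

```python
import math


def generalization_response(test_hz, cs_hz=1200.0, width_hz=300.0):
    """Relative response strength (0 to 1) to a test tone.

    Assumption: the gradient is Gaussian around the CS frequency. A smaller
    width_hz stands in for a gradient narrowed by discrimination learning.
    """
    distance = test_hz - cs_hz
    return math.exp(-(distance ** 2) / (2 * width_hz ** 2))


test_tones = [400, 800, 1200, 1600, 2000]
broad = [round(generalization_response(hz, width_hz=300.0), 2) for hz in test_tones]
narrow = [round(generalization_response(hz, width_hz=100.0), 2) for hz in test_tones]

print("before discrimination learning:", broad)
print("after discrimination learning: ", narrow)
```

Both lists peak at the 1200 hertz CS; the narrower width gives much weaker responding to the neighboring tones.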
Higher order conditioning takes place when a well-learned conditioned stimulus is used as if it
were an unconditioned stimulus. In this example, a child is first conditioned to salivate to the
sound of a bell. In time, the bell will elicit salivation. At that point, you could clap your
hands and then ring the bell. Soon, after repeating the procedure, the child would learn to
salivate when you clapped your hands.
Classical Conditioning in Humans
• Phobia: Intense, irrational fear of a specific
situation or object e.g. arachnophobia (fear
of spiders; see the movie!)
• Conditioned Emotional Response: Learned
emotional reaction to a previously neutral
stimulus
• Desensitization: Gradually exposing phobic
people to feared stimuli while they stay
calm and relaxed
• Vicarious Classical Conditioning: When we
learn to respond emotionally to a stimulus
by observing another’s emotional reactions
In classical conditioning, a stimulus that does not produce a response is paired with a
stimulus that does elicit a response. After many such pairings, the stimulus that previously had
no effect begins to produce a response. In the example shown, a horn precedes a puff of air to
the eye. Eventually, the horn alone will produce an eye-blink. In operant conditioning, a
response that is followed by a reinforcing consequence becomes more likely to occur on future
occasions. In the example shown, a dog learns to sit up when it hears a whistle.
Operant Conditioning
• Definition: Learning based on the consequences of
responding
• Law of Effect (Thorndike): Responses that lead to
desired effects are repeated; those that lead to
undesired effects are not
• Operant Reinforcer: Any event that follows a
response and increases its likelihood of recurring
• Conditioning Chamber (Skinner Box): Apparatus
designed to study operant conditioning
• Response-Contingent Reinforcement:
Reinforcement given only when a particular
response occurs
The Skinner box. This simple device, invented by B. F. Skinner, allows careful study of operant
conditioning. When the rat presses the bar, a pellet of food or a drop of water is automatically
released. (A photograph of a Skinner box appears in Chapter 2.)
Timing of Reinforcement
• Operant reinforcement is most effective when
given immediately after a correct response.
Effectiveness of reinforcement is inversely
related to the time elapsed after the correct
response occurs
• Superstitious Behavior: Behavior that is
repeated because it seems to produce
reinforcement, even though it is not
actually necessary to obtain it
• Shaping: Gradually molding responses toward
a desired final pattern, step by step, by
reinforcing successive approximations
The effect of delay of reinforcement. Notice how rapidly the learning score drops when reward
is delayed. Animals learning to press a bar in a Skinner box showed no signs of learning if
food reward followed a bar press by more than 100 seconds. (Perin, 1943.)
Consequences: Reinforcement and Punishment
• Increasing a response:
– Positive reinforcement = response followed by
a rewarding stimulus
– Negative reinforcement = response followed by
removal of an aversive stimulus
• Escape learning
• Avoidance learning
• Decreasing a response:
– Punishment
• Problems with punishment
The effect of punishment on extinction. Immediately after punishment, the rate of bar pressing
is suppressed, but by the end of the second day, the effects of punishment have disappeared.
(After B. F. Skinner, The Behavior of Organisms. © 1938. D. Appleton-Century Co., Inc.
Reprinted by permission of Prentice-Hall, Inc.)
Types of reinforcement and punishment. The impact of an event depends on whether it is
presented or removed after a response is made. Each square defines one possibility: Arrows
pointing upward indicate that responding is increased; downward-pointing arrows indicate that
responding is decreased. (Adapted from Kazdin, 1975.)
Positive reinforcement versus negative reinforcement. In positive reinforcement, a response
leads to the presentation of a rewarding stimulus. In negative reinforcement, a response leads
to the removal of an aversive stimulus. Both types of reinforcement involve favorable
consequences and both have the same effect on behavior: The organism’s tendency to emit
the reinforced response is strengthened.
Comparison of negative reinforcement and punishment. Although punishment can occur when
a response leads to the removal of a rewarding stimulus, it more typically involves the
presentation of an aversive stimulus. Students often confuse punishment with negative
reinforcement because they associate both with aversive stimuli. However, as this diagram
shows, punishment and negative reinforcement represent opposite procedures that have
opposite effects on behavior.
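The four possibilities described in the captions above (a rewarding or aversive stimulus that is either presented or removed after a response) can be written as a small lookup table. This is only a sketch: the function name and the string labels are hypothetical choices for the example, not terms taken from the figures.

```python
def classify_consequence(stimulus, change):
    """Name the procedure and its effect on responding.

    stimulus: "rewarding" or "aversive"
    change:   "presented" or "removed" (what happens after the response)
    Returns (procedure, effect_on_responding).
    """
    table = {
        ("rewarding", "presented"): ("positive reinforcement", "increases"),
        ("aversive", "removed"): ("negative reinforcement", "increases"),
        ("aversive", "presented"): ("punishment", "decreases"),
        ("rewarding", "removed"): ("punishment (by removal)", "decreases"),
    }
    return table[(stimulus, change)]


for stimulus in ("rewarding", "aversive"):
    for change in ("presented", "removed"):
        procedure, effect = classify_consequence(stimulus, change)
        print(f"{stimulus} stimulus {change}: {procedure}, responding {effect}")
```

Negative reinforcement and the typical form of punishment both involve an aversive stimulus, but they sit in different cells and push responding in opposite directions.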
Types of Reinforcers
• Primary Reinforcer: Unlearned; satisfies
biological needs, e.g. food, water, sex
• Secondary Reinforcer: Learned reinforcer,
e.g. money, grades, approval
• Token Reinforcer: Tangible secondary
reinforcer e.g. money, gold stars, poker or
casino chips
Reinforcement in a token economy. This graph shows the effects of using tokens to reward socially
desirable behavior in a mental hospital ward. Desirable behavior was defined as cleaning, bed
making, attending therapy sessions, and so forth. Tokens earned could be exchanged for basic
amenities such as meals, snacks, coffee, game-room privileges, or weekend passes. The graph shows
more than 24 hours per day because it represents the total number of hours of desirable behavior
performed by all patients in the ward. (Adapted from Ayllon & Azrin, 1965.)
Schedules of Reinforcement
• Continuous reinforcement
• Intermittent (partial)
reinforcement
• Ratio schedules
– Fixed
– Variable
• Interval schedules
– Fixed
– Variable
Schedules of reinforcement and patterns of response. Each type of reinforcement schedule
tends to generate a characteristic pattern of responding. In general, ratio schedules tend to
produce more rapid responding than interval schedules (note the steep slopes of the FR and VR
curves). In comparison to fixed schedules, variable schedules tend to yield steadier responding
(note the smoother lines for the VR and VI schedules on the right) and greater resistance to
extinction.
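The four intermittent schedules can be sketched as rules that decide whether a given response earns a reinforcer. The definitions used here are the standard ones (ratio schedules count responses, interval schedules count elapsed time); the function names, the uniform random ranges for the variable schedules, and the one-response-per-second pacing are assumptions made for this example, and the sketch does not model how an animal's response rate changes under each schedule.

```python
import random


def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: reinforce after a varying number of responses that averages mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(_time):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response made once `seconds` have elapsed."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait varies around mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_reinforcers(schedule, n_responses, seconds_per_response=1.0):
    """Feed evenly spaced responses to a schedule and count the reinforcers."""
    return sum(schedule((i + 1) * seconds_per_response) for i in range(n_responses))


rng = random.Random(0)  # seeded so the variable schedules are repeatable
print("FR-5: ", count_reinforcers(fixed_ratio(5), 60))
print("VR-5: ", count_reinforcers(variable_ratio(5, rng), 60))
print("FI-10:", count_reinforcers(fixed_interval(10.0), 60))
print("VI-10:", count_reinforcers(variable_interval(10.0, rng), 60))
```

Continuous reinforcement is the special case `fixed_ratio(1)`, where every response is reinforced.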
Punishment
• Timing, consistency and intensity are key
• Severe Punishment: Intense punishment,
capable of suppressing a response for a long
period
• Mild Punishment: Weak punishment;
usually only temporarily slows responses
Punishment Concepts
• Aversive Stimulus: Stimulus that is painful
or uncomfortable e.g. a shock
• Escape Learning: Learning to make a
response in order to end an aversive
stimulus
• Avoidance Learning: Learning to make a
response to avoid, postpone or prevent
discomfort e.g. not going to a doctor or
dentist
• Punishment may also increase aggression
Escape and avoidance learning. (a) Escape and avoidance learning are often studied with a shuttle
box like that shown here. Warning signals, shock, and the animal’s ability to flee from one
compartment to another can be controlled by the experimenter. (b) According to Mowrer’s
two-process theory, avoidance begins because classical conditioning creates a conditioned fear
that is elicited by the warning signal (panel 1). Avoidance continues because it is maintained by
operant conditioning (panel 2). Specifically, the avoidance response is strengthened through
negative reinforcement, since it leads to removal of the conditioned fear.
Cognitive Learning
• Higher level learning involving thinking,
knowing, understanding and anticipation
• Latent Learning: Occurs without obvious
reinforcement and is not demonstrated until
reinforcement is provided
• Rote Learning: Takes place mechanically,
through repetition and memorization, or by
learning a set of rules
• Discovery Learning: Based on insight and
understanding
Latent learning. (a) The maze used by Tolman and Honzik to demonstrate latent learning by
rats. (b) Results of the experiment. Notice the rapid improvement in performance that occurred
when food was made available to the previously unreinforced animals. This indicates that
learning had occurred, but that it remained hidden or unexpressed. (Adapted from Tolman &
Honzik, 1930.)
Modeling or Observational Learning (Albert Bandura)
• Occurs by watching and imitating actions of another
person, or by noting consequences of a person’s actions
– Occurs before direct practice is allowed
• Steps to Successful Modeling
– Pay attention to model
– Remember what was done
– Be able to reproduce the modeled behavior
– If the reproduced behavior succeeds or is rewarded, it is
more likely to recur
• Bandura developed modeling theory with his classic Bobo
doll experiments
– Bobo doll: Inflatable clown doll
This graph shows the average number of aggressive acts per minute before and after
television broadcasts were introduced into a Canadian town. The increase in aggression after
television watching began was significant. Two other towns that already had television were
used for comparison. Neither showed significant increases in aggression during the same time
period. (Data compiled from Joy et al., 1986.)
Self-Managed Behavior
• Premack Principle: Any high-frequency
response can be used to reinforce a low-frequency
response
– E.g. no Game Boy until you finish your
homework
• Self-Recording: Self-management based on
keeping records of response frequencies
• Behavioral Contract: Formal agreement
stating behaviors to be changed and
consequences that apply; written contract
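Self-recording and the Premack principle can be combined in a short sketch: tally how often each activity occurs, then pair the most frequent activity (the reinforcer) with the least frequent one (the behavior to increase). The function name and the sample log are hypothetical, made up for this illustration.

```python
from collections import Counter


def premack_pairing(activity_log):
    """Return (reinforcer, target) from a self-recorded list of activities.

    reinforcer: the most frequently recorded activity
    target:     the least frequently recorded activity
    """
    ranked = Counter(activity_log).most_common()  # sorted from most to least frequent
    if len(ranked) < 2:
        raise ValueError("need at least two distinct activities")
    return ranked[0][0], ranked[-1][0]


# Hypothetical self-recorded log: one entry per occurrence of an activity.
log = ["video games", "video games", "homework", "video games", "reading", "reading"]
reinforcer, target = premack_pairing(log)
print(f"Allow '{reinforcer}' only after '{target}' is done.")
```

A behavioral contract would then state this pairing in writing, along with the consequences that apply.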
Biology Influencing Behavior
• Fixed Action Pattern (FAP): Instinctual
chain of movements found in almost all
members of a species
• Innate Behavior: Inborn, unlearned behavior
e.g. breathing, reflexes
• Species-Specific Behavior: Behavior
patterns that occur with little variation in
almost all members of a species
• Species-Typical Behavior: Behavior
patterns typical of a species but NOT
automatic
Changes in Our Understanding of Conditioning
• Biological Constraints on
Conditioning
– Instinctive Drift
– Conditioned Taste Aversion
– Preparedness and Phobias
• Cognitive Influences on
Conditioning
– Signal relations
– Response-outcome relations
Conditioned taste aversion. Taste aversions can be established through classical conditioning,
as in the “sauce béarnaise syndrome.” However, as the text explains, taste aversions can be
acquired in ways that seem to violate basic principles of classical conditioning.
Garcia and Koelling’s research on conditioned taste aversion. In a landmark series of studies,
Garcia and Koelling (1966) demonstrated that some stimulus-response associations are much
easier to condition than others. (a) Their procedure allowed them to pair a taste stimulus
(saccharin-flavored water) with visual and auditory stimuli (a bright light and noisy buzzer),
and/or pain-inducing shock or nausea-inducing radiation. (b) They found that taste-nausea
associations were acquired easily, as were associations between auditory-visual stimuli and
pain, whereas other associations were difficult to acquire. As your text discusses, they
explained their findings in terms of evolutionary considerations.