Operant Conditioning
Classical v. Operant Conditioning
  Both classical and operant conditioning use acquisition,
extinction, spontaneous recovery, generalization, and
discrimination.
  Classical conditioning uses reflexive behavior - behavior that
occurs as an automatic response to some stimulus that comes
before the behavior.
  Ask: Is the behavior something the animal does NOT control?
YES. Does the animal have a choice in how to behave? NO. Classical conditioning.
  Operant conditioning uses operant, or voluntary, behavior –
behavior that is shaped by consequences that come
after the behavior.
  Ask: Is the behavior something the animal can control? YES.
Does the animal have a choice in how to behave? YES. Operant Conditioning.
What is Operant
Conditioning?
Operant Conditioning
• Learning in which the frequency of a behavior
depends on the consequence that follows
that behavior
• The frequency will increase if the
consequence is reinforcing to the subject.
• The frequency will decrease if the
consequence is not reinforcing, or is
punishing, to the subject.
The Law of Effect
Edward L. Thorndike (1874–1949)
Edward Thorndike (1874-1949)
• Author of the law of effect
• Behaviors with favorable consequences
will occur more frequently.
• Behaviors with unfavorable consequences
will occur less frequently.
• Created puzzle boxes for research on cats
Thorndike’s Puzzle Box
• “Thorndike’s Puzzle Box” Video #8
from Worth’s Digital Media Archive
for Psychology. (2 min)
Thorndike’s Puzzle Box
Early Operant Conditioning
• E. L. Thorndike (1898)
• Puzzle boxes and cats
[Figure: responses to the situation (stimuli inside of puzzle box) on the first trial in the box and after many trials in the box. Responses shown: scratch at bars, push at ceiling, dig at floor, howl, etc., press lever.]
B. F. Skinner (1904–1990)
B.F. Skinner (1904-1990)
• Believed that internal factors like thoughts,
emotions, and beliefs could not be used to
explain behavior. Instead, he said that new
behaviors are actively chosen by the organism
• Looked at “Operants” or active behaviors that
are used on the environment to generate
consequences
• Developed the fundamental principles and
techniques of operant conditioning and devised
ways to apply them in the real world
• Designed the Skinner Box, or operant chamber
The Skinner Box
Skinner’s Air Crib:
A room fit for a…Baby!
Reinforcement/Punishment
• Reinforcement - Any consequence that
increases the likelihood of the behavior it
follows
– Reinforcement is ALWAYS GOOD!!!
• Punishment - Any consequence that decreases
the likelihood of the behavior it follows
• The subject determines if a consequence is
reinforcing or punishing
Types of
Reinforcement:
- Always GOOD
Positive Reinforcement
• Strengthens a response by presenting a
stimulus that you like after a response
• Anything that increases the likelihood of a
behavior by following it with a desirable
event or state
• The subject receives something they want
(added)
• Will strengthen the behavior
Positive Reinforcement
Negative Reinforcement
• Strengthens a response by reducing or removing
an aversive (disliked) stimulus
• Anything that increases the likelihood of a
behavior by following it with the removal of an
undesirable event or state
• Something the subject doesn’t like is removed
(subtracted)
• Will strengthen the behavior
• Negative reinforcement allows you to either:
– Escape something you don’t like that is already
present (negative reinforcement by escape)
– Avoid something before it occurs (negative reinforcement by
avoidance)
Negative Reinforcement
Positive/Negative Reinforcement
BOTH ARE GOOD THINGS!!!
Punishment:
Always BAD
Types of Punishment
• An undesirable event following a
behavior
• Behavior ends a desirable event or
state
• Its effect is opposite that of
reinforcement – it decreases the
frequency of behavior
Positive Punishment
(Punishment by Application)
• Something is added to the environment you
do NOT like.
• A verbal reprimand or something painful
like a spanking (See examples on pg. 211)
Negative Punishment
(Punishment by Removal)
• Something is taken away that you DO LIKE.
• Lose a privilege. (See examples on pg. 212)
The Good Effects of Punishment
• Punishment can effectively control certain
behaviors if…
– It comes immediately after the undesired
behavior
– It is consistent and not occasional
• Especially useful if teaching a child not to
do a dangerous behavior
• Most still suggest reinforcing an
incompatible behavior rather than using
punishment
Bad Effects of Punishment
• Does not teach or promote alternative, acceptable
behavior.
• Only tells what NOT to do while reinforcement
tells what to do.
• Doesn’t prevent the undesirable behavior when
away from the punisher in a “safe setting”
• Can lead to fear of the punisher, anxiety, and
lower self-esteem
• Children who are punished physically may learn
to use aggression as a means to solve problems.
Reinforcement vs. Punishment
Stimulus is presented or added to animal’s environment…
• Reinforcing/Desirable Stimulus: Positive (+) Reinforcement. Add something you DO LIKE. Behavior Increases
• Aversive/Undesirable Stimulus: Positive (+) Punishment. Add something you DO NOT LIKE. Behavior Decreases
Stimulus is removed or taken away from animal’s environment…
• Reinforcing/Desirable Stimulus: Negative (-) Punishment. TAKES AWAY something you DO LIKE. Behavior Decreases
• Aversive/Undesirable Stimulus: Negative (-) Reinforcement. TAKES AWAY something you DO NOT LIKE. Behavior Increases
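The four cells of the Reinforcement vs. Punishment chart follow from two questions: was the stimulus added or removed, and does the subject like it? The sketch below encodes that lookup in Python. The function name `classify_consequence` and its argument names are made up for illustration and do not come from the slides.

```python
from typing import Tuple


def classify_consequence(stimulus_change: str, subject_likes_stimulus: bool) -> Tuple[str, str]:
    """Name the consequence type and its effect on behavior.

    stimulus_change: "added" (positive) or "removed" (negative).
    subject_likes_stimulus: True if the subject finds the stimulus desirable.
    Both argument names are illustrative assumptions.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")

    # "Positive" and "negative" only describe adding or taking away a stimulus.
    sign = "Positive" if stimulus_change == "added" else "Negative"

    # Adding something liked, or removing something disliked, is reinforcing.
    reinforcing = (stimulus_change == "added") == subject_likes_stimulus

    kind = "Reinforcement" if reinforcing else "Punishment"
    effect = "Behavior Increases" if reinforcing else "Behavior Decreases"
    return f"{sign} {kind}", effect


if __name__ == "__main__":
    for change in ("added", "removed"):
        for likes in (True, False):
            print(change, likes, classify_consequence(change, likes))
```

Because the subject determines whether a consequence is reinforcing or punishing, `subject_likes_stimulus` has to be judged from the subject's point of view, not the trainer's.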
How is Neg. Reinforcement
different from Punishment?
• Negative Reinforcement will always
increase a behavior
• Punishment will always decrease a behavior
• Negative Reinforcement is something YOU
DO to take away something bad.
• Punishment is something DONE TO YOU
that is bad and makes you stop doing a
behavior.
How are Punishment & Reinforcement being used to treat severely
autistic and/or violent children?
See CNN video clip from Anderson Cooper 360.
Do you think they should be using these conditioning methods on
these kids?
Primary Versus
Secondary
Reinforcement
Primary Reinforcement
• Something that is naturally satisfying
• Examples: food, warmth, water, etc.
• The item is reinforcing in and of itself
• If on a deserted island, these are what you’d
want!
Conditioned/Secondary
Reinforcement
• Something that a person has learned
to value or finds rewarding because it
is paired or associated with a primary
reinforcer
• Money is a good example.
• So are grades and signs of respect &
approval.
Immediate Versus
Delayed
Reinforcement
Immediate Reinforcers
• Immediate reinforcers – the behavior that
immediately precedes the reinforcer
becomes more likely to occur
– (This is true when training animals. You can’t
wait for a long time before reinforcing or
the animal won’t know what behavior
you are reinforcing.)
Delayed Reinforcers
• Also called Delayed Gratification –
forgoing a small immediate reinforcement
for a greater reinforcement later.
• Humans do this with paychecks, grades.
• When do we not do this?
• Stay up late to watch TV when next day
we’re tired
• Smoke for satisfaction now when later it
will kill us
Immediate/Delayed
Reinforcement
• Immediate reinforcement is more
effective than delayed reinforcement
• Ability to delay gratification predicts
higher achievement
Discriminative Stimuli
• An environmental stimulus that triggers you to do
a certain behavior that will have a consequence.
• In the presence of a specific environmental
stimulus (the discriminative stimulus) we emit a
particular behavior (the operant) which is
followed by a consequence (reinforcement or
punishment)
• Example: A ringing phone is a discriminative
stimulus that sets off a particular response of picking
it up and speaking into it
Extinction
• In operant conditioning, the loss of a
conditioned behavior when
consequences no longer follow it.
• The subject no longer responds since
the reinforcement or punishment has
stopped.
Thoughts from Skinner:
• Skinner believed from the moment of birth, the environment shapes
and determines your behavior through reinforcing or punishing
consequences.
• “A person does not act upon the world, the world acts upon him.”
(Read Critical Thinking Box on pg. 214-215 for more)
• “B.F. Skinner Interview” (4 min) – Discusses Schedules of
Reinforcement & Free Will
– Video #9 from Worth’s Digital Media Archive for Psychology.
Parts of Operant Conditioning
(See Chart on page 215)
• Discriminative Stimulus: Specific environmental stimulus
– Gas gage on empty
– Wallet on sidewalk
• Operant Response: Voluntary behavior
– Fill car with gas
– Give Wallet to Security
• Consequence: Event that will make the operant response more or less likely to reoccur
– Avoid running out of gas.
– Get $50 Reward
• Effect on Future Behavior
– If reinforcement = more likely to reoccur
– If punishment = less likely to reoccur
Some Reinforcement
Procedures:
Shaping
Shaping
• Reinforcement of behaviors that are
more and more similar to the one you
want to occur
• Technique used to establish a new
behavior
Shaping Principles
• Skinner box - soundproof box with a bar that an animal
presses or pecks to release a food or water reward, and a
device that records these responses.
• Shaping - procedure in which rewards, such as food,
gradually guide an animal’s behavior toward a desired
behavior.
• Successive approximations - shaping method in which
you reward responses that are ever closer to the final
desired behavior and ignore all other responses.
• Shaping can show what nonverbal animals perceive:
train an animal to discriminate between classes of events
or objects.
– After being trained to discriminate between flowers, people,
cars, and chairs, a pigeon can usually identify in which of these
categories a new pictured object belongs
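Successive approximations can be pictured as a criterion that is raised a little each time the animal meets it. The Python sketch below is a toy illustration under stated assumptions: responses are scored as single numbers (for example, how high a head is lifted), and the function name `shape` and the parameters `start_criterion`, `target`, and `step` are hypothetical, not taken from the slides.

```python
from typing import Iterable, List, Tuple


def shape(responses: Iterable[float], start_criterion: float,
          target: float, step: float) -> Tuple[List[bool], float]:
    """Reinforce responses that meet the current criterion, then raise it.

    Returns one True/False per response (reinforced or ignored) and the
    criterion in force at the end. All names here are illustrative.
    """
    criterion = start_criterion
    reinforced: List[bool] = []
    for response in responses:
        if response >= criterion:
            # Close enough to the desired behavior: reward it...
            reinforced.append(True)
            # ...and ask for a slightly closer approximation next time,
            # never demanding more than the final desired behavior.
            criterion = min(target, criterion + step)
        else:
            # All other responses are ignored (no reward).
            reinforced.append(False)
    return reinforced, criterion


if __name__ == "__main__":
    flags, final = shape([1, 2, 2, 3, 5, 4, 6], start_criterion=2, target=6, step=2)
    print(flags, final)
```

In the usage example, the response of 2 is rewarded early on, but the same response is ignored once the criterion has moved to 4.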
Skinner attached some horizontal stripes to the wall which he then used to gauge the
dog's responses of lifting its head higher and higher. Then, he simply set about shaping
a jumping response by flashing the strobe (and simultaneously taking a picture),
followed by giving a meat treat, each time the dog satisfied the criterion for
reinforcement. The result of this process is shown below, as it was in LOOK magazine,
in terms of the pictures taken at different points in the shaping process. Within 20
minutes, Skinner had Agnes "running up the wall"
For the second shaping demonstration, Skinner trained Agnes to press the
pedal and pop the top on the wastebasket. Again, the photographer's flash
served as the conditioned reinforcer, and each step in the process was
photographed. The results are shown below.
Schedules of
Reinforcement
Continuous reinforcement
• A schedule of reinforcement in which a
reward follows every correct response
• Learning occurs rapidly
• But the behavior will extinguish quickly
once the reinforcement stops.
– Once that reliable candy machine eats your
money twice in a row, you stop putting money
into it.
Partial Reinforcement
• A schedule of reinforcement in which a
reward follows only some correct responses
• Learning of behavior will take longer
• But will be more resistant to extinction
• Includes the following types:
– Fixed-interval and variable-interval
– Fixed-ratio and variable-ratio
Fixed-Ratio Schedule
• A partial reinforcement schedule that
rewards a response only after some set
number of correct responses
• The faster the subject responds, the more
reinforcements they will receive.
• Example: piecework. You get $5 for every 10
widgets you make.
Variable-Ratio Schedule
• A partial reinforcement schedule that rewards a
response after an unpredictable number of correct
responses
• High rates of responding with little pause in order
to increase chances of getting reinforcement
• This schedule is very resistant to extinction.
• Vegas Rules! Sometimes called the “gambler’s
schedule”; similar to a slot machine or fishing
Fixed-Interval Schedule
• A partial reinforcement schedule that rewards only the first
correct response after some set period of time
• Produces slow responding at first, with the rate increasing as you get
closer to the time of reinforcement
• “Procrastinator Schedule”
• Example: a known weekly quiz in a class, checking
cookies after the 10 minute baking period.
Variable-Interval Schedule
• A partial reinforcement schedule that rewards the first
correct response after an unpredictable
amount of time
• Produces slow and steady responses
• Example: “pop” quiz in a class
• “Are we there yet?” – ask all you want;
asking doesn’t speed up when the
reinforcement of arriving will happen
Ask Yourself…
• Can the animal speed up its reinforcement by
doing the behavior? If YES - Ratio
– Does the number of times the animal does the behavior
vary for reinforcement? Variable
– Does the animal do the behavior a set number of times
for reinforcement? Fixed
• Is the example dealing with the amount of time
that elapses from the behavior till it gets
reinforcement? - Interval
– Reinforcement will NOT be sped up by doing the
behavior more often
– Does the amount of time between the behavior and
reinforcement vary? Variable
– Does the amount of time between the behavior and
reinforcement stay the same? Fixed
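The “Ask Yourself” questions amount to a two-step decision: ratio or interval, then fixed or variable. A minimal Python sketch of that decision follows. The function name `classify_schedule` and its two boolean parameters are assumptions made for illustration.

```python
def classify_schedule(more_responses_speed_up_reinforcement: bool,
                      requirement_varies: bool) -> str:
    """Name a partial reinforcement schedule from two yes/no answers.

    more_responses_speed_up_reinforcement: True if doing the behavior more
        often brings the reinforcer sooner (ratio), False if only elapsed
        time matters (interval).
    requirement_varies: True if the required count or wait is unpredictable.
    Both parameter names are illustrative.
    """
    basis = "Ratio" if more_responses_speed_up_reinforcement else "Interval"
    pattern = "Variable" if requirement_varies else "Fixed"
    return f"{pattern}-{basis}"


if __name__ == "__main__":
    # Examples taken from the schedule slides.
    print("Piecework:", classify_schedule(True, False))
    print("Slot machine:", classify_schedule(True, True))
    print("Known weekly quiz:", classify_schedule(False, False))
    print("Pop quiz:", classify_schedule(False, True))
```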
Schedules of Reinforcement
Operant Conditioning
Class Activity
• 4 Volunteers are needed to demonstrate
schedules of reinforcement
• No punishment will be used.
• You will remain dry for the entire activity.
Variable Ratio
• 1:1/ 7:1 / 4:1 / 12:1 / 8:1 / 19:1 / 3:1 / 2:1 / 2:1 /
5:1 / 16:1 / 11:1 / 3:1 / 8:1 / 4:1
Fixed Ratio
• 7:1 / 7:1 / 7:1 / 7:1 / 7:1,…. 15 times
Fixed Interval
• 10 sec:1 / 10 sec:1 / 10 sec:1 / ,… 15 times
Variable Interval
• 6 sec:1 / 8 sec:1 / 10 sec:1 / 3 sec:1 / 7 sec:1 / 14
sec:1 / 15 sec:1 / 8 sec:1 / 5 sec:1 / 12 sec:1 / 6
sec:1 / 9 sec:1 / 13 sec:1/15 sec:1 / 8 sec:1
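The class-activity numbers can be run through a small simulation to see which responses earn a reinforcer under each schedule. This Python sketch makes two assumptions that the slides do not state: each interval is timed from the previous reinforcer (the first from time 0), and responses are given as a list of times in seconds. The function names `ratio_schedule` and `interval_schedule` are hypothetical.

```python
from typing import Iterable, List

# Requirements copied from the class-activity lists above.
VARIABLE_RATIO = [1, 7, 4, 12, 8, 19, 3, 2, 2, 5, 16, 11, 3, 8, 4]  # responses per reinforcer
FIXED_RATIO = [7] * 15                                              # responses per reinforcer
FIXED_INTERVAL = [10] * 15                                          # seconds
VARIABLE_INTERVAL = [6, 8, 10, 3, 7, 14, 15, 8, 5, 12, 6, 9, 13, 15, 8]  # seconds


def ratio_schedule(requirements: Iterable[int], total_responses: int) -> List[int]:
    """Return the 1-based response numbers that are reinforced."""
    reinforced: List[int] = []
    pending = iter(requirements)
    needed = next(pending, None)
    count = 0
    for n in range(1, total_responses + 1):
        if needed is None:  # schedule exhausted
            break
        count += 1
        if count == needed:
            reinforced.append(n)
            count = 0
            needed = next(pending, None)
    return reinforced


def interval_schedule(intervals: Iterable[float], response_times: Iterable[float]) -> List[float]:
    """Return the response times that are reinforced.

    Only the first response after each interval has elapsed is rewarded.
    Assumption: each interval starts at the previous reinforcer.
    """
    reinforced: List[float] = []
    pending = iter(intervals)
    wait = next(pending, None)
    available_at = wait  # first interval is timed from 0
    for t in sorted(response_times):
        if wait is None:  # schedule exhausted
            break
        if t >= available_at:
            reinforced.append(t)
            wait = next(pending, None)
            if wait is not None:
                available_at = t + wait
    return reinforced


if __name__ == "__main__":
    print(ratio_schedule(FIXED_RATIO, 21))
    print(ratio_schedule(VARIABLE_RATIO, 12))
    print(interval_schedule(FIXED_INTERVAL, [2, 5, 9, 11, 12, 20, 22]))
```

The variable-ratio list averages 7 responses per reinforcer, the same as the fixed-ratio requirement. On the interval schedule, responses made before the interval has elapsed earn nothing, so responding more often does not speed up reinforcement.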
New Understandings
of Operant
Conditioning:
The Role of Cognition
Skinner & Thorndike
• Believed that cognitions (thoughts),
perceptions and expectations have no place
in psychology.
• This is because they cannot be studied
through observation and therefore were seen
as not being objective.
Cognitive Aspects of Operant
Conditioning
• Latent learning—learning that occurs in the
absence of reinforcement, but is not
demonstrated until a reinforcer is available
• Cognitive map—term for a mental
representation of the layout of a familiar
environment
• Learned helplessness—phenomenon where
exposure to inescapable and uncontrollable
aversive events produces passive behavior
Latent Learning
• Learning that takes place in absence
of an apparent reward
• Idea developed by E.C. Tolman
E.C. Tolman’s Rat Maze Experiment
• Three groups of rats were trained to run a maze.
• The control group, Group 1, was fed upon reaching the goal.
• The first experimental group, Group 2, was not rewarded
for the first six days of training, but found food in the goal
on day seven and every day thereafter.
• The second experimental group, Group 3, was not rewarded
for the first two days, but found food in the goal on day
three and every day thereafter.
Tolman’s Rat Maze Experiment
(continued)
• Both of the experimental groups demonstrated fewer errors when
running the maze the day after the transition from no reward to reward
conditions. The marked improvement in performance continued throughout the rest of
the experiment.
• This suggested that the rats had learned during the initial trials of no
reward and were able to use a "cognitive map" of the maze when the
rewards were introduced.
• The initial learning that occurred during the no reward trials was what
Tolman referred to as latent learning.
• He argued that humans engage in this type of learning every day as we
drive or walk the same route daily and learn the locations of various
buildings and objects. Only when we need to find a building or object
does learning become obvious.
Cognitive Map
• A mental representation of a place
• Experiments showed rats could learn a maze without any
reinforcements
• See a modern-day example of Tolman’s experiment where
they change the maze on the rat (2 min)
Latent Learning & Cognitive
Maps
• Play “Cognitive Processes in
Learning” (6:25) Segment #12 from
Psychology: The Human Experience.
Other evidence that we do think!
• Animals on a fixed-interval reinforcement
schedule respond more frequently as
the time approaches for their reinforcer, as if
they expect that the response will produce
the reward
Overjustification Effect
• The effect of promising a reward for doing what
someone already likes to do
• The reward may lessen and replace the person’s
original, natural motivation, so that the behavior
stops if the reward is eliminated
– The person may now see the reward, rather than intrinsic interest,
as the motivation for performing the task.
– “If I have to be bribed into doing this, then it’s not worth doing for
its own sake.”
• Rewards do help increase interest when used to indicate a
job well done
Learned Helplessness
• Dogs in an electrified cage were at first not able to escape
the impending shock.
• Later, all they had to do was cross to the other side
but they didn’t even try.
• The dogs had learned they were “helpless” to avoid
the shock and just sat there and took it without trying to
escape.
Learned Helplessness
• Exposure to inescapable and uncontrollable
aversive events produces passive behavior. If an
animal believes or expects it cannot escape a
certain result, it will give up trying to do a
behavior that could result in it escaping from the
bad result.
• To overcome this, one must establish a sense of
control over one’s environment and see some
success.
New Understandings
of Operant
Conditioning:
The Role of Biology
Biological Predispositions
• Animal training issues – easier to train behaviors that
are closer to natural behaviors using a natural
reinforcer (food).
• Instinctive drift: naturally occurring behaviors that
interfere with operant responses.
• What happens when a trained tiger shows instinctive drift?
Classical Conditioning vs.
Operant Conditioning