Operant Conditioning
Comparing Classical and
Operant Conditioning
• Both classical and operant conditioning use acquisition,
extinction, spontaneous recovery, generalization, and
discrimination.
• Classical conditioning uses reflexive behavior - behavior
that occurs as an automatic response to some stimulus.
– Ask: Is the behavior something the animal can control? NO. Does
the animal have a choice in how to behave? NO. - Classical
conditioning.
• Operant conditioning uses operant or voluntary behavior
– voluntary behavior that is shaped by consequences.
– Ask: Is the behavior something the animal can control? YES. Does
the animal have a choice in how to behave? YES. - Operant
Conditioning.
What is Operant
Conditioning?
Operant Conditioning
• A type of learning in which the frequency of a
behavior depends on the consequence that
follows that behavior
• The frequency will increase if the consequence
is reinforcing to the subject.
• The frequency will decrease if the consequence
is not reinforcing, or is punishing, to the subject.
The Law of Effect
Edward L. Thorndike (1874–1949)
Thorndike’s Puzzle Box
• “Thorndike’s Puzzle Box” Video #8
from Worth’s Digital Media Archive
for Psychology.
B. F. Skinner (1904–1990)
• Believed that internal factors like thoughts,
emotions, and beliefs could not be used to
explain behavior. Instead, he said that new
behaviors are actively emitted by the organism
and then shaped by their consequences
• Developed the fundamental principles and
techniques of operant conditioning and devised
ways to apply them in the real world
• Designed the Skinner Box, or operant chamber
The Skinner Box
Reinforcement/Punishment
• Reinforcement - Any consequence that
increases the likelihood of the behavior it
follows
– Reinforcement is ALWAYS GOOD (desirable to the subject)!!!
• Punishment - Any consequence that decreases
the likelihood of the behavior it follows
• The subject determines if a consequence is
reinforcing or punishing
Types of
Reinforcement
Principles of Reinforcement
• Stimulus is presented or added to animal’s environment…
– Reinforcing/Desirable Stimulus: Positive (+) Reinforcement. Add
something you DO LIKE. Behavior Increases. Example: chocolate.
– Aversive/Undesirable Stimulus: Positive (+) Punishment. Add
something you DO NOT LIKE. Behavior Decreases. Example: more chores.
• Stimulus is removed or taken away from animal’s environment…
– Reinforcing/Desirable Stimulus: Negative (-) Punishment. TAKES AWAY
something you DO LIKE. Behavior Decreases. Example: no TV.
– Aversive/Undesirable Stimulus: Negative (-) Reinforcement. TAKES AWAY
something you DO NOT LIKE. Behavior Increases. Example: fewer chores.
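The four combinations above follow one rule: adding a liked stimulus or removing a disliked one strengthens behavior, and the other two combinations weaken it. A minimal Python sketch of that rule, assuming a hypothetical helper named classify_consequence (the name and its two yes/no inputs are illustrative, not from the slides):

```python
# Hypothetical helper for illustration; not part of the original slides.
def classify_consequence(stimulus_added: bool, stimulus_liked: bool) -> tuple[str, str]:
    """Return (procedure name, effect on behavior) for one consequence."""
    # "Positive" means a stimulus is added; "Negative" means one is removed.
    sign = "Positive" if stimulus_added else "Negative"
    # Adding a liked stimulus or removing a disliked one is reinforcing.
    reinforcing = stimulus_added == stimulus_liked
    kind = "Reinforcement" if reinforcing else "Punishment"
    effect = "increases" if reinforcing else "decreases"
    return f"{sign} {kind}", effect


# The four examples from the slide: (stimulus added?, stimulus liked?)
examples = {
    "chocolate": (True, True),
    "more chores": (True, False),
    "no TV": (False, True),
    "fewer chores": (False, False),
}

for name, (added, liked) in examples.items():
    procedure, effect = classify_consequence(added, liked)
    print(f"{name}: {procedure}, behavior {effect}")
```

Whether a stimulus counts as "liked" is decided by the subject, so in practice the second input comes from watching what the behavior actually does afterward.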
Positive Reinforcement
• Strengthens a response by presenting a
desirable stimulus after a response
• Anything that increases the likelihood of a
behavior by following it with a desirable
event or state
• The subject receives something they want
(it is added or given)
• Will strengthen the behavior
Positive Reinforcement
Negative Reinforcement
“Reward through ESCAPE”
• Strengthens a response by reducing or removing
an aversive (disliked) stimulus
• Anything that increases the likelihood of a
behavior by following it with the removal of an
undesirable event or state
• Something the subject doesn’t like is removed
(subtracted)
• Will strengthen the behavior
• Still a REWARD!!!!! It’s desirable.
Negative Reinforcement
Positive/Negative Reinforcement
BOTH ARE GOOD THINGS!!!
Billy Throws a Tantrum
• Billy throws a tantrum, his parents give in for the
sake of peace and quiet.
• How is this an example of positive reinforcement?
• The child’s tantrum is reinforced when the parents
give in (pos. reinforcement).
• How is this ALSO an example of negative
reinforcement?
• The parents’ behavior will be reinforced when
Billy stops screaming (neg. reinforcement). They
may continue to give in.
Primary Versus
Secondary
Reinforcement
Primary Reinforcement
• Something that is naturally
reinforcing
• Examples: food, warmth, water, etc.
• The item is reinforcing in and of itself
Secondary Reinforcement
• Something that a person has learned
to value or finds rewarding because it
is paired or associated with a primary
reinforcer
• Money is a good example.
• So are grades and signs of respect &
approval.
Immediate Versus
Delayed
Reinforcement
Immediate Reinforcers
• Immediate reinforcers – a behavior that
immediately precedes the reinforcer
becomes more likely to occur
– (This is true when training animals.
You can’t wait for a long time before
reinforcing, or the animal won’t know
what behavior you are reinforcing.)
Delayed Reinforcers
• Also called Delayed Gratification –
forgoing a small immediate reinforcement
for a greater reinforcement later.
• Humans do this with paychecks, grades.
• When do we not do this?
– Stay up late to watch TV when next day we’re
tired.
– Smoke for satisfaction now when later it will
kill us.
Punishment:
The Process of
Punishment
Types of Punishment
• An undesirable consequence
following a behavior
• The behavior ends a desirable state.
• Its effect is opposite of reinforcement
– it decreases the frequency of
behavior
Positive Punishment
(Punishment by Application)
• Something is added to the environment you
do NOT like.
• A verbal reprimand, extra chores, or
something painful like a spanking
Negative Punishment
(Punishment by Removal)
• Something is taken away that you DO LIKE.
• Lose a privilege, no TV, no dessert, grounded (lose
freedom).
• “Time out” for toddlers takes them away from their
activity.
The Good Effects of Punishment
• Punishment can effectively control certain
behaviors if…
– It comes immediately after the undesired
behavior
– It is consistent and not occasional
• Especially useful if teaching a child not to
do a dangerous behavior
• Most still suggest reinforcing an
incompatible behavior rather than using
punishment
Bad Effects of Punishment
• Does not teach or promote alternative, acceptable
behavior.
• Only tells what NOT to do while reinforcement
tells what to do.
• Doesn’t prevent the undesirable behavior when
away from the punisher in a “safe setting”
• Can lead to fear of the punisher, anxiety, and
lower self-esteem
• Children who are punished physically may learn
to use aggression as a means to solve problems.
How are punishment and reinforcement being used to treat severely
autistic and/or violent children?
See CNN video clip from Anderson Cooper 360.
Do you think they should be using these conditioning methods on
these kids?
Extinction
• In operant conditioning, the loss of a
conditioned behavior when
consequences no longer follow it.
• The subject no longer responds since
the reinforcement or punishment has
stopped.
Some Reinforcement
Procedures:
Shaping
Shaping Principles
• Shaping - procedure in which rewards, such as food,
gradually guide an animal’s behavior toward a desired
behavior.
• Successive approximations - shaping method in which you
reward responses that are ever closer to the final desired
behavior and ignore all other responses.
• Shaping can show what nonverbal animals perceive: train an
animal to discriminate between classes of events or objects,
then see how it responds to new examples.
– After being trained to discriminate between flowers, people, cars,
and chairs, a pigeon can usually identify in which of these
categories a new pictured object belongs
Skinner attached some horizontal stripes to the wall which he then used to gauge the
dog's responses of lifting its head higher and higher. Then, he simply set about shaping
a jumping response by flashing the strobe (and simultaneously taking a picture),
followed by giving a meat treat, each time the dog satisfied the criterion for
reinforcement. The result of this process is shown below, as it was in LOOK magazine,
in terms of the pictures taken at different points in the shaping process. Within 20
minutes, Skinner had Agnes "running up the wall."
For the second shaping demonstration, Skinner trained Agnes to press the
pedal and pop the top on the wastebasket. Again, the photographer's flash
served as the conditioned reinforcer, and each step in the process was
photographed. The results are shown below.
Schedules of
Reinforcement
Continuous reinforcement
• A schedule of reinforcement in which a reward
follows every correct response
• Learning occurs rapidly
• But the behavior will extinguish quickly once the
reinforcement stops.
– Once that reliable candy machine eats your money
twice in a row, you stop putting money into it.
Partial Reinforcement
• A schedule of reinforcement in which a
reward follows only some correct responses
• Learning of behavior will take longer
• But will be more resistant to extinction
• Includes the following types:
– Fixed-interval and variable interval
– Fixed-ratio and variable-ratio
Fixed-Ratio Schedule
• A partial reinforcement schedule that
rewards a response only after some defined
number of correct responses
• The faster the subject responds, the more
reinforcements they will receive.
• e.g., piece work: You get $5 for every 10
widgets you make.
Variable-Ratio Schedule
• A partial reinforcement schedule that rewards a response
after an unpredictable number of correct responses
(the number varies around an average)
• High rates of responding with little pause in order
to increase chances of getting reinforcement
• This schedule is very resistant to extinction.
• Sometimes called the “gambler’s schedule”;
similar to a slot machine or fishing
Fixed-Interval Schedule
• A partial reinforcement schedule that
rewards only the first correct response
after some defined period of time
• Produces slow responding at first, which
increases as the time of reinforcement
gets closer
• Example: a known weekly quiz in a class,
checking cookies after the 10 minute
baking period.
Variable-Interval Schedule
• A partial reinforcement schedule that rewards
the first correct response after an
unpredictable amount of time
• Produces slow and steady responses
• Example: “pop” quiz in a class
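The four partial schedules can be sketched as two small Python functions. This is only an illustration under stated assumptions: the function names are hypothetical, the list of requirements repeats once it runs out, and each interval is timed from the previous reinforcer. A one-item list gives a fixed schedule and a list of varying values gives a variable one.

```python
from itertools import cycle


def ratio_schedule(num_responses, requirements):
    """Return the 1-based response numbers that earn a reinforcer.

    requirements: positive counts of responses needed for each successive
    reinforcer (assumed to repeat once the list is exhausted).
    """
    reinforced = []
    needed = cycle(requirements)
    target = next(needed)
    count = 0
    for response in range(1, num_responses + 1):
        count += 1
        if count == target:
            reinforced.append(response)
            count = 0
            target = next(needed)
    return reinforced


def interval_schedule(response_times, intervals):
    """Return the response times (in seconds) that earn a reinforcer.

    Only the first response after each interval has elapsed is rewarded.
    Assumption: the next interval is timed from that reinforcer.
    """
    reinforced = []
    waits = cycle(intervals)
    available_at = next(waits)
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(waits)
    return reinforced


# Piece work: one reinforcer for every 10 responses (fixed ratio).
print(ratio_schedule(20, [10]))          # [10, 20]
# Variable ratio: the requirement changes after each reinforcer.
print(ratio_schedule(20, [1, 7, 4]))     # [1, 8, 12, 13, 20]

# A subject that responds every 2 seconds for 30 seconds.
times = range(2, 31, 2)
print(interval_schedule(times, [10]))        # fixed interval: [10, 20, 30]
print(interval_schedule(times, [6, 8, 10]))  # variable interval: [6, 14, 24, 30]
```

On the ratio schedules, responding faster brings reinforcers sooner. On the interval schedules, extra responses before the interval has elapsed earn nothing.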
Schedules of Reinforcement
Ask Yourself…
• Does reinforcement depend on how many times the
animal does the behavior? - Ratio
– Does the number of responses required for
reinforcement vary? Variable
– Is a set number of responses required
for reinforcement? Fixed
• Does reinforcement depend on how much time
has elapsed before a response is
reinforced? - Interval
– Does the amount of time that must elapse
vary? Variable
– Does the amount of time that must elapse
stay the same? Fixed
Class Activity
• 4 Volunteers are needed to demonstrate
schedules of reinforcement
• No punishment will be used.
• You will remain dry for the entire activity.
Variable Ratio
• 1:1/ 7:1 / 4:1 / 12:1 / 8:1 / 19:1 / 3:1 / 2:1 / 2:1 /
5:1 / 16:1 / 11:1 / 3:1 / 8:1 / 4:1
Fixed Ratio
• 7:1 / 7:1 / 7:1 / 7:1 / 7:1 / … (15 times)
Fixed Interval
• 10 sec:1 / 10 sec:1 / 10 sec:1 / … (15 times)
Variable Interval
• 6 sec:1 / 8 sec:1 / 10 sec:1 / 3 sec:1 / 7 sec:1 / 14
sec:1 / 15 sec:1 / 8 sec:1 / 5 sec:1 / 12 sec:1 / 6
sec:1 / 9 sec:1 / 13 sec:1/15 sec:1 / 8 sec:1
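As a quick arithmetic check on the activity sequences above, the following Python sketch averages the variable-ratio and variable-interval values copied from the lists. The variable names are illustrative only.

```python
# Values copied from the class-activity lists above.
variable_ratio = [1, 7, 4, 12, 8, 19, 3, 2, 2, 5, 16, 11, 3, 8, 4]
variable_interval_sec = [6, 8, 10, 3, 7, 14, 15, 8, 5, 12, 6, 9, 13, 15, 8]

# Average number of responses required per reinforcer.
mean_ratio = sum(variable_ratio) / len(variable_ratio)
# Average wait in seconds before a response can be reinforced.
mean_interval = sum(variable_interval_sec) / len(variable_interval_sec)

print(len(variable_ratio), mean_ratio)                      # 15 7.0
print(len(variable_interval_sec), round(mean_interval, 2))  # 15 9.27
```

The variable-ratio list averages exactly 7 responses per reinforcer, the same as the 7:1 fixed ratio. The variable-interval list averages about 9.27 seconds, slightly under the 10-second fixed interval.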
New Understandings
of Operant
Conditioning:
The Role of Cognition
Skinner & Thorndike
• Believed that cognitions (thoughts),
perceptions, and expectations have no place
in psychology.
• This is because they cannot be studied
through observation and therefore were seen
as not being objective.
Latent Learning
• Learning that takes place in absence
of an apparent reward
• Idea developed by E.C. Tolman
E.C. Tolman’s Rat Maze Experiment
• Three groups of rats were trained to run a maze.
• The control group, Group 1, was fed upon reaching the goal.
• The first experimental group, Group 2, was not rewarded
for the first six days of training, but found food in the goal
on day seven and every day thereafter.
• The second experimental group, Group 3, was not rewarded
for the first two days, but found food in the goal on day
three and every day thereafter.
Tolman’s Rat Maze Experiment
(continued)
• Both of the experimental groups made fewer errors when
running the maze the day after the transition from no-reward to reward
conditions. The marked improvement in performance continued throughout
the rest of the experiment.
• This suggested that the rats had learned during the initial trials of no
reward and were able to use a "cognitive map" of the maze when the
rewards were introduced.
• The initial learning that occurred during the no reward trials was what
Tolman referred to as latent learning.
• He argued that humans engage in this type of learning every day as we
drive or walk the same route daily and learn the locations of various
buildings and objects. Only when we need to find a building or object
does learning become obvious.
Cognitive Map
• A mental representation of a place
• Experiments showed rats could learn
a maze without any reinforcements
Other evidence that we do think!
• Animals on a fixed-interval reinforcement
schedule respond more frequently as
the time for their reinforcer approaches, as if
they expect that the response will produce
the reward
Overjustification Effect
• The effect of promising a reward for doing what
someone already likes to do
• The reward may lessen and replace the person’s
original, natural motivation, so that the behavior
stops if the reward is eliminated
– The person may now see the reward, rather than intrinsic interest,
as the motivation for performing the task.
– “If I have to be bribed into doing this, then it’s not worth doing for
its own sake.”
• Rewards do help increase interest when used to indicate a
job well done
Learned Helplessness
• Dogs in an electrified cage were at first unable to escape
the impending shock.
• Later, all they had to do was cross to the other side,
but they didn’t even try.
• The dogs had learned they were “helpless” to avoid
the shock and just sat there and took it without
trying to escape.
Learned Helplessness
• Exposure to inescapable and uncontrollable
aversive events produces passive behavior. If an
animal believes or expects it cannot escape
undesirable circumstances, it will give up trying to
escape those circumstances.
• To overcome this, one must establish a sense of
control over one’s environment and see some
success.
Biological Predispositions
• Animal training issues – it is easier to train behaviors
that are closer to natural behaviors using a natural
reinforcer (food).
• Instinctive drift - naturally occurring behaviors that
interfere with operant responses.
• What happens when a trained tiger shows instinctive drift?
Classical Conditioning vs.
Operant Conditioning