Operant Conditioning
Classical v. Operant Conditioning
• Both classical and operant conditioning use acquisition,
extinction, spontaneous recovery, generalization, and
discrimination.
• Classical conditioning uses reflexive behavior - behavior that
occurs as an automatic response to some stimulus that comes
before the behavior.
• Ask: Is the behavior something the animal does NOT control?
YES. Does the animal have a choice in how to behave? NO. Classical conditioning.
• Operant conditioning uses operant or voluntary behavior –
behavior that is shaped by consequences that come
after the behavior.
• Ask: Is the behavior something the animal can control? YES.
Does the animal have a choice in how to behave? YES. Operant conditioning.
What is Operant Conditioning?
Operant Conditioning
• Learning where frequency of a behavior depends
on the consequence that follows that behavior
• The frequency will increase if the consequence is
reinforcing to the subject.
• The frequency will decrease if the consequence is
not reinforcing, or is punishing, to the subject.
The Law of Effect
Edward Thorndike (1874-1949)
• Author of the law of effect
• Behaviors with favorable
consequences will occur more
frequently.
• Behaviors with unfavorable
consequences will occur less
frequently.
• Created puzzle boxes for
research on cats
Thorndike’s Puzzle Box
“Thorndike’s Puzzle Box”
Video #8 from Worth’s Digital Media Archive (2 min)
Thorndike’s Puzzle Box
[Figure: responses to the situation (stimuli inside of puzzle box) on the first trial in the box and after many trials in the box. Listed responses: scratch at bars, push at ceiling, dig at floor, howl, etc., and press lever.]
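The law of effect in the puzzle box can be illustrated with a toy simulation. This is only a sketch, not Thorndike's procedure: it assumes that every response starts with equal strength, that responses are picked in proportion to their strength, and that only the response that opens the box (press lever) is strengthened. The function name `run_puzzle_box`, the `boost` value, and the number of trials are all hypothetical.

```python
import random


def run_puzzle_box(trials=200, boost=1.0, seed=0):
    """Toy law-of-effect model: the response that opens the box gains strength."""
    rng = random.Random(seed)
    # Responses come from the slide; the equal starting strengths are an assumption.
    strength = {
        "scratch at bars": 1.0,
        "push at ceiling": 1.0,
        "dig at floor": 1.0,
        "howl": 1.0,
        "press lever": 1.0,
    }
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            responses = list(strength)
            weights = [strength[r] for r in responses]
            choice = rng.choices(responses, weights=weights)[0]
            if choice == "press lever":
                # Favorable consequence (escape): this response becomes more likely.
                strength[choice] += boost
                break
        attempts_per_trial.append(attempts)
    return strength, attempts_per_trial


strength, attempts = run_puzzle_box()
early = sum(attempts[:20]) / 20
late = sum(attempts[-20:]) / 20
print(f"average attempts, first 20 trials: {early:.2f}")
print(f"average attempts, last 20 trials:  {late:.2f}")
```

Comparing the two averages typically shows the simulated cat escaping in fewer attempts as lever pressing comes to dominate the other responses.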
B. F. Skinner (1904–1990)
• Believed that internal factors like thoughts,
emotions, and beliefs could not be used to
explain behavior.
• Instead said that new behaviors were actively
emitted by the organism
• Looked at “Operants” or active behaviors that
are used on the environment to generate
consequences
• Developed the fundamental principles and
techniques of operant conditioning and devised
ways to apply them in the real world
The Skinner Box/Operant Chamber
Skinner’s Air Crib:
A room fit for a…Baby!
Reinforcement/Punishment
• Reinforcement - Any consequence that
increases the likelihood of the behavior it
follows
– Reinforcement is ALWAYS GOOD!!!
• Punishment - Any consequence that decreases
the likelihood of the behavior it follows
• The subject determines if a consequence is
reinforcing or punishing
Types of Reinforcement: Always GOOD
Positive Reinforcement
• Strengthens a response by presenting a stimulus
that you like after a response
• Anything that increases the likelihood of a
behavior by following it with a desirable event or
state
• The subject receives something they want (added)
• Will strengthen the behavior
Positive Reinforcement
Negative Reinforcement
• Strengthens a response by reducing or removing an
aversive (disliked) stimulus
• Anything that increases the likelihood of a behavior by
following it with the removal of an undesirable event
or state
• Something the subject doesn’t like is removed
(subtracted)
• Will strengthen the behavior
• Negative reinforcement allows you to either:
– Escape something you don’t like that is already present
(negative reinforcement by escape)
– Avoid something before it occurs (negative reinforcement by avoidance)
Negative Reinforcement
Positive/Negative Reinforcement
BOTH ARE GOOD THINGS!!!
Punishment: Always BAD
Punishment
• An undesirable event following a behavior
• Behavior ends a desirable event or state
• Its effect is opposite that of reinforcement – it
decreases the frequency of behavior
Positive Punishment
(Punishment by Application)
• Something you do NOT like is added to the
environment.
• A verbal reprimand or something painful
like a spanking
Negative Punishment
(Punishment by Removal)
• Something is taken away that you DO LIKE.
• Lose a privilege.
The Good Effects of Punishment
• Punishment can effectively control
certain behaviors if…
– It comes immediately after the undesired
behavior
– It is consistent and not occasional
• Especially useful if teaching a child not
to do a dangerous behavior
• Most still suggest reinforcing an
incompatible behavior rather than using
punishment
Bad Effects of Punishment
• Does not teach or promote alternative, acceptable behavior.
– Only tells what NOT to do while reinforcement tells what to do.
• 4 Drawbacks to Physical Punishment:
1. Punished behavior is suppressed, not forgotten.
– If the child stops the bad behavior, the parent is negatively reinforced to use physical
punishment again.
2. Doesn’t prevent the undesirable behavior when away from the
punisher in a “safe setting” (discrimination)
3. Can lead to fear of the punisher, anxiety, and lower self-esteem
4. Children who are punished physically may learn to use
aggression as a means to solve problems.
Reinforcement vs. Punishment
Stimulus is presented or added to animal’s environment…
• Reinforcing/desirable stimulus: Positive (+) Reinforcement. Add something you DO LIKE. Behavior increases.
• Aversive/undesirable stimulus: Positive (+) Punishment. Add something you DO NOT LIKE. Behavior decreases.
Stimulus is removed or taken away from animal’s environment…
• Reinforcing/desirable stimulus: Negative (-) Punishment. TAKES AWAY something you DO LIKE. Behavior decreases.
• Aversive/undesirable stimulus: Negative (-) Reinforcement. TAKES AWAY something you DO NOT LIKE. Behavior increases.
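The four combinations of reinforcement and punishment reduce to two questions: was a stimulus added or removed, and does the subject like that stimulus? A minimal sketch of that lookup follows. The function name `classify_consequence` and its argument names are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change, subject_likes_stimulus):
    """Name the type of consequence and its effect on the behavior.

    stimulus_change: "added" or "removed" (what happened after the behavior).
    subject_likes_stimulus: True if the subject finds the stimulus desirable.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")

    # "Positive" means something is added; "negative" means something is taken away.
    sign = "Positive" if stimulus_change == "added" else "Negative"

    # Adding a liked stimulus or removing a disliked one strengthens the behavior.
    strengthens = (stimulus_change == "added") == subject_likes_stimulus
    kind = "Reinforcement" if strengthens else "Punishment"
    effect = "increases" if strengthens else "decreases"
    return f"{sign} {kind}", effect


# The four cells: add/remove crossed with liked/disliked.
for change in ("added", "removed"):
    for likes in (True, False):
        name, effect = classify_consequence(change, likes)
        print(f"{change:8} liked={likes!s:5} -> {name}, behavior {effect}")
```

Because the subject determines whether a stimulus is desirable, `subject_likes_stimulus` has to be judged from the subject's point of view, not the trainer's.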
How is Neg. Reinforcement different from Punishment?
• Negative Reinforcement will always increase a
behavior
• Punishment will always decrease a behavior
• Negative Reinforcement is something YOU DO to
take away something bad.
• Punishment is something DONE TO YOU that is
bad and makes you stop doing a behavior.
Types of Reinforcement & Punishment
How is Punishment & Reinforcement being used to treat severely
autistic and/or violent children?
See CNN video clip from Anderson Cooper 360.
Do you think they should be using these conditioning methods on
these kids?
Primary Versus Secondary Reinforcement
Primary Reinforcement
• Something that is naturally satisfying
• Examples: food, warmth, water, etc.
• The item is reinforcing in and of itself
• If on a deserted island, these are what you’d
want!
Conditioned/Secondary Reinforcement
• Something that a person has learned to
value or finds rewarding because it is paired
or associated with a primary reinforcer
• Money is a good example.
• So are grades and signs of respect &
approval.
Immediate Versus Delayed Reinforcement
Immediate Reinforcers
• Immediate reinforcers – the behavior that
immediately precedes the reinforcer becomes
more likely to occur
– (This is true when training animals. You can’t wait
a long time before reinforcing or the animal
won’t know what behavior you are reinforcing.)
Delayed Reinforcers
• Also called Delayed Gratification – forgoing a small
immediate reinforcement for a greater
reinforcement later.
– Humans do this with paychecks, grades.
• When do we not do this?
– Stay up late to watch TV when next day we’re tired
– Smoke for satisfaction now when later it will kill us
Immediate/Delayed Reinforcement
• Immediate reinforcement is more effective
than delayed reinforcement for initial learning
• Delayed Reinforcement is more resistant to
extinction.
– Ability to delay gratification predicts higher achievement
Discriminative Stimuli
• An environmental stimulus that triggers you to do
a certain behavior that will have a consequence.
• In the presence of a specific environmental
stimulus (the discriminative stimulus) we emit a
particular behavior (the operant) which is
followed by a consequence (reinforcement or
punishment)
• Example: A ringing phone is a discriminative
stimulus that sets the occasion for a particular response: picking
it up and speaking into it
Extinction
• In operant conditioning,
the loss of a conditioned
behavior when
consequences no longer
follow it.
• The subject no longer
responds since the
reinforcement or
punishment has stopped.
Thoughts from Skinner:
• Skinner believed from the moment of birth, the environment shapes
and determines your behavior through reinforcing or punishing
consequences. – You don’t really have Free Will
• “B.F. Skinner Interview” (4 min) – Discusses Schedules of
Reinforcement & Free Will
– Video #9 from Worth’s Digital Media Archive for Psychology.
A person does not act upon the world, the world acts upon him.
Parts of Operant Conditioning
• Discriminative Stimulus: specific environmental stimulus
– Examples: gas gauge on empty; wallet on sidewalk
• Operant Response: voluntary behavior
– Examples: fill car with gas; give wallet to security
• Consequence: event that will make the operant response more or less likely to reoccur
– Examples: avoid running out of gas; get $50 reward
• Effect on Future Behavior
– If reinforcement = more likely to reoccur
– If punishment = less likely to reoccur
Some Reinforcement Procedures:
Shaping
• Reinforcement of behaviors that are more and more
similar to the one you want to occur
• Technique used to establish a new behavior
Shaping Principles
• Skinner box - soundproof box with a bar that an animal
presses or pecks to release a food or water reward, and a
device that records these responses.
• Shaping - procedure in which rewards, such as food,
gradually guide an animal’s behavior toward a desired
behavior.
• Successive approximations - shaping method in which
you reward responses that are ever closer to the final
desired behavior and ignore all other responses.
• Shaping can show what nonverbal animals perceive:
train an animal to discriminate between classes of events
or objects.
– After being trained to discriminate between flowers, people,
cars, and chairs, a pigeon can usually identify in which of these
categories a new pictured object belongs
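Shaping by successive approximations can be sketched as a loop that rewards any response meeting the current criterion and then raises the criterion. The model below is a toy under stated assumptions, not a description of real animal behavior: responses are numbers drawn from a normal distribution around the animal's typical response, a rewarded response becomes the new typical response, and all other responses are ignored. The function name `shape` and every parameter value are hypothetical.

```python
import random


def shape(target=10.0, step=1.0, spread=1.0, max_trials=10_000, seed=0):
    """Reward responses that meet a rising criterion until the target is reached."""
    rng = random.Random(seed)
    typical = 0.0      # the animal's current typical response (arbitrary units)
    criterion = step   # the first approximation that earns a reward
    rewarded = []
    for trial in range(1, max_trials + 1):
        response = rng.gauss(typical, spread)
        if response < criterion:
            continue  # responses below the criterion are ignored
        # Reinforced: behavior shifts toward the rewarded response.
        rewarded.append((trial, response))
        typical = response
        if response >= target:
            break  # the final desired behavior has occurred
        # Demand a slightly closer approximation next time.
        criterion = min(target, response + step)
    return rewarded


rewarded = shape()
for trial, response in rewarded:
    print(f"trial {trial}: rewarded response {response:.2f}")
```

Each rewarded response is closer to the target than the one before it, which is the defining feature of successive approximations.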
Skinner attached some horizontal stripes to the wall which he then used to gauge the
dog's responses of lifting its head higher and higher. Then, he simply set about shaping
a jumping response by flashing the strobe (and simultaneously taking a picture),
followed by giving a meat treat, each time the dog satisfied the criterion for
reinforcement. The result of this process is shown below, as it was in LOOK magazine,
in terms of the pictures taken at different points in the shaping process. Within 20
minutes, Skinner had Agnes "running up the wall."
For the second shaping demonstration, Skinner trained Agnes to press the
pedal and pop the top on the wastebasket. Again, the photographer's flash
served as the conditioned reinforcer, and each step in the process was
photographed. The results are shown below.
Schedules of Reinforcement
Continuous reinforcement
• A schedule of reinforcement in which a reward
follows every correct response
• Learning occurs rapidly
• But the behavior will extinguish quickly once the
reinforcement stops.
– Once that reliable pop machine eats your money twice in a
row, you stop putting money into it.
Partial Reinforcement
• A schedule of reinforcement in which a reward
follows only some correct responses
• Learning of behavior will take longer
• But will be more resistant to extinction
• Includes the following types:
– Fixed-interval and variable-interval
– Fixed-ratio and variable-ratio
Fixed-Ratio Schedule
• A partial reinforcement schedule that
rewards a response only after some set
number of correct responses
• The faster the subject responds, the more
reinforcements they will receive.
• Example: piecework. You get $5 for every 10
widgets you make.
Variable-Ratio Schedule
• A partial reinforcement schedule that rewards a response after an
unpredictable number of correct responses
(the number varies around an average)
• High rates of responding with little pause in order
to increase chances of getting reinforcement
• This schedule is very resistant to extinction.
• Vegas Rules! Sometimes called the “gambler’s
schedule”; similar to a slot machine or fishing
Fixed-Interval Schedule
• A partial reinforcement schedule that rewards only the first
correct response after some set period of time
• Produces slow responding at first, with the rate increasing as you get
closer to the time of reinforcement
• “Procrastinator Schedule”
• Example: a known weekly quiz in a class, checking
cookies after the 10 minute baking period.
Variable-Interval Schedule
• A partial reinforcement schedule that rewards the first
correct response after an unpredictable
amount of time
• Produces slow and steady responses
• Example: “pop” quiz in a class
• “Are we there yet?” – asking all you want
doesn’t speed up when the
reinforcement of arriving will happen
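The four partial reinforcement schedules can be written as two small classes, one counting responses and one watching the clock. This is a minimal sketch: the class names, the `respond` method, and the choice of uniform random draws for the variable versions are assumptions made for illustration, not a standard model.

```python
import random


class RatioSchedule:
    """Reinforces after a required number of responses (fixed or variable)."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        # Variable: an unpredictable count that averages out to `mean`.
        if self.variable:
            return self.rng.randint(1, 2 * self.mean - 1)
        return self.mean

    def respond(self, now):
        """Register one response at time `now`; return True if reinforced."""
        self.count += 1
        if self.count < self.required:
            return False
        self.count = 0
        self.required = self._next_requirement()
        return True


class IntervalSchedule:
    """Reinforces the first response after a required time has elapsed."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.available_at = self._next_wait()

    def _next_wait(self):
        # Variable: an unpredictable wait that averages out to `mean` seconds.
        if self.variable:
            return self.rng.uniform(0, 2 * self.mean)
        return self.mean

    def respond(self, now):
        """Register one response at time `now`; return True if reinforced."""
        if now < self.available_at:
            return False
        self.available_at = now + self._next_wait()
        return True


def count_reinforcers(schedule, responses_per_second, seconds=60):
    """Respond at a steady rate and count how many responses are reinforced."""
    total = responses_per_second * seconds
    times = [(i + 1) / responses_per_second for i in range(total)]
    return sum(schedule.respond(t) for t in times)


for rate in (1, 2):
    fr = count_reinforcers(RatioSchedule(5), rate)
    fi = count_reinforcers(IntervalSchedule(10), rate)
    print(f"{rate} response(s) per second: FR-5 earns {fr}, FI-10 earns {fi}")
```

In this sketch, doubling the response rate doubles the reinforcers on the fixed-ratio schedule (12 to 24 in 60 seconds) but leaves the fixed-interval count at 6. Passing `variable=True` gives the variable-ratio and variable-interval versions.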
Ask Yourself…
• Can the animal speed up its reinforcement by
doing the behavior? If YES - Ratio
– Does the number of times the animal does the behavior
vary for reinforcement? Variable
– Does the animal do the behavior a set number of times
for reinforcement? Fixed
• Does the example deal with the amount of time
that must elapse before the behavior gets
reinforcement? - Interval
– Reinforcement will NOT be sped up by doing the
behavior more often
– Does the amount of time before
reinforcement vary? Variable
– Does the amount of time before
reinforcement stay the same? Fixed
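The questions above amount to a two-step decision, which can be sketched as a small function. The name `name_schedule` and its two yes/no parameters are hypothetical; the example scenarios are the ones used on the earlier slides.

```python
def name_schedule(more_responses_speed_up_reward, requirement_varies):
    """Name a partial reinforcement schedule from two yes/no answers.

    more_responses_speed_up_reward: True if doing the behavior more often
        brings reinforcement sooner (ratio); False if only time matters (interval).
    requirement_varies: True if the required count or wait is unpredictable.
    """
    basis = "ratio" if more_responses_speed_up_reward else "interval"
    pattern = "variable" if requirement_varies else "fixed"
    return f"{pattern}-{basis}"


examples = {
    "piecework ($5 per 10 widgets)": (True, False),
    "slot machine": (True, True),
    "known weekly quiz": (False, False),
    "pop quiz": (False, True),
}
for label, answers in examples.items():
    print(f"{label}: {name_schedule(*answers)}")
```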
Schedules of Reinforcement
Classical Conditioning vs. Operant Conditioning
Class Activity
• 4 Volunteers are needed to demonstrate
schedules of reinforcement
• No punishment will be used.
• You will remain dry for the entire activity.
Variable Ratio
• 1:1 / 7:1 / 4:1 / 12:1 / 8:1 / 19:1 / 3:1 / 2:1 / 2:1 /
5:1 / 16:1 / 11:1 / 3:1 / 8:1 / 4:1
Fixed Ratio
• 7:1 / 7:1 / 7:1 / 7:1 / 7:1 / … (15 times)
Fixed Interval
• 10 sec:1 / 10 sec:1 / 10 sec:1 / … (15 times)
Variable Interval
• 6 sec:1 / 8 sec:1 / 10 sec:1 / 3 sec:1 / 7 sec:1 / 14 sec:1 /
15 sec:1 / 8 sec:1 / 5 sec:1 / 12 sec:1 / 6 sec:1 / 9 sec:1 /
13 sec:1 / 15 sec:1 / 8 sec:1
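A quick check of the activity lists shows how the variable schedules compare with the fixed ones on average. The sketch below only restates the numbers listed above; the variable names are illustrative.

```python
from statistics import mean

# Responses required per reinforcer (ratio schedules).
variable_ratio = [1, 7, 4, 12, 8, 19, 3, 2, 2, 5, 16, 11, 3, 8, 4]
fixed_ratio = [7] * 15

# Seconds required per reinforcer (interval schedules).
variable_interval = [6, 8, 10, 3, 7, 14, 15, 8, 5, 12, 6, 9, 13, 15, 8]
fixed_interval = [10] * 15

vr_mean = mean(variable_ratio)
vi_mean = mean(variable_interval)

print(f"variable ratio:    {sum(variable_ratio)} responses, mean {vr_mean}")
print(f"fixed ratio:       {sum(fixed_ratio)} responses, mean {mean(fixed_ratio)}")
print(f"variable interval: {sum(variable_interval)} sec, mean {vi_mean:.2f}")
print(f"fixed interval:    {sum(fixed_interval)} sec, mean {mean(fixed_interval)}")
```

The variable-ratio list averages exactly 7 responses per reinforcer, the same as the fixed-ratio schedule (105 responses in total for each). The variable-interval list averages about 9.3 seconds, slightly under the fixed 10 seconds.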