10/17/2013
Module 19: Operant Conditioning
© 2013 Worth Publishers
Operant Conditioning
Operant conditioning involves adjusting to the consequences of our behaviors.
How it works:
An act of chosen behavior (a “response”) is followed by a reward or punitive feedback from the environment.
Results:
• Rewarded behavior is more likely to be tried again.
• Punished behavior is less likely to be chosen in the future.
Example:
Response: balancing a ball
Consequence: receiving food
Result: behavior strengthened
Thorndike’s Law of Effect
Edward Thorndike placed cats in a puzzle box;
they were rewarded with food (and freedom)
when they solved the puzzle.
Thorndike noted that the cats took less time
to escape after repeated trials and rewards.
Thorndike’s law of effect: behaviors followed
by favorable consequences become more
likely, and behaviors followed by unfavorable
consequences become less likely.
B. F. Skinner: Behavioral Control & the Operant Chamber
B. F. Skinner extended Thorndike’s principles much more broadly.
B. F. Skinner, like Ivan Pavlov, pioneered more controlled methods of studying conditioning.
Parts of the operant chamber:
• Bar or lever that an animal presses, randomly at first, later for reward
• Food/water dispenser to provide the reward
B. F. Skinner trained pigeons to play ping pong, and guide a video game missile.
http://youtu.be/vGazyH6fQQ4
Reinforcement
• Reinforcement: feedback from the environment that makes a behavior more likely to be done again.
• Positive (+) reinforcement: the reward is adding something desirable.
• Negative (-) reinforcement: the reward is removing something unpleasant.
Example: This meerkat has just completed a task out in the cold. For the meerkat, this warm light is desirable.
Operant Effect: Punishment
Punishment has the opposite effect of reinforcement. Its consequences make the target behavior less likely to occur in the future.
+ Positive Punishment: You ADD something unpleasant/aversive (ex: spank the child).
- Negative Punishment: You TAKE AWAY something pleasant/desired (ex: no TV time, no attention). MINUS is the “negative” here.
Positive does not mean “good” or “desirable” and negative does not mean “bad” or “undesirable.”
Consequence matrix
Stimulus Type | Add a Stimulus | Remove a Stimulus
Appetitive Stimulus (something desired) | Positive reinforcement | Negative punishment
Aversive Stimulus (something not desired) | Positive punishment | Negative reinforcement
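The consequence matrix can also be read as a small lookup rule. The sketch below is only an illustration of that rule, not part of the original slides; the names `Change`, `Stimulus`, and `classify_consequence` are made up for this example.

```python
from enum import Enum


class Change(Enum):
    ADD = "add"        # a stimulus is added after the behavior
    REMOVE = "remove"  # a stimulus is removed after the behavior


class Stimulus(Enum):
    APPETITIVE = "appetitive"  # something desired
    AVERSIVE = "aversive"      # something not desired


def classify_consequence(change, stimulus):
    """Return (label, effect on the target behavior) for one cell of the matrix."""
    # "Positive" means a stimulus is added; "negative" means one is removed.
    sign = "positive" if change is Change.ADD else "negative"
    # Adding something desired, or removing something aversive, strengthens behavior.
    strengthens = (change is Change.ADD) == (stimulus is Stimulus.APPETITIVE)
    kind = "reinforcement" if strengthens else "punishment"
    effect = "more likely" if strengthens else "less likely"
    return f"{sign} {kind}", effect


if __name__ == "__main__":
    for change in Change:
        for stimulus in Stimulus:
            label, effect = classify_consequence(change, stimulus)
            print(f"{change.value} {stimulus.value} stimulus: {label} (behavior {effect})")
```

Note that the sign ("positive" or "negative") depends only on whether a stimulus is added or removed, which matches the point that positive does not mean "good" and negative does not mean "bad."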
Reinforcement Types
• A primary reinforcer is a stimulus that meets a basic need or otherwise is intrinsically desirable, such as food, sex, fun, attention, or power.
• A secondary/conditioned reinforcer is a stimulus that has become associated with a primary reinforcer.
A Human Talent: Responding to Delayed Reinforcers
• Dogs learn from immediate reinforcement; a treat five minutes after a trick won’t reinforce the trick.
• Delayed reinforcer: a reinforcer that arrives some time after the behavior, such as a paycheck that comes at the end of the week.
• Humans have the ability to link a consequence to a behavior even when the two are separated in time. However, we may be inclined to choose small immediate reinforcers (watching TV) rather than large delayed reinforcers (feeling alert tomorrow).
• Delaying gratification, a skill related to impulse control, enables longer-term goal setting.
http://youtu.be/jQvBrEEYS20
How often should we reinforce?
Reinforcement Schedules
• In continuous reinforcement (giving a reward every single time the target behavior occurs), the subject acquires the desired behavior quickly.
• In partial/intermittent reinforcement (giving rewards only part of the time), the target behavior takes longer to be acquired/established but persists longer without reward.
Different Schedules of Partial/Intermittent Reinforcement
We may schedule our reinforcements based on an interval of time that has gone by.
• Fixed interval schedule: every so often
• Variable interval schedule: unpredictably often
We may plan for a certain ratio of rewards per number of instances of the desired behavior.
• Fixed ratio schedule: every so many behaviors
• Variable ratio schedule: after an unpredictable number of behaviors
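One way to see how the four partial schedules differ is to simulate them. The sketch below is an illustrative toy model, not something from the original slides: the class names, the `respond(t)` method, and the choice of uniform random draws for the "variable" schedules are all assumptions made for this example.

```python
import random


class FixedRatio:
    """Reward every n-th response (e.g., every third lever press)."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reward after an unpredictable number of responses averaging mean_n."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2*mean_n - 1, so the average is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reward the first response made after `period` time units have passed."""
    def __init__(self, period):
        self.period = period
        self.available_at = period

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.period
            return True
        return False


class VariableInterval:
    """Reward the first response after an unpredictable wait averaging mean_period."""
    def __init__(self, mean_period, rng):
        self.mean_period = mean_period
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: uniform wait between 0 and 2*mean_period.
        return self.rng.uniform(0, 2 * self.mean_period)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


def count_rewards(schedule, response_times):
    """Count how many of the responses (made at the given times) are rewarded."""
    return sum(schedule.respond(t) for t in response_times)


if __name__ == "__main__":
    rng = random.Random(0)
    times = range(1, 22)  # one response per time unit, t = 1 .. 21
    print("FR-3:", count_rewards(FixedRatio(3), times))
    print("FI-7:", count_rewards(FixedInterval(7), times))
    print("VR-3:", count_rewards(VariableRatio(3, rng), times))
    print("VI-7:", count_rewards(VariableInterval(7, rng), times))
```

In this toy model, responding more often earns more rewards on the ratio schedules but not on the interval schedules, where the number of rewards is capped by the time that has passed.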
Which Schedule of Reinforcement is This?
Ratio or Interval? Fixed or Variable?
1. Rat gets food every third time it presses the lever: FR
2. Getting paid weekly no matter how much work is done: FI
3. Getting paid for every ten boxes you make: FR
4. Hitting a jackpot sometimes on the slot machine: VR
5. Checking cell phone all day; sometimes getting a text: VI
6. Buy eight pizzas, get the next one free: FR
7. Fundraiser averages one donation for every eight houses visited: VR
8. Kid has tantrum, parents sometimes give in: VR
9. Repeatedly checking mail until paycheck arrives: FI
When is punishment effective?
• Punishment works best in natural settings when we encounter punishing consequences from actions such as reaching into a fire.
• In that case, operant conditioning helps us to avoid dangers.
• Punishment is less effective when we try to artificially create punishing consequences for others’ choices.
• Making punishments more severe helps less than making them immediate and certain.
Applying operant conditioning to parenting
Problems with Physical Punishment
• Punished behaviors may simply be suppressed, and restart when the punishment is over.
• Instead of learning behaviors, the child may learn to discriminate among situations, and avoid those in which punishment might occur.
• Instead of behaviors, the child might learn an attitude of fear or hatred, which can interfere with learning. This can generalize to a fear/hatred of all adults or many settings.
• Physical punishment models aggression and control as a method of dealing with problems.
Summary: Types of Consequences
Adding stimuli | Subtracting stimuli | Outcome
Positive (+) Reinforcement (You get candy) | Negative (-) Reinforcement (I stop yelling) | Strengthens target behavior (You do chores)
Positive (+) Punishment (You get spanked) | Negative (-) Punishment (No cell phone) | Reduces target behavior (cursing)
Positive reinforcement and negative punishment use desirable stimuli; negative reinforcement and positive punishment use unpleasant stimuli.
Operant vs. Classical Conditioning
If the organism is learning associations between its behavior and the resulting events, it is... operant conditioning
If the organism is learning associations between events that it does not control, it is... classical conditioning
Operant and Classical Conditioning are Different Forms of Associative Learning
Operant conditioning:
• involves operant behavior, chosen behaviors which “operate” on the environment
• these behaviors become associated with consequences which punish (decrease) or reinforce (increase) the operant behavior
Classical conditioning:
• involves respondent behavior, reflexive, automatic reactions such as fear or craving
• these reactions to unconditioned stimuli (US) become associated with neutral (then conditioned) stimuli
There is a contrast in the process of conditioning.
• Classical conditioning: The experimental (neutral) stimulus repeatedly precedes the respondent behavior, and eventually triggers that behavior.
• Operant conditioning: The experimental (consequence) stimulus repeatedly follows the operant behavior, and eventually punishes or reinforces that behavior.
Associative and Cognitive Learning
Associative learning:
• Classical conditioning: learning to link two stimuli in a way that helps us anticipate an event to which we have a reaction
• Operant conditioning: changing behavior choices in response to consequences
Cognitive learning: acquiring new behaviors and information through observation and information, rather than by direct experience