Lectures 8 & 9 - Operant Conditioning

Categories Of Behavior
Voluntary or operant
– Unconditioned: looking, babbling, crawling
– Conditioned: reading, writing, fence jumping
Involuntary or respondent
– Unconditioned: pupillary response to bright light, GSR response to loud noise
– Conditioned: GSR when telling a lie, blushing
CLASSICAL CONDITIONING
• Context of embarrassing situation -> blushing
• Odor of food that once made you sick -> nausea
• Sight of parent while raiding cookie jar -> fear
Edward L. Thorndike
1874-1949
Thorndike’s Puzzle Box
John B. Watson, father of Behaviorism
“Give me a dozen healthy infants, well-formed, and my own specified world to
bring them up in and I’ll guarantee to take
any one at random and train him to
become any type of specialist I might
select--doctor, lawyer, artist, merchant-chief, and, yes, even beggar-man and thief,
regardless of his talents, penchants,
tendencies, abilities, vocations, and
race of his ancestors” (Watson, 1925).
B.F. Skinner
1904-1990
Skinner Box
Pigeon in Operant Chamber
“SKINNER” BOX
What Operant Conditioning can achieve
through Shaping
The Method of Successive Approximations
Classical Conditioning:
US & CS elicit an involuntary response
US -> UR
CS -> CR
Instrumental Conditioning:
Voluntary response produces a reinforcer (reward)
R -> SR
Classical Conditioning
= Pavlovian Conditioning
= Type S Conditioning
Instrumental conditioning
= Operant conditioning
= Trial and error conditioning
= Type R conditioning
Type S vs. Type R
Conditioning
LAW OF EFFECT
•Thorndike: Responses that are followed by a pleasurable
effect are stamped in; responses followed by unpleasant
(painful) events are stamped out.
• Skinner: The rate of emitting responses that are followed
by a positive reinforcer is increased; the rate of responses
followed by a negative reinforcer is decreased.
• Thorndike: Responses trained by trial and error.
• Skinner: Responses shaped by method of successive
approximation.
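As a toy illustration of the two statements of the Law of Effect above, the sketch below treats response strength as a number between 0 and 1 that is nudged up after a positive reinforcer and down after a negative one. The update rule, the `step` parameter, and the names `update_strength` and `run_trials` are assumptions made for illustration; they are not taken from Thorndike or Skinner.

```python
def update_strength(strength, outcome, step=0.1):
    """Return the new response strength after one response.

    strength: current tendency to emit the response, between 0 and 1.
    outcome:  +1 if a positive reinforcer followed the response,
              -1 if a negative reinforcer followed it, 0 if nothing followed.
    step:     hypothetical learning-rate parameter (an assumption).
    """
    if outcome > 0:
        # "Stamped in": move strength a fraction of the way toward 1.
        return strength + step * (1.0 - strength)
    if outcome < 0:
        # "Stamped out": move strength a fraction of the way toward 0.
        return strength - step * strength
    return strength


def run_trials(strength, outcomes, step=0.1):
    """Apply update_strength once per outcome and return the full history."""
    history = [strength]
    for outcome in outcomes:
        strength = update_strength(strength, outcome, step)
        history.append(strength)
    return history
```

Calling `run_trials(0.5, [1, 1, 1])` gives a rising sequence, and `run_trials(0.5, [-1, -1])` a falling one, which is all the sketch is meant to show.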
Instrumental Conditioning
• Doing chores -> money
• Doing chores -> praise
• Telling a lie to avoid blame -> avoidance
• Putting on a coat to remove chill -> removal
• Getting a speeding ticket -> punishment
Basic Conditioning Procedures
• Instrumental conditioning
– Type R conditioning
– Operant conditioning
– Trial and Error Learning
• Pavlovian Conditioning
– Type S Conditioning
– Respondent Conditioning
TYPES OF REINFORCERS
Positive
• Primary [S+R]: food, drink, odors
• Secondary [S+r]: approval, money
Negative
• Primary [S-R]: loud noise, shock, bright light
• Secondary [S-r]: angry look, bad grade, fine
INSTRUMENTAL CONDITIONING
(Type R)
• 2-term contingency:
–response -> reinforcement
–R -> SR
–(bar press) -> (food)
• Nature of reinforcer can vary:
– Positive - S+R, S+r
– Negative - S-R, S-r
– Primary - S+R, S-R
– Secondary - S+r, S-r
CONTINGENCIES OF REINFORCEMENT:
R-> S+R
Reward training (primary reinforcement)
R-> S-R
Punishment (primary reinforcement)
R-> S+r
Positive secondary reinforcement
R-> S-r
Negative secondary reinforcement
R -> removes -> S-R
Escape training
R -> postpones -> S-R
Avoidance training
R -> prevents -> S+R
Omission training
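The contingencies listed above can be read as a lookup from (what the response does, kind of stimulus) to a training procedure. The sketch below is only an illustration: the names `CONTINGENCIES` and `name_contingency` are hypothetical, the primary/secondary distinction is ignored, and the omission-training entry assumes the usual definition (the response prevents delivery of a positive reinforcer).

```python
# Keys pair the effect of the response with the stimulus involved:
# "S+" for a positive reinforcer, "S-" for a negative reinforcer.
CONTINGENCIES = {
    ("produces", "S+"): "reward training",
    ("produces", "S-"): "punishment",
    ("removes", "S-"): "escape training",
    ("postpones", "S-"): "avoidance training",
    # Assumption: omission training = response prevents a positive reinforcer.
    ("prevents", "S+"): "omission training",
}


def name_contingency(effect, stimulus):
    """Return the procedure name, or "not listed" for other combinations."""
    return CONTINGENCIES.get((effect, stimulus), "not listed")
```

For example, `name_contingency("removes", "S-")` returns `"escape training"`.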
Two-term contingency is typically “occasioned” by a
discriminative stimulus (SD)
• SD: R -> SR
• light: bar press -> food
• no light: bar press -> no food
• Nature of discriminative stimuli can vary:
–exteroceptive
–proprioceptive
–interoceptive
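The light / no light example above can be sketched as a small function. This is a minimal illustration, not a model of real behavior; the names `bar_press_outcome` and `count_food` are hypothetical.

```python
def bar_press_outcome(light_on, bar_pressed):
    """Discriminative operant: SD (light) : R (bar press) -> SR (food)."""
    if not bar_pressed:
        return "no food"  # no response was emitted, so nothing is reinforced
    # The response is reinforced only in the presence of the light.
    return "food" if light_on else "no food"


def count_food(trials):
    """Count reinforced trials in a list of (light_on, bar_pressed) pairs."""
    return sum(
        1 for light_on, bar_pressed in trials
        if bar_press_outcome(light_on, bar_pressed) == "food"
    )
```

With `trials = [(True, True), (False, True), (True, False)]`, only the first trial is reinforced.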
Is Punishment Effective?
FUNCTIONS OF A STIMULUS:
Eliciting
(US->UR, CS->CR)
Reinforcing
(S+R, S-R, S+r, S-r)
Discriminative
(SD: R -> SR; SΔ: R -> no SR)
Discriminative Operant:
• SD: R -> SR
• SΔ: R -> no SR
Types Of Discriminative Stimuli
• Exteroceptive: Stimuli from outside the body, detected by the sensory organs.
• Proprioceptive: Stimuli generated by muscles and
tendons, e.g., doing something by “feel” - knowing where
you are in the dark
•Interoceptive: Stimuli generated by internal organs that
are innervated by the autonomic nervous system.
Skinner’s Theory of Chaining
S^D_{n-3}:R_{n-3} -> S^{r/D}_{n-2}:R_{n-2} -> S^{r/D}_{n-1}:R_{n-1} -> S^{r/D}_n:R_n -> S^R
(R_{n-3} = turn, R_{n-2} = press, R_{n-1} = approach, R_n = seize)
Stimuli used in Hull’s experiment on
concept formation
Schedules Of Reinforcement
• Number (Ratio)
–n responses -> SR
• Time (Interval)
– First response after t seconds -> SR
Basic Schedules:
• Fixed Ratio (FR)
• Variable Ratio (VR)
•Fixed Interval (FI)
•Variable Interval (VI)
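The four basic schedules can be sketched as small objects that decide, response by response, whether a reinforcer is delivered. This is a minimal sketch under stated assumptions: the class names and the `respond(now)` interface are hypothetical, and the variable schedules draw their requirement from a uniform distribution around the mean, which is one of several possible choices.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        # `now` is unused; it keeps the interface the same for all schedules.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: the required number of responses varies around a mean of n."""

    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2n-1, which has mean n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI t: first response at least t seconds after the last reinforcer."""

    def __init__(self, t):
        self.t = t
        self.last = 0.0  # time of the last reinforcer (session starts at 0)

    def respond(self, now):
        if now - self.last >= self.t:
            self.last = now
            return True
        return False


class VariableInterval:
    """VI t: like FI, but the required wait varies around a mean of t."""

    def __init__(self, mean_t, rng=None):
        self.mean_t = mean_t
        self.rng = rng if rng is not None else random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2t seconds, which has mean t.
        return self.rng.uniform(0.0, 2.0 * self.mean_t)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = self._draw()
            return True
        return False
```

For example, `FixedRatio(3)` reinforces the 3rd and 6th of six responses, while `FixedInterval(10)` ignores responses made before 10 seconds have elapsed since the last reinforcer.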
Skinner Box
Cumulative Record
[Figure: cumulative records showing no responses, a constant rate, and an accelerating rate]
Schedules of Reinforcement
[Figure: panels for Fixed Ratio, Variable Ratio, Variable Interval, and Fixed Interval]
Cumulative Records of Typical Schedule Performance
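A cumulative record plots the running total of responses against time, so a flat line means no responding and a steeper line means a higher response rate. The sketch below computes such a record from a list of response times; the function name `cumulative_record` and the `step` sampling parameter are assumptions for illustration.

```python
def cumulative_record(response_times, duration, step=1.0):
    """Return the cumulative response count sampled every `step` seconds.

    response_times: times (in seconds) at which responses occurred.
    duration:       length of the session in seconds.
    The result has one entry per sample point, from time 0 to `duration`.
    """
    times = sorted(response_times)
    record = []
    emitted = 0  # number of responses counted so far
    n_steps = round(duration / step)
    for k in range(n_steps + 1):
        t = k * step
        # Count every response that has occurred up to and including time t.
        while emitted < len(times) and times[emitted] <= t:
            emitted += 1
        record.append(emitted)
    return record
```

A session with no responses gives a flat record, evenly spaced responses give a straight line, and responses that bunch up toward the end give an accelerating curve.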
Skinner’s “Theory” of
Instrumental Conditioning
• Two-term contingency: R -> SR
• Nature of reinforcer can vary: R -> S [S+R, S+r, S-R, S-r].
• 3-term contingency (Discriminative operant)
SD : R -> SR (light: bar press -> food)
SΔ : R -> no SR (no light: bar press ≠ food)
• Chaining of discriminative operants:
S^D_{n-3}:R_{n-3} -> S^{r/D}_{n-2}:R_{n-2} -> S^{r/D}_{n-1}:R_{n-1} -> S^{r/D}_n:R_n -> S^R
•Nature of discriminative stimulus can vary:
–Exteroceptive
–Interoceptive
–Proprioceptive
Skinner’s “Theory”
(cont.)
•Contingency of reinforcement can vary: R -> S±R(r)
• Schedule of reinforcement can vary: Rn/t -> S±R
– subject must emit n responses within a particular time
frame t.
• Verbal Behavior. Behavior that is reinforced by a member of
one’s verbal community.
• Private events. Discriminative responding to proprioceptive
or interoceptive stimuli (stimuli under our skin). Sd : r -> Sr
or Sd : r -> Sr.
Descartes:
“I think, therefore I am.”
Pascal:
“The heart has reasons that
reason will never know.”
Skinner [& Freud (& Terrace)] On Consciousness
• Consciousness is a proper subject matter for psychology but it is
not an explanation of behavior. It is what has to be explained (e.g.,
Tom hit Bill because Tom felt angry).
– Why did Tom feel angry?
– How did Tom know he was angry?
• Consciousness vs. Awareness:
–Animals are aware of objects (but only fleetingly).
–Humans are conscious of objects (because they can name them).
Skinner [& Freud (& Terrace)] On Consciousness
•Consciousness develops because it enhances the social fabric of the
verbal community. It provides us with a sense of “other minds”, another
person’s hunger, pain, fear, rage, sadness, truthfulness, etc. In this
sense, consciousness is adaptive.
–Internal states are inferred by an adult (“You seem hungry.”)
• Feedback about private events is not as precise as feedback for
tacting public events.
• Discriminative control of inner states (tacting) becomes autonomous
with experience.
Verbal Behavior
•Verbal Behavior. Behavior that is reinforced by a member of one’s
verbal community.
• Mands (“demands”), a 2-term contingency:
– verbal response -> SR [“baba” -> bottle]
• Tacts [tactus (Latin, “touch”)], a 3-term contingency:
– SD: verbal response -> Sr
– [Sight of Tom’s apple]: Mary: “May I please have an apple?” -> [Tom gives Mary an apple.]
Verbal Behavior (cont.)
Examples of discriminative control of verbal behavior:
– echoic behavior:
*Mother says [“dog”]: “dog” -> “good”
–textual behavior:
*Printed word [dog]: “dog” -> “good”
–transcription:
*Write the word [d-o-g]: d-o-g -> “good”
–intraverbal responses:
*Printed word [c-h-i-e-n]: “dog” -> “bien”
*“How are you?”: “Fine thanks” -> “good”
*Printed letters [Na]: “sodium” -> “good”
*“3 x 3”: “9” -> “good”
Animal Learning Lab-200C Schermerhorn Hall
Problems with Classical
Conditioning
The Equipotentiality principle does not hold:
some stimuli belong together (taste + nausea), and some do not
(sound + nausea)
Learned taste aversion with long CS - US
intervals:
conditioning occurs even when the US (nausea) occurs several
hours after the CS (e.g., rabbit meat).