Download Reinforcement - wbphillipskhs

I CAN
• Explain key features of operant conditioning (OC)
– Positive Reinforcement
– Negative Reinforcement
– Omission Training (Negative Punishment)
– Punishment (positive punishment)
• Distinguish the Schedules of Reinforcement
Behaviorists
Believe infants are born with only three instinctive responses:
1. Fear
2. Rage
3. Love
All other behaviors are developed during life through learning
Copyright © Allyn & Bacon 2007
Four Kinds of Consequences
• Present a positive (appetitive) stimulus: Positive Reinforcement
– Bonus for working hard leads to more hard work
• Remove a negative (aversive) stimulus: Negative Reinforcement
– Aspirin curing a headache leads to more aspirin use
• Present a negative (aversive) stimulus: Punishment
– Getting a speeding ticket leads to less speeding
• Remove a positive (appetitive) stimulus: Omission Training
– Missing dinner leads to less staying out late
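The four consequences above form a two-by-two grid: the kind of stimulus (appetitive or aversive) crossed with what is done with it (present or remove). A minimal Python sketch of that grid follows. The function names `classify_consequence` and `effect_on_behavior` are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus: str, action: str) -> str:
    """Name the consequence for a stimulus type and what is done with it.

    stimulus: "appetitive" (pleasant) or "aversive" (unpleasant)
    action:   "present" or "remove"
    """
    table = {
        ("appetitive", "present"): "positive reinforcement",
        ("aversive", "remove"): "negative reinforcement",
        ("aversive", "present"): "punishment (positive punishment)",
        ("appetitive", "remove"): "omission training (negative punishment)",
    }
    return table[(stimulus, action)]


def effect_on_behavior(consequence: str) -> str:
    """Reinforcement makes the behavior more likely; punishment makes it less likely."""
    return "increases" if "reinforcement" in consequence else "decreases"


# Aspirin removes a headache (an aversive stimulus), so aspirin use goes up.
kind = classify_consequence("aversive", "remove")
print(kind, "->", effect_on_behavior(kind))
```

Note that "negative" here refers to removing a stimulus, not to a bad outcome: negative reinforcement still increases the behavior.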
Why Punishment Doesn’t Work
Punishment…
1. …usually loses its power when the threat of punishment is removed
2. …often triggers aggression or escape
3. …may increase apprehension in the learner, inhibiting the learning of new and better responses
4. …is often unfair and applied unequally
When Does Punishment Work?
• It must be immediate
• It must be certain and consistent
• It should be limited in duration and intensity
• It should clearly target the behavior, not the person
• It should be limited to the situation in which the response occurred
• It should not send mixed messages (“I can hit you, but you can’t hit others”)
• Negative punishment is the most effective
Alternatives to Punishment
• Extinction
• Reinforcing preferred activities
– The Premack Principle
• Prompting and shaping
The Skinner Box
An operant chamber (the Skinner box): a testing device programmed to deliver reinforcers and punishers dependent upon an animal’s behavior
Primary Reinforcers
• Reinforcers that have an innate basis because of their biological value to an organism
• Food
• Sleep
• Sex
• Air
• Water
Secondary Reinforcers
• Stimuli that acquire their reinforcing power by their learned association with primary reinforcers
• Virtually any stimulus can become a secondary reinforcer
• Money
• Awards
• Praise
• Grades
• Success
• Power
Premack Principle
• The concept that a preferred activity can be
used to reinforce a less preferred one
• Example: A teacher lets kids run around
(preferred activity) to reinforce a less
preferred one (sitting still and listening)
Reinforcement
Continuous Reinforcement
A reinforcement schedule in which all correct
responses are reinforced
Possible Problems:
1. Correct responses can be missed, causing confusion
2. The reward typically loses its reinforcing quality over time
Reinforcement
Intermittent (or Partial) Reinforcement
A reinforcement schedule in which some, but
not all, correct responses are reinforced
Resistant to extinction
Reinforcement
• Extinction
In operant conditioning, a process by which
a response that has been learned is
weakened by the absence or removal of
reinforcement
How does this differ from extinction in
classical conditioning?
Extinction
Operant Conditioning
• A learned response is weakened by the removal or absence of reinforcement
• A. If a child has learned that crying will get it a toy, withhold the toy
• B. If a child cries for attention, simply ignore the child until the crying stops
Classical Conditioning
• The CR (dog salivating) is eliminated by repeated presentations of the CS (bell/tone) without the UCS (food)
• A reversal of a learned response by withholding the UCS
Shaping
A technique in which responses progressively closer to the desired response are reinforced
Example: Getting a scared child to slide down a high slide
Begin at the bottom, and gradually go higher up the slide with each turn until the child is at the top.
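The slide example can be read as a rule: reinforce any attempt that meets the current target, then raise the target one step. A minimal Python sketch of that rule follows. The function name `shape`, the step numbering, and the one-step increment are assumptions made for illustration and are not part of the original slides.

```python
def shape(attempt_heights: list[int], top: int) -> list[bool]:
    """Reinforce attempts that meet the current criterion, then raise the criterion.

    attempt_heights: the step of the slide the child starts from on each turn
    top:             the highest step (the desired final behavior)
    """
    criterion = 1  # assumption: begin by reinforcing the lowest step
    reinforced = []
    for height in attempt_heights:
        if height >= criterion:
            reinforced.append(True)
            # Ask for a little more next time, but never more than the top.
            criterion = min(criterion + 1, top)
        else:
            reinforced.append(False)
    return reinforced


print(shape([1, 1, 2, 3, 3, 4], top=4))
# [True, False, True, True, False, True]
```

An attempt that would have earned praise earlier (repeating step 1 or step 3) goes unrewarded once the criterion has moved on, which is what pushes behavior toward the top of the slide.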
Schedules of Reinforcement
• 1. Ratio Schedules
Provide a reward after a certain number of
responses (Ratio = number)
• 2. Interval Schedules
Provide reward after a certain time interval
Fixed Ratio (FR)
Variable Ratio (VR)
Fixed Interval (FI)
Variable Interval (VI)
Schedules of Reinforcement
Fixed Ratio (FR)
Rewards appear after a certain set number of responses
Example: A factory worker gets paid after every 10 cases of a product are completed
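A fixed ratio schedule can be sketched in a few lines of Python: every `ratio`-th response is rewarded, and nothing else is. The function name `fixed_ratio` is hypothetical and used only for illustration.

```python
def fixed_ratio(ratio: int, responses: int) -> list[bool]:
    """For each response in order, report whether it earns a reward under FR-`ratio`."""
    return [n % ratio == 0 for n in range(1, responses + 1)]


# The factory worker example: paid after every 10 completed cases.
rewards = fixed_ratio(ratio=10, responses=30)
print(sum(rewards))  # 3 paydays for 30 cases
```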
Schedules of Reinforcement
Variable Ratio (VR)
The number of responses required for a reward (reinforcement) varies
Examples:
• Telemarketers never know how many calls it takes to make a sale
• Slot machine pay-offs
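A variable ratio schedule can be sketched by redrawing the required number of responses after every reward. In the Python sketch below, the function name `variable_ratio`, the uniform draw between 1 and `2 * mean_ratio - 1`, and the fixed seed are all assumptions made for illustration; real slot machines and sales calls follow no such tidy rule.

```python
import random


def variable_ratio(mean_ratio: int, responses: int, seed: int = 0) -> list[bool]:
    """For each response in order, report whether it earns a reward under a VR schedule."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    # Assumption: requirement drawn uniformly from 1..(2*mean_ratio - 1),
    # which averages out to mean_ratio.
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    rewards = []
    for _ in range(responses):
        count += 1
        if count >= required:
            rewards.append(True)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)
    return rewards


calls = variable_ratio(mean_ratio=5, responses=20)
print(["sale" if hit else "-" for hit in calls])
```

The learner cannot tell which response will pay off, only that more responses mean more rewards.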
Schedules of Reinforcement
Fixed Interval (FI)
The time period between rewards remains constant
Examples:
• Weekly paycheck
• Quarterly school grades
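A fixed interval schedule depends on elapsed time rather than on the number of responses. The Python sketch below uses the usual laboratory form of the rule, in which the first response made after the interval has elapsed is rewarded and the clock then restarts. The function name `fixed_interval` and the assumption that the clock starts at time 0 are illustrative only.

```python
def fixed_interval(interval: float, response_times: list[float]) -> list[bool]:
    """For each response time, report whether that response earns a reward under FI."""
    last_reward = 0.0  # assumption: the clock starts at time 0
    rewards = []
    for t in response_times:
        if t - last_reward >= interval:
            rewards.append(True)
            last_reward = t  # the interval restarts at the rewarded response
        else:
            rewards.append(False)
    return rewards


# Checking for a weekly paycheck on days 1, 3, 7, 8, 13, 14 and 15.
print(fixed_interval(7, [1, 3, 7, 8, 13, 14, 15]))
# [False, False, True, False, False, True, False]
```

Responding early earns nothing extra, so only the responses on days 7 and 14 are rewarded.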
Schedules of Reinforcement
Variable Interval (VI)
Rewards appear after a certain amount of time, but that amount varies
Examples:
• Random visits from the boss, who delivers praise
• Fishing
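A variable interval schedule works like a fixed interval schedule, except that the waiting time is redrawn after every reward. In the Python sketch below, the function name `variable_interval`, the uniform draw between 0 and `2 * mean_interval`, the start time of 0, and the fixed seed are assumptions made for illustration.

```python
import random


def variable_interval(mean_interval: float, response_times: list[float],
                      seed: int = 0) -> list[bool]:
    """For each response time, report whether that response earns a reward under VI."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    last_reward = 0.0  # assumption: the clock starts at time 0
    # Assumption: waits drawn uniformly from 0..2*mean_interval (average: mean_interval).
    wait = rng.uniform(0, 2 * mean_interval)
    rewards = []
    for t in response_times:
        if t - last_reward >= wait:
            rewards.append(True)
            last_reward = t
            wait = rng.uniform(0, 2 * mean_interval)
        else:
            rewards.append(False)
    return rewards


# Casting a fishing line once a minute for half an hour.
casts = [float(minute) for minute in range(1, 31)]
print(sum(variable_interval(mean_interval=10, response_times=casts)), "fish caught")
```

Because the next reward could become available at any moment, steady responding is the only way to avoid missing it.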
Operant and Classical Conditioning Compared
• Classical conditioning involves the association of two stimuli (UCS + CS) before the response or behavior
• It is largely a response to past stimulation and ends with the response
• Operant conditioning involves a reinforcing (reward) or punishing stimulus after a response or behavior
• It is directed at attaining some future reinforcement or avoiding punishment and requires a stimulus that follows the response
CAN I?
• Explain key features of OC
– Positive Reinforcement
– Negative Reinforcement
– Omission Training (Negative Punishment)
– Punishment (positive punishment)
• Distinguish the Schedules of Reinforcement