Learning

CLASSICAL VS. OPERANT CONDITIONING
With classical conditioning you can teach a dog to salivate, but you cannot teach it to roll over. Why?
Classical conditioning involves involuntary/automatic behaviors
• Sweating, getting sick, getting nervous, salivating
Operant conditioning involves voluntary behavior shaped by its consequences
• Reinforcers increase a behavior, punishers decrease it
OPERANT
CONDITIONING
B.F. SKINNER
MASTERMIND
Learning based on consequence!!!
EDWARD THORNDIKE
THE LAW OF EFFECT
Behavior changes due to its consequences
Rewards lead to recurrence of the behavior
If a behavior leads to discomfort, it is less likely to recur
REINFORCERS
ANYTHING THAT INCREASES A BEHAVIOR
Positive Reinforcement:
• The addition of something pleasant
• Sheldon trains Penny (2:45)
Negative Reinforcement:
• The removal of something unpleasant
EXAMPLES OF REINFORCEMENT
The situation
Billy: Could you tie my shoes?
Dad: (Continues to read the paper)
Billy: Dad, I need my shoes tied!
Dad: Uh, yeah, just a minute
Billy: DAAAAAD! TIE MY SHOES!!
Dad: How many times have I told you not to whine? Now, which shoe do we tie first?
What is reinforced?
It depends on whose perspective you view the situation from.
• Billy: Positive reinforcement. Whining gets his dad’s attention.
• Dad: Negative reinforcement. Tying the shoes eliminates Billy’s whining.
TWO TYPES OF
NEGATIVE REINFORCERS
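• If you hate going to class, you learn how to remove the unpleasant stimulus
• Escape Learning: ending the unpleasant stimulus once it has started (getting kicked out of class)
• Avoidance Learning: preventing the unpleasant stimulus before it starts (cutting class)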
POSITIVE OR NEGATIVE
REINFORCEMENT?
• Putting your seatbelt on to eliminate the beeping noise.
• Faking sick to avoid a Psych test.
• Studying to alleviate test anxiety.
• Breaking out of jail to gain freedom.
• Taking aspirin when you have a headache. (Negative: you would repeat the behavior to eliminate the pain.)
• Getting a kiss for doing the dishes.
HOW DO WE ACTUALLY USE
OPERANT CONDITIONING?
Shaping: reinforcing small steps on the way to the desired behavior.
Chaining: performing a number of responses in succession to get the reward.
The point of shaping is to mold a single behavior; the goal of chaining is to link behaviors to create a complex activity.
PRIMARY V. SECONDARY
REINFORCERS
Primary Reinforcer
• Things that are rewarding in themselves.
Secondary Reinforcer
• Things we have learned to value.
• Money is a generalized reinforcer (it can be traded for almost anything)
TOKEN ECONOMY
Every time a desired behavior is performed, a token is given.
Tokens are traded for prizes/rewards
Used in homes, prisons, mental institutions, schools
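The bookkeeping behind a token economy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TokenEconomy`, its methods, the reward "extra recess", and its price of 3 tokens are all hypothetical and not from the slides.

```python
class TokenEconomy:
    """Tracks tokens earned for desired behaviors and traded for rewards."""

    def __init__(self, prices):
        # reward name -> cost in tokens (hypothetical values)
        self.prices = dict(prices)
        self.tokens = 0

    def record_desired_behavior(self):
        """Give one token each time the desired behavior is performed."""
        self.tokens += 1

    def trade(self, reward):
        """Exchange tokens for a reward; return True if the trade went through."""
        cost = self.prices[reward]
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True


economy = TokenEconomy({"extra recess": 3})
for _ in range(4):
    economy.record_desired_behavior()

first_trade = economy.trade("extra recess")   # 4 tokens on hand, cost is 3
second_trade = economy.trade("extra recess")  # only 1 token left
print(first_trade, second_trade, economy.tokens)  # True False 1
```

The token itself is a secondary reinforcer: it only matters because it can be traded for something the person already values.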
CONTINUOUS V. PARTIAL
REINFORCEMENT
Continuous
• Reinforce the behavior EVERY TIME it is exhibited.
• Usually done when the subject is first learning to make the association.
• Acquisition comes really fast, but so does extinction.
Partial
• Reinforce the behavior only SOME of the times it is exhibited.
• Acquisition comes more slowly.
• But the behavior is more resistant to extinction.
INTERMITTENT REINFORCEMENT:
RATIO SCHEDULES
Fixed Ratio
• Provides a reinforcement after a SET number of responses.
• For every 5 pounds I lose, I get a manicure!
• For every 3 college essays you write, you watch 1 hour of TV
Variable Ratio
• Provides a reinforcement after a RANDOM (unpredictable) number of responses.
• Gambling & lottery
• Most resistant to extinction (hard to walk away)
INTERMITTENT REINFORCEMENT:
INTERVAL SCHEDULES
Fixed Interval
• Requires a SET amount of time to pass before giving the reinforcement.
• She gets a manicure for every 7 days she stays on her diet.
• PAYCHECK EVERY 2 WEEKS!
Variable Interval
• Requires a RANDOM amount of time to pass before giving the reinforcement.
• Pop quiz
• Randomly checking email throughout the day
CANDY FOR HOMEWORK
• Fixed-interval: You get candy for every 3 days you do your homework.
• Variable-interval: You get candy after 3 days, then after 4 days, then after 2 days.
• Fixed-ratio: You get candy after every 3 attempts.
• Variable-ratio: You get candy after 4 attempts, then after 2 attempts (although that may take days or weeks).
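The four schedules can be compared side by side with a small Python sketch. It is a simplified model, not a standard implementation: the function names, the `low`/`high` bounds for the variable schedules, and the "two homework attempts per day" scenario are all assumptions made for illustration. Interval schedules here reinforce the first response made once the required time has passed.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g. candy after every 3 attempts)."""
    count = 0

    def respond(day):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(low, high, rng):
    """Reinforce after a response count redrawn at random from low..high."""
    count, target = 0, rng.randint(low, high)

    def respond(day):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, rng.randint(low, high)
            return True
        return False

    return respond


def fixed_interval(days):
    """Reinforce the first response once `days` have passed since the last reward."""
    last = 0

    def respond(day):
        nonlocal last
        if day - last >= days:
            last = day
            return True
        return False

    return respond


def variable_interval(low, high, rng):
    """Like fixed_interval, but the wait is redrawn at random from low..high days."""
    last, wait = 0, rng.randint(low, high)

    def respond(day):
        nonlocal last, wait
        if day - last >= wait:
            last, wait = day, rng.randint(low, high)
            return True
        return False

    return respond


# Hypothetical scenario: two homework attempts per day for six days.
attempts = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]

fr = fixed_ratio(3)
fi = fixed_interval(3)
fr_rewards = [day for day in attempts if fr(day)]
fi_rewards = [day for day in attempts if fi(day)]
print(fr_rewards)  # [2, 3, 5, 6]
print(fi_rewards)  # [3, 6]
```

Under a ratio schedule, working faster earns candy sooner (four rewards here); under an interval schedule, extra attempts within the interval earn nothing (two rewards). The variable versions behave the same way except that the learner cannot predict which response will pay off.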
PUNISHMENT
MEANT TO DECREASE A BEHAVIOR.
Positive Punishment
• Addition of something unpleasant
Negative Punishment
• Removal of something pleasant
Punishment works best when it immediately follows the behavior!
USES AND ABUSES OF
PUNISHMENT
The wrong kinds of punishment will not work, for 4 reasons:
1. The person being punished learns to discriminate between environments
• What you get punished for at home, you may not get caught doing at school
2. Physical punishment increases aggressiveness (modeling)
• How would you solve a problem at school if you see hitting at home?
3. Punishment triggers fear
• Why tell the truth if I know what’s coming…
4. Punishment is often applied unequally and doesn’t address the behavior.
• He damaged my tree, and he missed hockey
MAKING PUNISHMENT WORK
To make punishment work, it should…
• be given immediately
• be limited in time & intensity
• clearly target the behavior, not the person
The most effective punishment is often negative punishment.
IT’S ALL IN THE WAY YOU PHRASE IT
Instead of:
“Clean your room or you do not get dinner”
Try:
“You’re welcome to join us for dinner once your room is clean”
What punishment often teaches is how to avoid it.
Premack Principle: Using a preferred activity to reinforce an activity that is not preferred
• You love Twitter, but hate homework
• When you finish your homework, you can go on Twitter