Download Classical Conditioning - UCI Cognitive Science Experiments

Learning Part II
Overview
• Habituation
• Classical conditioning
• Instrumental/operant conditioning
• Observational learning
Thorndike and Law of Effect
• Classical Conditioning considers only involuntary
reflexes. How are voluntary responses learned?
• Thorndike proposed the Law of Effect
– If a response (behavior) is not rewarded, it will be
weakened
– If a response (behavior) is rewarded, it will be
strengthened
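A toy simulation can make the Law of Effect concrete. This is only a sketch, not Thorndike's own model: the linear update rule, the step size `rate`, and the names `update_strength` and `run_trials` are assumptions chosen for illustration.

```python
def update_strength(strength, rewarded, rate=0.2):
    """Nudge response strength toward 1 if rewarded, toward 0 otherwise.

    The linear update rule and the default rate are illustrative assumptions.
    """
    target = 1.0 if rewarded else 0.0
    return strength + rate * (target - strength)


def run_trials(outcomes, start=0.5, rate=0.2):
    """Return the response strength after each trial in `outcomes`."""
    strengths = []
    s = start
    for rewarded in outcomes:
        s = update_strength(s, rewarded, rate)
        strengths.append(s)
    return strengths


# A response that is always rewarded grows stronger trial by trial;
# one that is never rewarded grows weaker.
rewarded_run = run_trials([True] * 10)
unrewarded_run = run_trials([False] * 10)
```

Under these assumptions the strength changes a little on every trial rather than jumping at once, which is the kind of gradual change the Law of Effect describes.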
Edward Thorndike
1874-1949
Video: Thorndike’s puzzle box (~2 min.)
(for a copy of this video, see: http://www.youtube.com/watch?v=BDujDOLre-8)
Thorndike’s results: gradual learning
Learning curves demonstrate
that learning is gradual and
incremental. There is no
evidence that the cats have a
sudden insight into the
problem’s solution.
Successive trials in puzzle box
Instrumental Learning
Operant Conditioning
Skinner developed the operant chamber
(“Skinner box”) with a bar or key that an
animal manipulates to obtain a food or water
reinforcer.
B.F. Skinner
1904-1990
Video: examples of operant conditioning (~1 min.)
In Classical Conditioning, a response is elicited
by the US and CS.
The response is involuntary and has no effect on
the external environment.
The association is between the CS and US.
In Operant Conditioning, a response is
emitted.
The response is voluntary and is referred to as
an operant, behavior that brings about some
change in one’s environment.
The association is between the response and
the reinforcement.
Video: discrimination in operant conditioning and
schedules of reinforcement (~3 min)
http://www.youtube.com/watch?v=I_ctJqjlrHA
Types of Reinforcement Schedules
• Interval schedules:
– Fixed Interval: Reinforcer is available only after
some fixed time has passed since the last reward
– Variable Interval: Same as fixed interval, except that
the time between available reinforcers is varied.
• Ratio schedules:
– Fixed Ratio: Reinforcer is presented after a fixed
number of responses.
– Variable Ratio: The number of responses needed for
a reinforcer varies.
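The four schedules above can be sketched as small Python classes. This is a minimal illustration, not a standard library or an experimental protocol: the class names, the `respond(t)` method, and the uniform distributions used for the variable schedules are all assumptions.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses that averages mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean_n - 1, so the mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response at least `interval` after the last reward."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, t):
        if t - self.last_reward >= self.interval:
            self.last_reward = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the waiting time is redrawn after each reward."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last_reward = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean_interval, so the mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last_reward >= self.wait:
            self.last_reward = t
            self.wait = self._draw()
            return True
        return False


def count_rewards(schedule, response_times):
    """Count how many of the responses at the given times are reinforced."""
    return sum(schedule.respond(t) for t in response_times)


times = range(1, 61)  # one response per second for a minute
fr_rewards = count_rewards(FixedRatio(5), times)
fi_rewards = count_rewards(FixedInterval(10), times)
vr_rewards = count_rewards(VariableRatio(5, random.Random(0)), times)
vi_rewards = count_rewards(VariableInterval(10, random.Random(0)), times)
```

In this sketch, responding faster earns more rewards on the ratio schedules, while on the interval schedules extra responses before the interval has elapsed earn nothing.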
(Figure: cumulative number of responses)
What kind of reinforcement schedule?
• Reprimanding a child if you have to ask
him to clean his room three times
Fixed ratio
• Getting a raise every two years
Fixed interval
• Playing a lottery game
Variable ratio
• Your boss checks your work periodically
but you do not know when she might
come in next time
Variable interval
Shaping:
A desired behavior, even if complex, can be obtained with an
operant training method known as successive approximations.
Shaping stretching up
Pigeons learning to play a game
Explaining Superstitious Behavior
• In one experiment, Skinner (1948) delivered food every
15 seconds to pigeons in a Skinner box
– Result: some birds engaged in odd idiosyncratic
behavior, pecking aimlessly in a corner or walking in
circles
– Pigeons might have learned an accidental correlation:
they associated a random behavior with the
delivery of food
• Could this explain superstitious behavior in humans?
Contingency in Operant Conditioning
Reward only appears to work if the animal has some
apparent control over when the reward is delivered.
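One common way to put a number on contingency, which the slide itself does not give, is the difference between the probability of reward after a response and the probability of reward without one. The sketch below assumes that measure; the function name `delta_p` and the example trial data are hypothetical.

```python
def delta_p(trials):
    """Return P(reward | response) - P(reward | no response).

    `trials` is a sequence of (responded, rewarded) boolean pairs.
    """
    trials = list(trials)
    with_resp = [rewarded for responded, rewarded in trials if responded]
    without_resp = [rewarded for responded, rewarded in trials if not responded]
    if not with_resp or not without_resp:
        raise ValueError("need trials both with and without a response")
    return sum(with_resp) / len(with_resp) - sum(without_resp) / len(without_resp)


# Made-up data: reward usually follows the response and never comes without it.
controlled = [(True, True)] * 8 + [(True, False)] * 2 + [(False, False)] * 10

# Made-up data: reward arrives half the time regardless of what the animal does.
uncontrolled = (
    [(True, True)] * 5 + [(True, False)] * 5
    + [(False, True)] * 5 + [(False, False)] * 5
)

controlled_contingency = delta_p(controlled)
uncontrolled_contingency = delta_p(uncontrolled)
```

A value near zero means the animal's behavior makes no difference to the reward, which is the situation in which reward appears not to work.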
Contingency and learned helplessness
If a dog is first given shocks that it cannot control, it
will take no action to escape shocks presented in a
new situation where escape is possible. The
phenomenon has been described as learned
helplessness.
Phase 1
Phase 2
(Seligman, 1975)
What makes a reinforcer?
• Primary reinforcers
– meet primary needs: food, water, warmth
• Secondary reinforcers
– money, tokens, grades
• Social reinforcers
– Hugs, smiles, words of approval, even attention
– Chimpanzees, in studies like that of Butler (1954), will
press a bar to get a glimpse of the experimenter.
• Sometimes, there appears to be no reinforcer and
behavior might be driven by intrinsic motivation
Video: an example of learned behavior: temper tantrums (~1.5 min.)
http://www.youtube.com/watch?v=KpSfThUv_pc
Problems for Behaviorist Theories
• Learning without reinforcement
– mental representation
• Biological predispositions
– one-trial learning
– limitations on stimulus-response associations
• Observational learning
Edward C. Tolman
1886-1959
Cognitive Behaviorist
Findings imply that the rats
learned a cognitive map of the
maze without any external
reward: latent learning
More evidence for cognitive maps
Rats quickly mastered task (A).
Rats were then placed in a new maze (B) where the
main straightaway was blocked. Amazingly,
they did not choose paths 9 or 10 (a generalization
strategy) but path 5, which points in the direction
where the food was previously located.
(Figure: maze diagrams with START and FOOD locations labeled)
Biological Predispositions
• Do the laws of learning of classical and operant
conditioning really apply equally well to all types of
animals and all types of stimuli?
• Species specific learning:
– Birds easily associate illness with visual cues (e.g.,
color of food), but not with taste
– Rats easily associate illness with taste, but not with
visual cues
Specificity of Taste Aversion
(Garcia & Koelling, 1966)
Implications
• The behaviorists held that general laws of learning
shape the behavior of all animals, regardless of a
particular creature's evolutionary history or biological
makeup
• Garcia’s findings suggest that animals are "biased
learning machines" designed by evolutionary forces to
forge meaningful links between some stimuli but not
others
Observational Learning
• Many animals can learn simply by example, without
direct reinforcement
• vicarious conditioning
• imitation
• Observational learning can occur after one exposure
• Imitation can be a source of undesired behaviors (Bobo
Doll experiment) as well as a source of new skills.
Video: Bobo Doll Experiment
(Bandura, 1969; ~2 min)
See also this article:
http://www.nytimes.com/2013/08/25/opinion/sunday/does-mediaviolence-lead-to-the-real-thing.html?hp&_r=0
For a similar video see: http://www.youtube.com/watch?v=hHHdovKHDNU&feature=related
Negative observational learning
• Evidence that exposure to media violence is associated
with aggressive behavior in children
– Challenge is to distinguish between correlation and causation
OTHER
Reinforcement Learning Tie Ins
• https://www.youtube.com/watch?v=iNL5-0_T1D0
Interesting Article
In a First, Experiment Links Brains of Two Rats
By JAMES GORMAN
Published: February 28, 2013
In an experiment that sounds straight out of a science fiction movie, a Duke neuroscientist has connected the brains of two rats in such a way that when one moves to
press a lever, the other one does, too — most of the time.
The neuroscientist, Miguel Nicolelis, known for successfully demonstrating brain-machine connections, like the one in which a monkey controlled a robotic arm with its
thoughts, said this was the first time one animal’s brain had been linked to another.
The question, he said, was: “Could we fool the brain? Could we make the brain process signals from another body?” The answer, he said, was yes.
He and other scientists at Duke, and in Brazil, published the results of the experiment in the journal Scientific Reports. The work received mixed reviews from other
scientists, ranging from “amazing” to “very simplistic.”
Much of Dr. Nicolelis’s work is directed toward creating a full exoskeleton that a paralyzed person could operate with brain signals. Although this experiment is not directly
related, he said, it helps refine the ability to read and translate brain signals, an important part of all prosthetic devices connected to the brain, and an area in which brain
science is making great advances.
He also speculated about the future possibility of a biological computer, in which numerous brains are connected, and views this as a small step in that direction.
The experiment involved extensive training for both rats, with water as a reward. One, the so-called encoder rat, learned to press one of two levers, left or right, in
response to a light signal over the correct lever.
The second, or decoder rat, also learned to press either the left or right lever in response to light, but then went on to respond instead to brain stimulation from his rat
partner.
For the experiment, recording electrodes were implanted in the primary motor cortex of the encoder rat and stimulating electrodes in the same area in the decoder rat.
Then, as the encoder responded to the light appearing over one lever or the other, its pattern of brain activity was sent to a computer, which simplified the pattern for
transmission to the decoder rat. The signal received by the decoder was not the same as the stimulation it had previously received in training, Dr. Nicolelis said.
Seven out of 10 times, the decoder rat pressed the right lever.
The researchers reported similar results in other experiments, based on whether the rats sensed a narrow or wide opening with their whiskers. In this case the electrodes
were implanted in a different part of the brain, where sensory signals are received.
Ron D. Frostig, a neuroscientist at the University of California, Irvine, said, “I think it’s an amazing paper.” He described it as a “beautiful proof of principle” that information
could be transferred from one brain to another in real time — not by mind-reading or telepathy, but a transfer of what might be called the impulse to act.
Andrew B. Schwartz, a neuroscientist at the University of Pittsburgh, was less impressed. He described the work as “very simplistic” and pointed out that the rat receiving
the signal pushed the right lever only 7 out of 10 times and would have done so 5 out of 10 times by chance.
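The comparison with chance can be worked through with a binomial calculation. The article reports a success rate, not the number of trials, so the trial counts of 10 and 100 below are purely hypothetical, and the function name `prob_at_least` is an assumption for illustration.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical trial counts: the same 70% success rate is unremarkable
# over 10 trials but very unlikely to arise by guessing over 100 trials.
p_7_of_10 = prob_at_least(7, 10)
p_70_of_100 = prob_at_least(70, 100)
```

With only 10 trials, pure guessing would produce 7 or more correct presses about 17% of the time; how far a 70% rate departs from chance therefore depends on how many trials were run.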
There was an additional twist to the research. Dr. Nicolelis added a touch of international drama by locating one rat at Duke, in North Carolina, and another in Natal,
Brazil. Similarly, in his earlier work, he had a monkey in North Carolina operate a robotic arm in Japan.
The distance does not change the essential science, but adds some difficulty to the experiment, because the signals sent from one brain to the other had to go through an
Internet connection.
This article has been revised to reflect the following correction:
Correction: February 28, 2013
An earlier version of this article misstated the university where Ron D. Frostig works. It is the University of California, Irvine, not the University of California, Davis.
Connecting Two Rat Brains
• http://www.youtube.com/watch?v=y5bdGH-0wF8
• http://www.youtube.com/watch?v=nAdgxiwoHME
• http://www.youtube.com/watch?v=F4ImQ4qUJ5k
Movie doesn’t play? BraintoBrainRat.mov
Mirror Neurons
Acquiring Knowledge
• Learning involves more than a change in behavior; it
also involves the acquisition of new knowledge: latent
learning