Chapter 6: Learning

Learning
A relatively permanent change in
knowledge or behavior that results
from experience.
Not Learning: Instincts and Reflexes
• Some behavioral responses are instinctual or
reflexive and do not reflect learning
• Fixed Action Pattern (instinct)
– A behavior that is built into an animal’s nervous system
and is triggered by a specific stimulus, even the first time
it is encountered
– All members of a particular species display the behavior
(i.e., the behavior is species-specific)
• Reflex
– An automatic response to external stimulation
– Many behavioral reflexes in infants quickly disappear
Simple (Nonassociative) Learning:
Habituation and sensitization
• Habituation
– A primitive form of learning in which a stimulus
becomes familiar to an organism after repeated exposure
and the organism stops responding
• Sensitization
– Another primitive form of learning in which a noxious
stimulus, such as an electrical shock to the tail of the
Aplysia, increases the response to a gentle touch of
another part of the snail’s body
• Both of these forms of simple learning are caused by
changes in the amount of neurotransmitter released
by the sensory neuron that is stimulated
Classical conditioning:
One form of associative learning
• Classical conditioning: A learning process in which a
previously neutral stimulus becomes associated with
another stimulus (and the response that stimulus elicits)
through repeated pairing with that stimulus; associative
learning
• Before Stimuli Are Paired
– Unconditioned Stimulus (US) elicits Unconditioned
Response (UR)
• Meat powder (US) leads to salivation (UR)
– Neutral stimulus (NS) elicits no particular response
• Bell (NS) leads to orienting response only
Pavlov’s Apparatus
• Harness and fistula (mouth tube) help keep dog in a
consistent position and gather uncontaminated saliva
samples
– They do not cause the dog discomfort
During and After Conditioning
• Conditioning Trials: Neutral Stimulus is Paired with
the Unconditioned Stimulus
– Bell (NS) rings, then meat powder (US) is delivered
– Each time this happens is called a conditioning “trial”
– Repeated several times
• After Several Trials
– When bell rings, dog salivates
– The bell is now a Conditioned Stimulus (CS)
– Salivation is now a Conditioned Response (CR)
Acquisition, Extinction, and
Spontaneous recovery
Important classical conditioning principles
• When should the CS be presented?
– Forward pairing, in which the CS precedes the US, is the most
effective type of conditioning
• Higher (or second) order conditioning
– If an additional neutral stimulus is paired with the CS, it too can be
conditioned and become a CS
• Stimulus generalization
– The tendency to respond in a conditioned manner to a stimulus that
is similar to the CS
• Stimulus discrimination
– The complement of generalization
– The ability to not respond to stimuli similar to the CS that have
been distinguished from the CS (perhaps through extinction)
Conditioned Emotional Responses
• A learned association between a neutral stimulus
and a stimulus that evokes an emotional response
• John Watson conditioned an 11-month-old boy –
named “Albert” – to fear a white laboratory rat
– Each time he reached for the rat, Watson made a loud
clanging noise right behind Albert
• Albert’s fear generalized to anything white and furry
– Including rabbits and Santa Claus
• Phobia: A fearful, anxious, irrational reaction to an
object or event that often leads to avoidance
– Can be treated using classical conditioning principles
Conditioned Taste Aversion
• If a flavor is followed by an illness experience, the
organism will not consume the flavor in the future
• John Garcia conditioned wolves to avoid the taste of
sheep
– Wolves were fed mutton that was laced with a chemical
(lithium chloride) that caused nausea
– Later, when a wolf attacked a sheep, it stopped its attack
as soon as it experienced the taste of mutton
• Before conditioning, the mutton is the neutral
stimulus and lithium chloride is the US, because the
UR (nausea) is a reflexive response to lithium
chloride, not mutton
Conditioned Immune Responses
• In rats, sugar water was paired with a drug that weakened
the immune system and later, the sugar water alone
weakened the immune system
• In humans, chemotherapy weakens the immune system and,
after it has been paired with the hospital in which the
chemotherapy takes place, just entering the hospital
can weaken the immune system
• In rats, when sugar water was paired with a bacterial agent
that increased immune system response, later the sugar
water alone increased immune system response
• In humans, when sherbet ice cream (or another neutral
stimulus) is paired with a shot of adrenalin (which increases
the immune response), later the sherbet ice cream alone will
increase the immune response.
Biological Preparedness
• All animals are biologically programmed to learn some
associations more easily than others.
• John Garcia paired light, sound, and taste stimuli
presented simultaneously to rats with
– A dose of X-rays that caused nausea hours later
– A painful shock applied to the foot
Contiguity vs. Predictability
• Classical Conditioning
may be learning that one
event predicts another
• Top graph: The US does
not happen without the
CS
– Good learning here
– Organism is not anxious
when tone is not sounded
• Bottom graph: The US
happens with or without
CS
– Poor learning here
– Organism is in a constant
state of anxiety
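One way to make the contrast between the two graphs concrete is a contingency score: the probability of the US when the CS is present minus the probability of the US when it is absent. The sketch below is only illustrative; the function name `contingency` and the trial counts are assumptions, not values from the slides.

```python
def contingency(us_with_cs, cs_trials, us_without_cs, no_cs_trials):
    """Return P(US | CS) - P(US | no CS) from counted trials."""
    p_us_given_cs = us_with_cs / cs_trials
    p_us_given_no_cs = us_without_cs / no_cs_trials
    return p_us_given_cs - p_us_given_no_cs


# Top graph: the US never happens without the CS (hypothetical counts).
top = contingency(us_with_cs=8, cs_trials=10, us_without_cs=0, no_cs_trials=10)

# Bottom graph: the US is just as likely with or without the CS.
bottom = contingency(us_with_cs=8, cs_trials=10, us_without_cs=8, no_cs_trials=10)

print(top, bottom)
```

A score near 1 means the CS is a good predictor of the US; a score near 0 means the CS carries no information, even though the CS and US still occur together equally often in both cases.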
Rescorla-Wagner Model
• When the occurrence of the US is a surprise, there is greater
classical conditioning to the CS.
– This is because the organism will seek out a stimulus (a NS) that
predicts the occurrence of the US and associate that stimulus (now
a CS) with the US
• Novel stimuli are more easily associated with a US than are
familiar stimuli
– In compound conditioning, blocking sometimes occurs
• Blocking is the failure of a stimulus to become conditioned because there is
already a (familiar) stimulus acting as the CS
– Latent inhibition
• An often-presented (familiar) neutral stimulus is unlikely to be associated
with a US after just a single trial because of the history of the organism’s
experience
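The role of surprise can be sketched with the usual Rescorla-Wagner update, in which each stimulus present on a trial gains strength in proportion to the difference between what the US supports and what all present stimuli already predict. This is a simplified sketch: a single learning rate `alpha` stands in for the model's separate salience parameters, and the stimulus names and trial counts are hypothetical.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0, strengths=None):
    """Update associative strengths over a sequence of trials.

    Each trial is (stimuli_present, us_present). `alpha` is an assumed
    learning rate; `lam` is the maximum strength the US can support.
    """
    v = dict(strengths or {})
    for stimuli, us_present in trials:
        predicted = sum(v.get(s, 0.0) for s in stimuli)
        # Surprise: what actually happened minus what was predicted.
        surprise = (lam if us_present else 0.0) - predicted
        for s in stimuli:
            v[s] = v.get(s, 0.0) + alpha * surprise
    return v


# Phase 1: the bell alone is paired with the US until it is well learned.
phase1 = rescorla_wagner([(("bell",), True)] * 30)

# Phase 2: bell + light are paired with the US. The US is no longer a
# surprise, so the light gains almost nothing (blocking).
blocked = rescorla_wagner([(("bell", "light"), True)] * 30, strengths=phase1)

# Control: the same compound trials without prior training on the bell.
control = rescorla_wagner([(("bell", "light"), True)] * 30)

print(round(blocked["light"], 3), round(control["light"], 3))
```

In the control run the two stimuli share the available strength, whereas after pretraining the bell already predicts the US and leaves little for the light to acquire.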
The Law of Effect
• E. L. Thorndike put hungry cats into “puzzle boxes”
in which a lever would open the door to food that
was visible outside of the box
• Time to escape decreased from 3 minutes to 1
minute over 25 attempts
• Behaviors that worked to escape were repeated
while other behaviors (scratching bars, sniffing
corners) decreased
• The Law of Effect: Behavior is controlled by its
consequences
Operant Conditioning:
A second form of associative learning
• Operant conditioning: Behaviors (operants) operate
on the environment and will either increase or
decrease depending on the consequences they elicit
                Pleasurable Stimulus      Aversive Stimulus
Introduce       Positive Reinforcement:   Positive Punishment:
                Increases behavior        Decreases behavior
Remove          Negative Punishment:      Negative Reinforcement:
                Decreases behavior        Increases behavior
Examples
• Positive reinforcement
– Introduction of a reward (money, hug, smile, etc.)
• Negative reinforcement
– Removal of an aversive stimulus
• Taking aspirin removes the headache, which increases the
likelihood of taking aspirin in the future
• Positive punishment
– Introduction of an aversive stimulus
• Spanking, scolding; in experiments: shock
• Negative punishment
– Removal of a positive stimulus
• Taking away the keys to the car
• Timeouts
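The four cases above reduce to two questions: is the stimulus pleasurable or aversive, and is it introduced or removed? A minimal lookup sketch follows; the function name `classify_consequence` and the string labels are assumptions chosen for illustration.

```python
def classify_consequence(stimulus, action):
    """Return (procedure, effect on behavior) for a consequence.

    stimulus: "pleasurable" or "aversive"
    action:   "introduce" or "remove"
    """
    table = {
        ("pleasurable", "introduce"): ("positive reinforcement", "increases"),
        ("pleasurable", "remove"): ("negative punishment", "decreases"),
        ("aversive", "introduce"): ("positive punishment", "decreases"),
        ("aversive", "remove"): ("negative reinforcement", "increases"),
    }
    return table[(stimulus, action)]


# Aspirin removes a headache (an aversive stimulus).
print(classify_consequence("aversive", "remove"))

# Taking away the car keys removes a pleasurable stimulus.
print(classify_consequence("pleasurable", "remove"))
```

Note that "positive" and "negative" describe whether a stimulus is added or taken away, not whether the outcome is good or bad for the learner.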
Perils of Punishment
• Learner may not understand which operant is being
punished
• Learner may come to fear the teacher (a conditioned
emotional response), rather than learn the association
between action and punishment, and then avoids the teacher
• Punishment may not undo existing rewards for a behavior
– As a result, the behavior may be hidden, but not extinguished
– Punished behavior needs to be replaced
• The punishment might be levied when the teacher is angry
– As a result, the punishment might not be appropriate
• Harsh punishment can lead to future aggression (modeling)
• Neglected children may enjoy the attention, even when it’s
negative
Partial Reinforcement Effect
• Partial Reinforcement Effect
– The tendency for a schedule of partial reinforcement to
strengthen later resistance to extinction.
– After partial reinforcement, the subject is already familiar
with the fact that not every operant is followed by a
reinforcer
Schedules of Reinforcement
• Simple
reinforcement
schedules produce
characteristic
response patterns
• Steeper lines mean
higher response
rates
• Ratio schedules
produce higher
response rates than
interval schedules
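The difference between ratio and interval schedules can be sketched by counting reinforcers for a slow and a fast responder. This models only the schedules, not the learner; the schedule sizes (every 5th response, 10 seconds) and the helper names are hypothetical.

```python
def fixed_ratio(n):
    """Reinforce every n-th response, regardless of timing."""
    count = 0

    def respond(time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response at least `seconds` after the last reinforcer."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False

    return respond


def count_reinforcers(schedule, response_times):
    """Count how many responses in the sequence are reinforced."""
    return sum(1 for t in response_times if schedule(t))


slow = [float(t) for t in range(1, 21)]   # one response per second for 20 s
fast = [0.5 * k for k in range(1, 41)]    # two responses per second for 20 s

print(count_reinforcers(fixed_ratio(5), slow), count_reinforcers(fixed_ratio(5), fast))
print(count_reinforcers(fixed_interval(10), slow), count_reinforcers(fixed_interval(10), fast))
```

Under the ratio schedule, doubling the response rate doubles the reinforcers earned; under the interval schedule it earns nothing extra. That asymmetry is consistent with ratio schedules producing higher response rates.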
Other important operant conditioning principles
• Shaping is a procedure that produces novel behaviors by
reinforcing closer and closer approximations to the desired
behavior
• Discriminative stimuli signal the fact that the consequences
of our behavior often vary from one situation to the next
– One of the keys to the flexibility and complexity of behavior
• People give and receive reinforcement and punishment in
nearly all of their interactions
– Parent punishes child by sending him to his room. Child negatively
reinforces the parental act of punishment if he no longer acts out.
• Primary vs. secondary reinforcers
– Primary: satisfy biological needs
– Secondary: are associated with primary reinforcers
• e.g., money, grades, etc.
Practical Applications of Operant Conditioning
• Clinical
– Biofeedback
– Cognitive-behavioral therapy
• Workplace
• School
– Computer-assisted instruction
– Token systems
• But must take care when rewarding an intrinsically
enjoyable task
Biofeedback & Tension Headaches
• Sensors on the head detect muscle
activity and the system converts signal
to visual display
• Patient watches the display, tries to
reduce tension, and thereby tries to
reduce activity on the display
– If the display shows reduced activity
(because the patient has reduced
tension in their mind/body), this acts
like a reinforcer
• Like getting a good grade after
studying hard
– If the display shows increased activity
(because the patient is experiencing
increased tension in their mind/body),
this acts like a punishment
• Like getting a poor grade after not
studying
– Over time, patients will learn how to
reduce tension in the mind/body
• In this way, the use of biofeedback is a
very successful practical application of
operant conditioning principles
The difference between classical
and operant conditioning
• In classical conditioning, the organism’s behavior is
reflexive and it learns a relationship between two
stimuli that precede it.
• In operant conditioning, the organism learns a
relationship between a voluntary behavior and the
consequence of that behavior, which of course
occurs after the behavior.
The traditional view of conditioning . . .
• Organisms learn specific responses to specific
stimuli.
• If there is anything internal to the organism that has
an effect on behavior, these effects can be more
simply explained by appealing to the principles by
which stimuli are associated with each other or with
responses.
• S-R psychology
Is supplemented by a cognitive
perspective on conditioning
• But researchers discovered several situations in
which the traditional view was an insufficient
explanation of behavior
• These observations indicated that psychologists
needed to consider factors (motivation, cognition,
beliefs, expectations) that are internal to the organism
– Motivational perspective on learning
– Cognitive perspective on learning
• Latent learning of cognitive maps, locus of control, and learned
helplessness
– Observational learning
A Motivational Perspective on Learning
• Drives are internal stimuli that create an unpleasant
state of tension. In turn, this tension impels the organism
into activity in order to reduce the tension. Drive strength
determines the vigor and persistence of that activity.
– Hunger, Thirst, Sex, Aggression
• Drive reduction theory: A reinforcer is reinforcing because
it reduces drives.
– If a rat is hungry (primary drive) and it receives food (primary
reinforcer) as a result of a certain action, then hunger is reduced
and the behavior increases.
– If a person needs money (a secondary drive) and they receive
money (secondary reinforcer) as a result of a certain action, then
the need for money is reduced and the behavior increases.
Motivational Perspective on Learning cont.
• Escape learning
– Organisms can learn to make a response that terminates
an aversive stimulus (negative reinforcement)
• Avoidance learning
– Organisms can learn to make a response in order to
prevent an aversive stimulus from even starting to make
the organism feel discomfort.
• Where’s the negative reinforcement in avoidance
learning?
– Avoidance is escape from fear.
• An internal state is an important factor in learning.
Latent Learning and Cognitive Maps
• Rats: one maze trial/day
• One group found food
every time (red line)
• Second group never
found food (blue line)
• Third group found food
on Day 11 (green line)
– Sudden change, day 12
• The essence of
conditioning is not the
modification of behavior,
it’s the modification of
knowledge.
Locus of control
• Behavior is not so much influenced by the actual
relationship between behavior and its consequences
as it is by the subjective belief about that
relationship, called an expectancy.
• Generalized expectancies are a dimension of one’s
personality.
– Internal locus of control: I am the master of my own
destiny.
• If I study harder I can influence my grades.
– External locus of control: My fate is determined by
forces outside of my control.
• My grade will be arbitrarily determined by the questions
that are selected on the exam.
Learned helplessness
• Dogs are placed in a harness and then shocked.
• Half the dogs can escape from the shock; half cannot.
• After several trials, the dogs that can escape the shock learn
to do so very rapidly (escape learning), while those that
cannot do not even attempt to escape.
• Then the dogs from both groups are placed in a “shuttle
box” where they can escape a shock by jumping over a
barrier.
• The dogs that had been in the inescapable harness do not
attempt to escape.
– They have learned to be helpless, even in their new situation.
– They have an expectancy that nothing they do can help them
escape the shock.
– This is an animal model for depression in humans.
Observational Learning I
• Observational Learning
– Learning that occurs when one observes the behavior of
others
• Albert Bandura conducted important experiments showing that
nursery school students would behave more aggressively when
they observed an aggressive adult.
• Monkeys raised in a laboratory that were unafraid of snakes
observed the fear response of wild monkeys towards snakes and
then became fearful of snakes themselves (Mineka et al., 1984).
– Vicarious conditioning
• Learning the consequences of a behavior by observing the
consequences of somebody else’s behavior
• Even though an organism may not behave in a manner that it
has seen punished, this does not mean that the organism has not
learned (acquired) the behavior
Observational Learning II
• Modeling
– A common term used to describe one organism’s imitation of the
behavior of another organism (i.e., the model)
– Such imitation is often unconscious.
– Mirror neurons
• When an organism performs an action, these special neurons become
activated, even though they are not themselves responsible for the
performance of the action.
• When a second organism watches the first organism perform the action, the
analogous set of mirror neurons in the second organism becomes activated,
even though the second organism does not actually perform the action.
• Mirror neurons may be responsible for storing an action plan
– When the organism performs the action, mirror neurons may direct other
neurons that are directly responsible for the action
» The action plan stored in mirror neurons may also be used to provide
feedback to the neurons that are directly responsible for the action about
possible discrepancies between the action plan and the performed action
Biological Mechanisms of Reward
• Nucleus accumbens
– Subcortical brain structure that is part of the limbic
system
– The pleasure center
• When stimulated, the organism feels pleasure
– Responsible for the pleasure that you feel when you eat,
drink, engage in sexual behaviors, use recreational drugs,
and even when you receive a secondary reinforcer (such
as money)
• Dopamine is the principal neurotransmitter in the
nucleus accumbens
– Drugs that block the effects of dopamine disrupt operant
conditioning
The neuronal basis of learning
• Donald Hebb: Learning results from alterations in synaptic
connections
– Cells that fire together, wire together.
• When the firing of one neuron excites another neuron, the synapse becomes
strengthened such that there is an increase in the likelihood that when the
first neuron fires again, so will the second neuron.
– Also known as long-term potentiation (LTP)
• Artificial stimulation of a first neuron increases the likelihood that further
stimulation of that neuron will activate a second neuron
• Caused by increased sensitivity of the post-synaptic neuron to the
stimulation
– Compare with habituation and sensitization, where the amount of
neurotransmitter released into the synapse changes.
– NMDA receptors
• Special receptors that open only if two neurons fire at the same time
– Like Hebb argued
• Required for LTP
• Genetically altering these receptors in mice led to super smart “Doogie
mice”
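Hebb's rule is often written as a weight change proportional to the product of the two neurons' activity. The sketch below assumes binary activity (1 = fired, 0 = silent) and a hypothetical learning rate; it illustrates the rule and is not a model of LTP itself.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Strengthen the synapse only when both neurons are active together."""
    return weight + rate * pre * post


w = 0.2  # assumed starting synaptic strength
for pre, post in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_update(w, pre, post)

print(round(w, 2))
```

Only the two trials on which both neurons fire change the weight, which mirrors the requirement that NMDA receptors open only when the two neurons fire at the same time.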
Computerized Neural Networks
• Computer models of learning
in which individual pieces of
information can be
graphically represented by
nodes (see Figure 6.24)
• These models can behave
like human beings
• They emphasize
strengthening connections
between nodes that are used
when a particular stimulus is
input and weakening
connections that are not used
– Like Hebb argued
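The strengthening and weakening of connections can be sketched with a small dictionary of weights between nodes. The update rule, the `gain` and `decay` values, and the node names are assumptions for illustration, not the rule used by any particular model named in this chapter.

```python
def update_connections(weights, active_nodes, gain=0.2, decay=0.1):
    """Strengthen connections whose two nodes are both active; weaken the rest."""
    updated = {}
    for (a, b), w in weights.items():
        if a in active_nodes and b in active_nodes:
            updated[(a, b)] = w + gain * (1.0 - w)  # used: move toward 1
        else:
            updated[(a, b)] = w * (1.0 - decay)     # unused: move toward 0
    return updated


weights = {("bell", "food"): 0.5, ("light", "food"): 0.5}

# A stimulus is input that activates the "bell" and "food" nodes only.
weights = update_connections(weights, active_nodes={"bell", "food"})

print(weights)
```

After one input, the bell-food connection is stronger and the unused light-food connection is weaker.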
Pandemonium Model
Interactive Activation Model