Chapter 5
Learning and Behaviour
Slide 1
The Big Picture
This chapter focuses on the manner in which we learn to
behave in certain ways given certain environmental
conditions
The emphasis will be primarily on stimulus-response
mappings, and how they are formed
There will be very little discussion of cognitive states
or processes … which contrasts quite strongly with the
methods that are more popular today
Chapter 5 - Learning and Behaviour
Slide 2
The Starting Place - UCSs & UCRs
We come equipped with many stimulus-response mappings
that simply reflect our machinery in action … for example:
> When we put food in our mouths, digestive processes
are initiated
> If a projectile is coming at our face we close our eyes,
duck our heads, raise our hands, and sometimes hold
our breath
These associations are the product of evolution (or creation)
and the components of them are labeled as unconditioned
stimuli (UCS) and unconditioned responses (UCR)
> Food (UCS) -> Digestive Process (UCR)
Slide 3
Habituation - Weakening the SR Mapping
The occurrence of some novel stimulus in the environment
(UCS) tends to lead to a startle response (UCR).
However, if the stimulus occurs repeatedly without any
positive or negative consequence, the startle response stops
occurring.
This is a process called habituation … as examples of it
consider:
(1) Those weird house noises you no longer hear
(2) Airplanes at my old place
Basically, if the UCR proves itself unnecessary in the
presence of some UCS … the UCR may occur less and less
Slide 4
Classical Conditioning - The Extension of SR Mappings to New Stimuli
In 1904, a Russian scientist named Ivan Pavlov
stumbled across an interesting phenomenon while
studying how the canine digestive system worked.
This phenomenon has come to be called classical
conditioning, and it explains how new stimuli can come
to be associated with certain behavioural responses.
Pavlov is now known as one of the most influential
figures in psychology, and his experiments helped to
start the wave of behaviourism that ruled psychology
for many years.
Slide 5
Pavlov’s Experiment - Baseline
At the beginning of the experiment, if a bell was rung
near the dog it did not salivate.
Slide 6
Pavlov’s Experiment - Baseline
However, if food (UCS) was presented to the dog, it
would salivate (UCR)
> Food (UCS) -> Salivation (UCR)
Slide 7
Pavlov’s Experiment - Conditioning
Over a number of trials, the bell (the CS, or conditioned
stimulus) is rung just before the food is delivered
> Bell (CS) -> Food (UCS) -> Salivation (UCR)
Slide 8
Pavlov’s Experiment - Testing
After a number of conditioning trials, if the CS is
presented alone, it will typically lead to a conditioned
response (CR) … which is similar in form, if not degree, to
the unconditioned response
> Bell (CS) -> Salivation (CR)
Slide 9
Classical Conditioning Overview
In order for classical conditioning to be effective, the
UCS must reliably follow the CS.
If this association is not strong throughout the conditioning
phase, the learning will be weak.
If the association between the CS and UCS is terminated
after conditioning … the CR will eventually not occur
in response to the CS - a process called extinction.
However, an extinguished CS-CR mapping can become active
again quickly if the CS/UCS association becomes strong
again (rapid reacquisition). The CR can also reappear on its
own after a rest period - a process called spontaneous recovery.
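The acquisition and extinction pattern described on this slide can be illustrated with a toy simulation. This is only a sketch under assumptions that are not part of the lecture: associative strength is treated as a single number between 0 and 1, and each trial moves it a fixed fraction (`learning_rate`, a hypothetical parameter) toward 1 when the UCS follows the CS, or toward 0 when the CS appears alone.

```python
def run_trials(strength, ucs_present, n_trials, learning_rate=0.3):
    """Return the CS-CR associative strength after each trial.

    Assumption (not from the lecture): strength moves a fixed fraction of
    the way toward 1.0 when the UCS follows the CS, and toward 0.0 when
    the CS is presented alone.
    """
    target = 1.0 if ucs_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += learning_rate * (target - strength)
        history.append(strength)
    return history


# Conditioning phase: the bell (CS) is reliably followed by food (UCS).
acquisition = run_trials(0.0, ucs_present=True, n_trials=10)

# Extinction phase: the bell is now presented without food.
extinction = run_trials(acquisition[-1], ucs_present=False, n_trials=10)

print(f"after conditioning: {acquisition[-1]:.2f}")
print(f"after extinction:   {extinction[-1]:.2f}")
```

A model this simple only shows strength rising with reliable pairing and falling without it. It does not reproduce the return of an extinguished response.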
Slide 10
Operant Conditioning
The term “operant” refers to the notion that humans learn
from operating on their environment. We behave, then
note the consequences and use them to modulate future
behaviour.
The famous cat torturer Edward Thorndike was one of the
first to study operant conditioning. Early on, his research
focused on “learning by trial and accidental success”
Through this, he formed the Law of Effect which states that
a behaviour that is followed by a positive consequence will
tend to be repeated --- note similarity to evolution theory.
Slide 11
Behavior Analysis & B. F. Skinner
Skinner strongly championed the experimental study
of the Law of Effect, and he made strong claims to its
application to human behaviour -- Walden Two
He invented a number of devices for studying operant
conditioning … the most famous being the operant
chamber or “Skinner Box”
This device allows the experimenter to control a number
of environmental stimuli, and allows him to deliver either
rewards (most typical) or punishment.
Slide 12
Basic Skinner Box
> Lights
> Speaker
> Lever
> Food hopper (reward)
> Floor that can be electrified (punishment)
Slide 13
Measuring Behaviour
Behaviour is often measured in terms of rate of responding
(i.e., number of responses within some period of time)
Skinner came up with a response recorder apparatus that
allowed him to record each response over time. This device
is called a cumulative recorder because it keeps track of the
total number of responses over time.
Thus, the effects of variables on the response rate could be
measured allowing one to see if certain variables strengthen
(i.e. increase) the response of interest, or weaken (i.e.,
decrease) the response of interest.
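As a rough illustration of what a cumulative record and a response rate look like in numbers, here is a minimal sketch. The per-minute counts are made up for the example, and the function names are hypothetical rather than anything from Skinner's apparatus.

```python
from itertools import accumulate


def cumulative_record(responses_per_minute):
    """Running total of responses, one entry per minute."""
    return list(accumulate(responses_per_minute))


def response_rate(responses_per_minute):
    """Mean number of responses per minute over the whole session."""
    return sum(responses_per_minute) / len(responses_per_minute)


# Hypothetical per-minute response counts under two conditions.
condition_a = [2, 3, 3, 4, 4, 5]
condition_b = [4, 6, 7, 7, 8, 8]

print(cumulative_record(condition_a))
print(cumulative_record(condition_b))
print(response_rate(condition_a), response_rate(condition_b))
```

On a plotted cumulative record, a higher response rate shows up as a steeper line, so a variable that strengthens the response makes the curve climb faster.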
Slide 14
Graph from a Cumulative Recorder
[Figure: cumulative record plotting Number of Responses (0 to 70) against Time (Minutes, 1 to 12), with two curves labelled "Reward each response" and "Reward each 2nd response"]
Slide 15
Notion of a Three-Term Contingency
Skinner described any behavioural event in terms of
three parts:
(1) The preceding event, which usually involves the
presentation of a discriminative stimulus
(2) The behavioural response to the discriminative
stimulus
(3) The following event, which represents the consequence
of our behaviour
e.g., training killer whales at Sea World
Slide 16
Ways of Altering Behaviour
Positive Reinforcement - A given behaviour tends to
increase in frequency if it is followed by an appetitive
(desirable) stimulus.
Negative Reinforcement - A given behaviour also tends to
increase in frequency if it is reliably followed by the
termination of an aversive (undesirable) stimulus.
Punishment - A given behaviour tends to decrease in
frequency if it is reliably followed by an aversive stimulus
Response Cost - A given behaviour tends to decrease in
frequency if it is reliably followed by the termination of an
appetitive stimulus
Extinction - The reduction of a behaviour if it is not reinforced
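One way to keep the first four terms straight is as a two-by-two lookup: which kind of stimulus is involved, and whether it is presented or terminated after the behaviour. The sketch below is just a memory aid, with a hypothetical function name. Extinction is left out because it involves withholding reinforcement rather than presenting or terminating a stimulus.

```python
def classify_consequence(stimulus, change):
    """Name the procedure for a consequence that reliably follows a behaviour.

    stimulus: "appetitive" or "aversive"
    change:   "presented" or "terminated"
    Returns (procedure name, expected effect on behaviour frequency).
    """
    table = {
        ("appetitive", "presented"): ("positive reinforcement", "increase"),
        ("aversive", "terminated"): ("negative reinforcement", "increase"),
        ("aversive", "presented"): ("punishment", "decrease"),
        ("appetitive", "terminated"): ("response cost", "decrease"),
    }
    return table[(stimulus, change)]


# Example: a loud alarm (aversive) stops when the seat belt is fastened.
print(classify_consequence("aversive", "terminated"))
```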
Slide 17
Shaping Behaviours
Shaping - Teaching an organism to learn a new behaviour
through successive approximation.
In the case of a dolphin learning a new trick, this involves
first rewarding behaviours that are very generally consistent
with the trick … then altering the criterion for reward,
making it more and more specific to the trick
What about a prof learning karate, or someone learning a
new language?
Slide 18
Intermittent Reinforcement
Refers to situations in which not every occurrence of a
response is reinforced. The rule for which responses are
reinforced is termed the schedule of reinforcement
Fixed-ratio: The animal is rewarded after making
some set number of responses - leads to bursts of behaviour
Variable-ratio: Same as above except the reward is delivered on
average every so many responses - leads to rapid and
constant responding (slot machines)
Fixed-interval: The first response after a set period
of time has passed is reinforced - leads to responding just
before the reinforcer is due
Variable-interval: Same as above except the period varies
randomly - leads to slow, steady responding
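The four schedules differ only in the rule that decides which responses earn a reinforcer, and that rule can be sketched in a few lines. The code below is a minimal illustration, not a standard implementation: the function names, the uniform random draws used for the variable schedules, and the `seed` parameter are all assumptions made for the example.

```python
import random


def ratio_schedule(n_responses, ratio, variable=False, seed=0):
    """Return the (1-based) response numbers that earn a reinforcer.

    Fixed-ratio: every `ratio`-th response is reinforced.
    Variable-ratio (assumed rule): the requirement is redrawn after each
    reinforcer, uniformly from 1 to 2*ratio - 1, so it averages `ratio`.
    """
    rng = random.Random(seed)

    def next_requirement():
        return rng.randint(1, 2 * ratio - 1) if variable else ratio

    reinforced, required, count = [], next_requirement(), 0
    for response in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(response)
            required, count = next_requirement(), 0
    return reinforced


def interval_schedule(response_times, interval, variable=False, seed=0):
    """Return the times of the responses that earn a reinforcer.

    Fixed-interval: the first response made at least `interval` time units
    after the previous reinforcer is reinforced.
    Variable-interval (assumed rule): the wait is redrawn after each
    reinforcer, uniformly from 0 to 2*interval, so it averages `interval`.
    """
    rng = random.Random(seed)

    def next_wait():
        return rng.uniform(0, 2 * interval) if variable else interval

    reinforced, available_at = [], next_wait()
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next_wait()
    return reinforced


responses = range(0, 60, 2)  # hypothetical: one response every 2 time units

print(ratio_schedule(20, ratio=5))                       # fixed-ratio 5
print(ratio_schedule(20, ratio=5, variable=True))        # variable-ratio 5
print(interval_schedule(responses, interval=10))         # fixed-interval 10
print(interval_schedule(responses, interval=10, variable=True))  # variable-interval 10
```

Note that this only models the experimenter's side, that is, when reinforcers are delivered. The response patterns listed on the slide (bursts, steady responding and so on) are what animals do under each rule, and are not simulated here.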
Slide 19
Resistance to Extinction
A behaviour that has been learned on an intermittent
reinforcement schedule is much more resistant to
extinction than one that has been rewarded more
consistently
The higher the ratio of responses to reinforcers, the higher
the resistance
What does all this suggest about gambling?
Slide 20
Generalization and Discrimination
In classical conditioning, generalization refers to the extent
to which a stimulus similar to the CS can elicit the CR.
In operant conditioning, generalization refers to the extent
to which a stimulus similar to the discriminative stimulus
elicits a response.
Animals can learn to both generalize, and simultaneously,
to discriminate. In a sense, they learn to categorize stimuli
into those that should be responded to, and those that are
not worth the effort.
> Pigeon learning human concept
> Why are weddings so stressful?
Slide 21
The Importance of Secondary Reinforcers
Most operant conditioning experiments use primary
reinforcers during learning (e.g. food, pain).
However, much of our learning in the real world is
more affected by secondary reinforcers (e.g., money, smiles,
“pats on the back”, compliments).
These secondary reinforcers gained their importance via
good old classical conditioning … being predictive of the
primary reinforcers, which act as UCSs.
Without secondary reinforcers we would be focused only on
short-term responding, and would not learn very complex
sets of behaviours … sociopaths?
Slide 22
Conditioning Complex Behaviours
Our society contains many means to shape behaviour
via aversive consequences (e.g., fines, jails).
Punishment is an effective means of changing behaviour
and it often leads to fairly immediate results … which
reinforces the punisher of course.
Society cannot always control the positive reinforcers
present in some situation, but it can control the aversive
consequences.
Slide 23
Current Research with Humans
Rules & Reinforcers
Human behaviour is often an interaction between
reinforcers and rules.
Rules are descriptions (often inaccurate) of behaviours that
will be rewarded or punished in various ways
Often people will obey the rules (instructions) originally,
but then modify their behaviour in accordance with the
reinforcers.
> textbook example
> rolling stop example
Slide 24
Current Research with Humans
Drug Use and Abuse
Behavioural Psychopharmacology is the study of how drugs
influence behaviour.
In this area, Skinner’s three-term contingency translates into:
(1) drugs, (2) their effects on behaviour, and
(3) their reinforcing effects
As it turns out, most psychoactive drugs act as strong
reinforcers in both humans and animals
The drugs animals most prefer correlate with those most abused
by humans - Monkey coke-heads
Slide 25
A Few Final Thoughts
Note once again the lack of attention to “thought” in
all of this
In response to attempts at artificial intelligence, Skinner
remarked that the important question is not whether machines
can be made to think, but whether humans think
Consider this in light of the “Conditioning to Kill” situation
and its possible links to post-traumatic stress disorder.
Think also of the quote from “How the Mind Works”
Slide 26