Burrhus Frederic Skinner
(1904 - 1990)
Chapter 5
Burrhus Frederic Skinner
1. Born Mar. 20, 1904, in Susquehanna, Pennsylvania.
2. Earned his PhD from Harvard (1931).
3. Wanted to become a writer but, disappointed to find he had nothing to write about, became a great psychologist instead.
Burrhus Frederic Skinner
4. Wrote The Behavior of Organisms (1938) and Walden Two (1948), the latter after Thoreau's Walden.
5. Taught at the University of Minnesota (1936-48).
6. Chaired the psychology department at Indiana University (1945-48).
7. Returned to Harvard (1948-90).
Burrhus Frederic Skinner
8. Beyond Freedom and Dignity (1971).
9. About Behaviorism (1976).
10. Upon Further Reflection (1987).
11. Continued to publish until the end of his life in journals like the Analysis of Behavior (1989).
Burrhus Frederic Skinner
12. Made great contributions to learning and education.
13. Contributed to the study of child development.
14. Project ORCON (ORganic CONtrol).
15. Died in 1990.
Comparison

Operant Conditioning:
- Skinnerian or operant conditioning.
- Type R conditioning.
- The reinforcing stimulus is contingent upon a response: S → R → S (Food).

Respondent Conditioning:
- Classical, Pavlovian, or respondent conditioning.
- Type S conditioning.
- The reinforcing stimulus is contingent upon a stimulus: S → S (Food) → R.
Comparison Continued

Operant Conditioning:
- Responses are emitted to a known reinforcer.
- Conditioning strength = rate of response.

Respondent Conditioning:
- Responses are elicited by a known stimulus.
- Conditioning strength = response magnitude.
Theoretical Differences

Functionalists (Edward Thorndike, Burrhus Skinner):
Concentrated on responses as they brought about consequences: S → R → S.

Associationists (Ivan Pavlov, Edwin Guthrie):
Concentrated on stimuli as they brought about responses: S → S → R.
Radical Behaviorism
1. Behavior cannot be explained on the basis of drive, motivation, and purpose; all of these take psychology back to its mentalistic roots.
2. Behavior has to be explained on the basis of consequences (reinforcements, punishments) and environmental factors. This, Skinner proposed, was the backbone of all scientific psychology.
Principles of Operant Learning
1. We need to know what is reinforcing for the organism. Finding a reinforcer is a process of selection and can be difficult; reinforcers related to bodily conditions, such as food and water, are easy to determine.
2. This reinforcement will predict the response.
3. Reinforcement increases the rate of responding.
Operant Chambers
Skinner devised operant chambers for rats and pigeons to study behavior in a controlled environment. Operant chambers provide opportunities to control reinforcements and other stimuli.
Magazine Training
1. At the beginning of this training the rat is deprived of food for 23 hours (a deprivation procedure) and placed in the operant chamber.
2. The experimenter presses a hand-held switch, which makes a clicking sound (secondary reinforcer), and a food pellet (primary reinforcer) drops into the food magazine.
3. The rat learns to associate the clicking sound with the food pellet.
Magazine Training
4. To train the rat to come to the food magazine and eat, the experimenter presses the switch when the rat is near the magazine. After a few trials the rat associates the clicking sound with the arrival of food and stays close to the magazine to eat.

[Figure: operant chamber showing the lever, food pellet, and food magazine.]
Shaping
1. To train the rat to press the lever and get food, the experimenter shapes the rat's behavior. Shaping involves reinforcing the rat (with the secondary reinforcer) for behaviors that approximate the target behavior, i.e., coming closer and closer to the lever and finally pressing it. This procedure is called successive approximation.
2. To shape lever-pressing behavior, differential reinforcement can also be used. In this procedure only lever-pressing behaviors are reinforced, not others.
Cumulative Responses

[Figure: cumulative recording. As the paper moves, the pen steps up once per response; a flat trace marks the operant level, and one response and a second response appear as successive steps over time.]
Cumulative Responses

Responding rate: a slow rate of responding produces a shallow trace; a rapid rate of responding produces a steep trace.
Cumulative Responses

[Figure: cumulative records from Sniffy, each record spanning 75 responses.]
Extinction

S (lever) → R (lever-pressing response) → S (food)

Remove the reinforcement (food) and the lever-pressing behavior is extinguished.
Cumulative Responses

[Figure: cumulative record during extinction. With no food delivered, responding flattens back to the operant level over time.]
Spontaneous Recovery
Just as in classical conditioning, we have spontaneous recovery: a rest period after extinction reinstates the lever-pressing response in the animal.

[Figure: cumulative responses across trials. Responding declines during extinction; after extinction and rest, it reappears as spontaneous recovery.]
Discrimination Learning
The organism can be conditioned to discriminate between two or more stimuli. A discriminative operant is a response that is emitted specifically to one stimulus (SD) but not the other (SΔ).

Discriminative Stimulus | Response | Reinforcement
Light "ON" (SD) | Press lever | Food
Light "OFF" (SΔ) | Lever not pressed | No food
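The contingency in the table above can be sketched as a tiny Python function (a hedged illustration; the function and argument names are invented, not from the slides): reinforcement is delivered only when the response occurs in the presence of the SD, never under the SΔ.

```python
def discriminative_operant(light_on, lever_pressed):
    """Deliver food only when the lever is pressed while the
    discriminative stimulus (light 'ON', the SD) is present; a press
    under light 'OFF' (the S-delta) goes unreinforced."""
    if light_on and lever_pressed:
        return "food"
    return "no food"

print(discriminative_operant(True, True))    # SD + lever press -> food
print(discriminative_operant(False, True))   # S-delta + lever press -> no food
```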
Secondary Reinforcement
"Any neutral stimulus paired with a primary reinforcer (e.g., food or water) takes on reinforcing properties of its own" (Hergenhahn & Olson, 2001) and is called a secondary reinforcer. Thus, all discriminative stimuli are secondary reinforcers.
Generalized Reinforcers
1. A secondary reinforcer can become a generalized reinforcer when paired with a number of primary reinforcers. Money is thus a generalized reinforcer, for it is associated with primary reinforcers like food, drink, and mates.
2. The secondary reinforcer is similar to Allport's (1961) idea of functional autonomy. First there is activity for reinforcement, but then the activity by itself becomes reinforcing; e.g., one joins the merchant navy for money but comes to enjoy sailing for its own sake.
Chaining
A discriminative stimulus (SD) occasions a response (R); the response produces a secondary reinforcer (SR) that serves as the discriminative stimulus (SD) for the next response, and so on, until the final response (R) is followed by primary reinforcement.

Many stimuli (SD) → orients (R) → sight of lever (SR/SD) → approaches lever (R) → contact with lever (SR/SD) → presses bar (R) → food pellet (SR)

Similar to Guthrie's movement-produced stimuli.
Reinforcement & Punishment
If a response is followed by a reinforcer, the response increases; if it is followed by a punisher, the response decreases.
Reinforcement

Reinforcer | Contingency | Example | Behavior
Primary | Positive | Doing work gets food | Work increases
Secondary | Positive | Studying books gets good grades | Studying increases
Primary | Negative | Heater proximity avoids cold | Heater proximity increases
Secondary | Negative | Waking early avoids traffic | Waking early increases
Punishment

Punisher | Contingency | Example | Behavior
Primary | Positive | Working with electricity gets a shock | Working with electricity decreases
Secondary | Positive | Insulting the boss gets a reprimand | Insulting the boss decreases
Primary | Negative | Quarrelsome behavior loses food | Quarrelsome behavior decreases
Secondary | Negative | Coming home late means no going out | Coming home late decreases
Consequences & Contingencies

Consequence | Positive contingency | Negative contingency
Reinforcement | Behavior increases | Behavior increases
Punishment | Behavior decreases | Behavior decreases

Like Thorndike, Skinner believed that positive reinforcement strengthened behavior but punishment did not weaken behavior.
Estes's Punishment Experiment

[Figure: cumulative responses (up to 500) across three extinction sessions for a "no reinforcement" group and a "no reinforcement + punishment" group.]
Punishment
1. Produces unwanted emotional byproducts (generalized fears).
2. Conveys no information to the organism.
3. Justifies inflicting pain on others.
4. Unwanted behaviors reappear in its absence.
5. Elicits aggression towards the punishing agent.
6. One unwanted behavior appears in place of another.
Punishment
Why punishment? It reinforces the punisher!

Alternatives to Punishment
1. Do not reinforce the unwanted behavior.
2. Let the individual engage in the undesirable behavior until he is sick of it.
3. Wait for the unwanted behavior to dissolve over the course of development.
Schedules of Reinforcement
A. When a response is always followed by reinforcement, it is called continuous reinforcement. Such a response, once learned, is easy to extinguish.
B. When the occurrence of reinforcement is probabilistic, it is termed partial reinforcement, and the response is difficult to extinguish. During partial reinforcement superstitious behaviors arise: an animal behaves peculiarly to get reinforcement when it is not being received.
Ratio Schedules
1. Reinforcement that occurs after every nth response is called a fixed ratio schedule. For example, when the rat must press the bar 5 times to get food, it is on an FR5 schedule.
2. Reinforcement that occurs after an average of n responses is known as a variable ratio schedule. Sometimes the reinforcement comes after 3 bar presses, at other times after 8, but the average equals 5 bar presses; abbreviated VR5.
Interval Schedules
3. When reinforcement occurs after a specified interval of time, the schedule is called a fixed interval schedule: the animal gets food after 5 seconds, abbreviated FI5.
4. When reinforcement occurs after an average interval of time, the schedule is called a variable interval schedule: sometimes the rat gets the food pellet after 3 seconds and sometimes after 8 seconds, but the average interval equals 5 seconds (VI5).
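The ratio and interval schedules above can be sketched as small simulations. This is a minimal sketch for illustration only; the press counts, press times, and the uniform draw used for the variable ratio are assumptions, not part of the slides.

```python
import random

def fixed_ratio(n, presses):
    """FR-n schedule: a pellet is delivered after every nth lever press."""
    return presses // n

def variable_ratio(mean_n, presses, rng):
    """VR-n schedule: a pellet after a random number of presses whose
    mean is mean_n (here drawn uniformly from 1 .. 2*mean_n - 1)."""
    pellets, count = 0, 0
    required = rng.randint(1, 2 * mean_n - 1)
    for _ in range(presses):
        count += 1
        if count >= required:
            pellets += 1
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
    return pellets

def fixed_interval(interval_s, press_times):
    """FI-t schedule: the first press after t seconds have elapsed since
    the last pellet earns the next pellet."""
    pellets, next_available = 0, interval_s
    for t in sorted(press_times):
        if t >= next_available:
            pellets += 1
            next_available = t + interval_s
    return pellets

print(fixed_ratio(5, 100))               # FR5: 100 presses earn 20 pellets
print(fixed_interval(5, [1, 6, 7, 12]))  # FI5: only the presses at 6 s and 12 s pay off
```

On an FR5 schedule the pellet count is strictly proportional to the press count, which is one way to see why ratio schedules produce steeper cumulative records than interval schedules, where extra presses within an interval earn nothing.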
Schedules of Reinforcement
Different learning curves emerge with different reinforcement schedules; the curves for ratio schedules are steeper than those for interval schedules. Schedules are classified by sequence (fixed vs. variable) and domain (ratio vs. interval).
Concurrent Schedules
5. Concurrent schedules provide two simultaneous schedules of reinforcement; organisms (pigeons) will distribute their responses according to these schedules (Skinner, 1950).

[Figure: cumulative responses over 30 minutes. The VI5 curve rises more steeply than the VI10 curve.]
Herrnstein's Matching Law
Herrnstein (1970, 1974) showed with a mathematical equation that relative reinforcement equals relative response (behavior):

B1 / (B1 + B2) = R1 / (R1 + R2)

[Figure: for a pigeon pecking red and green keys, relative behavior on the red key plotted against relative reinforcement on the red key falls along the diagonal from 0.0 to 1.0.]
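The matching law can be computed directly. The reinforcement rates below are hypothetical numbers chosen for illustration, not data from the slides.

```python
def matching_law(r1, r2):
    """Herrnstein's matching law: the predicted share of behavior
    allocated to alternative 1, B1/(B1+B2), equals the share of
    reinforcement earned there, R1/(R1+R2)."""
    return r1 / (r1 + r2)

# A pigeon earning 40 reinforcers per hour on the red key and 10 on the
# green key is predicted to direct 80% of its pecks at the red key:
print(matching_law(40, 10))   # 0.8
```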
Simple Choice Behavior
Gratification from rewards can be immediate or delayed, and our simple choice behaviors are dictated by these reinforcements accordingly. Studying brings a delayed reward (gratification with a good grade); going to the movies brings an immediate reward (gratification from seeing the movie).
Concurrent Chain Schedule
6a. Concurrent chain schedules produce complex choice behaviors. Under one condition, pigeons preferred the small, sooner reinforcer (Rachlin & Green, 1972): given a 2-second delay leading to 2 seconds of grain versus a 6-second delay (a 4-second difference) leading to 6 seconds of grain, they chose the shorter delay.
Concurrent Chain Schedule
6b. In the other condition, pigeons preferred the large, delayed reinforcer (Rachlin & Green, 1972): given a 20-second delay leading to 2 seconds of grain versus a 24-second delay (still a 4-second difference) leading to 6 seconds of grain, they chose the longer delay.
Complex Choice Behavior
Thus organisms (human and animal) behave differently toward different rewards. Selection of rewards in a complex choice situation is based on a combination of reward magnitude (how large or small the rewards are) and reward delay (the length of time to reach them).
Progressive Ratio Schedule
7a. The progressive ratio schedule provides a tool to measure the efficacy of a reinforcer. To determine whether one reinforcer is more effective than another, the progressive ratio schedule requires the organism to indicate, in behavioral terms, the maximum it will "pay" for a particular reinforcer.
Progressive Ratio Schedule
7b. The organism is trained on a fixed ratio schedule, say FR2, and receives, say, 5 pellets of food. The schedule is increased to FR4, so now the animal makes 4 responses before it gets 5 pellets of food. The schedule is increased to FR8, and so on. Eventually a schedule is reached (say FR64) at which the animal is not willing to respond for the reinforcement.
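The doubling procedure in 7b can be sketched as a loop. The idea that each reinforcer is "worth" at most a fixed number of presses is a simplifying assumption made here for illustration; it is not part of the slides.

```python
def break_point(max_presses_worth, start_ratio=2):
    """Double the fixed-ratio requirement (FR2, FR4, FR8, ...) until the
    required presses exceed what the reinforcer is worth to the animal,
    and return the last ratio the animal completed (its break point)."""
    ratio, last_completed = start_ratio, 0
    while ratio <= max_presses_worth:
        last_completed = ratio
        ratio *= 2
    return last_completed

# If food is worth at most 50 presses and water up to 90, food "breaks
# down" at a lower schedule (FR32) than water (FR64):
print(break_point(50), break_point(90))   # 32 64
```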
Progressive Ratio Schedule
7c. We can compare two reinforcers (food and water) and determine at which schedule the animal breaks down for each, thus comparing their efficacy. Here food breaks down before water.

[Figure: mean log reinforcement rate against log FR schedule (FR1 to FR512) for reinforcement A (food) and reinforcement B (water); the food curve falls off at a lower ratio.]
Verbal Behavior
Language (verbal behavior) is a behavior like any other and largely consists of speaking, listening, writing, and reading. These behaviors are governed by antecedent conditions (stimuli) and consequences (reinforcements).
Types of Verbal Behavior
1. Mand (from demand or command): a listening or talking behavior. The individual (child) behaves appropriately to the command given by another (adult) and is reinforced. The child may also request (demand) something to relieve a need. The adult says, "Look (mand), I have a toy for you." The child looks (behaves) and is reinforced with the toy (reinforcement).
Types of Verbal Behavior
2. Echoic behavior: a talking behavior. A word or a sentence is repeated verbatim; the echo can be audible or silent, as in reading. The adult says "cookies" (stimulus), the child echoes the word (behavior) and gets a smile (reinforcement).
Types of Verbal Behavior
3. Tact: a talking behavior. A verbal behavior in which an individual correctly names or identifies (tacts) objects (stimuli) and other individuals reinforce them for a correct match. The child says "Flowers" and the adult responds "Good."
Types of Verbal Behavior
4. Autoclitic behavior: a talking behavior. This behavior occurs when a question (stimulus) is posed; the answer to the question is followed by reinforcement (praise). Also called intraverbal behavior. "Which mammal lives in the sea?" "A whale!"
ABC of Verbal Behavior

Type | Antecedent (A) | Behavior (B) | Consequence (C)
Mand | State of deprivation or aversive stimulation | Verbal utterance | Reinforcer that reduces the state of deprivation
Echoic | Verbal utterance from another individual | Repetition of what the speaker says | Conditioned reinforcement (praise) from the other person
Tact | Stimulus (usually an object) in the environment | Verbal utterance naming or referring to the object | Conditioned reinforcement from the other person
Autoclitic | Verbal utterance (often a question) from another person | Verbal response (answer to the question) | Verbal feedback or reinforcement

Based on Skinner (1957)
Programmed Learning
Skinner was interested in applying the theory of learning to education and therefore introduced teaching machines: electromechanical devices that promoted teaching and learning.
Programmed Learning
1. Teaching machines provide sustained activity.
2. They ensure a point is understood before moving on (small steps).
3. They present the learner with material he is ready for.
4. They help the learner find the right answer.
5. They provide immediate feedback.
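The features above (small steps, immediate feedback, advancing only once a point is understood) can be sketched as a simple loop. The frames and answers below are invented for illustration; they are not Skinner's materials.

```python
# Hypothetical frames: (prompt, correct answer), ordered in small steps.
FRAMES = [
    ("Reinforcement makes a behavior ___ likely.", "more"),
    ("Punishment makes a behavior ___ likely.", "less"),
]

def run_machine(frames, answers):
    """Present each frame in order; give immediate feedback after every
    attempt and advance only once the learner gives the right answer."""
    feedback = []
    answers = iter(answers)
    for prompt, correct in frames:
        while True:
            attempt = next(answers)
            if attempt == correct:
                feedback.append("correct")
                break
            feedback.append("try again")
    return feedback

print(run_machine(FRAMES, ["less", "more", "less"]))
# ['try again', 'correct', 'correct']
```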
Learning Theory & Behavior Technology
1. Skinner did not believe in formulating a theory of learning the way Hull did.
2. Behavior should be explained in terms of stimuli, not physiology.
3. Functional analysis of stimuli and behaviors, not the "why" of behaviors, should be the goal of psychology.
4. We need behavior technology to resolve human problems, but our culture, government, and religion erode reinforcements for problem-free behaviors.
David Premack
1. Born October 26, 1925, in Aberdeen, South Dakota.
2. Started working at the Yerkes Primate Biology Laboratory (1954).
3. Wrote Intelligence in Apes and Man (1976), The Mind of an Ape (1983), and Original Intelligence: The Architecture of the Human Mind (2002).
David Premack
4. Emeritus professor of psychology at the University of Pennsylvania.
5. Received the William James Fellow Award (2005).
Premack Principle
Responses (behaviors) that occur at a higher frequency can be used as reinforcers for responses that occur at a low frequency. In other words, high-probability behavior (HPB) can be used to reinforce low-probability behavior (LPB). To increase grooming behavior (LPB), eating behavior (HPB) was used as a reinforcer: each time the animal groomed, it was given the opportunity to eat, and its grooming behavior increased.

[Figure: proportions of eating (HPB) and grooming behavior in the animal before and after the contingency.]
Relativity of Reinforcement
To test his theory in humans, Premack took 31 first graders and gave them a gumball machine and a pinball machine to play with (Phase I). Based on their activity he was able to classify them into eaters and manipulators.
Relativity of Reinforcement
In Phase II, if the child was an eater, he was only allowed to eat if he played the pinball machine: playing behavior increased. If the child was a manipulator, he was only allowed to play if he ate from the gumball machine: eating behavior increased.
Transituational Nature of Reinforcement
A high-probability behavior like eating will become a low-probability behavior once the animal has eaten. Not only does the probability of the behavior change, but the very nature of the reinforcement changes with time: food shifts from rewarding to neutral to punishing (Kimble, 1993).
Disequilibrium Hypothesis
Timberlake (1980) suggests that any activity can become a reinforcer if the activity is blocked in some way. If drinking is blocked, a state of disequilibrium is produced in the animal, and drinking can now be used as a reinforcer.

[Figure: baseline proportions of eating, drinking, and wheel-running activity, with blocked drinking creating a state of disequilibrium.]
Marian Breland Bailey
1. Born Dec. 2, 1920, in Minneapolis, Minnesota.
2. Became the second PhD student under Skinner; moved to Hot Springs and relocated Animal Behavior Enterprises (ABE) there.
3. Studied functional analysis of behavior and taught at Henderson State University.
4. Died Sep. 25, 2001.
Instinctive Drift
When instinctive behavior comes into conflict with conditioned operant behavior, animals show a tendency to drift in the direction of the instinctive behavior.

Marian Breland and Keller Breland trained raccoons to put wooden coins in a box (for a savings-bank commercial), but the raccoons had trouble depositing the coins, especially when there were two coins to deposit. The Brelands argued that the raccoons' instinctive behavior of washing (rubbing) food before eating came into conflict with the learned behavior.
Questions
17. Would you use the same reinforcers to manipulate the behavior of both children and adults? If not, what would make the difference?
18. What is the partial reinforcement effect? Briefly describe the ratio and interval reinforcement schedules studied by Skinner.
19. Explain the difference between Premack's and Timberlake's views of reinforcers.