Learning
 A relatively permanent change in behavior brought about by
experience
Watson’s Extreme Environmentalism
 “Give me a dozen healthy infants, well-formed, and my own
specified world to bring them up in and I’ll guarantee to take
any one at random and train him to become any type of specialist I
might select – doctor, lawyer, artist, merchant-chief and,
yes, even beggar-man and thief, regardless of his talents,
penchants, tendencies, abilities, vocations, and race of his
ancestors.”
Learning Theories/Behaviorism
 Habituation
 Pavlov: Classical Conditioning
 Watson
 Skinner: Operant Conditioning
Habituation
 Simplest form of learning
 Repeated exposure to a stimulus results in reduced
responsiveness.
Habituation Procedures
 Used to assess cognitive competence
 Declining interest indicates learning
 Novelty responsiveness indicates discrimination of new versus
familiar
 Older infants habituate faster than younger infants
 Infants of same age require more time to encode complex
stimuli than simple stimuli
Habituation Procedures
 Orienting response: natural attentional response to new
stimulus.
 Habituation: decline in orienting response as initially novel
stimulus becomes familiar.
 Dishabituation: recovery of orienting response when a
habituated stimulus changes.
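The decline and recovery of the orienting response can be sketched numerically. This is a minimal illustration, not a procedure from these slides: the exponential decline, the 10-second starting look, the decay values, the 50% habituation criterion, and the stimulus names are all assumptions chosen for the example.

```python
def looking_time(exposures, initial=10.0, decay=0.7):
    """Looking time (seconds) after `exposures` earlier presentations
    of the same stimulus. Exponential decline is an assumption."""
    return initial * decay ** exposures


def trials_to_criterion(decay, criterion=0.5, max_trials=50):
    """First trial index at which looking time has dropped to
    `criterion` times the first look (hypothetical 50% criterion)."""
    first = looking_time(0, decay=decay)
    for trial in range(max_trials):
        if looking_time(trial, decay=decay) <= criterion * first:
            return trial
    return None


exposure_counts = {}  # how often each stimulus has been shown so far


def present(stimulus):
    """Show a stimulus and return the orienting response it draws."""
    seen = exposure_counts.get(stimulus, 0)
    exposure_counts[stimulus] = seen + 1
    return looking_time(seen)


# Five presentations of a familiar face, then a novel face.
session = ["face_a"] * 5 + ["face_b"]
times = [present(stimulus) for stimulus in session]

print([round(t, 2) for t in times])   # response declines, then recovers
print(trials_to_criterion(decay=0.7))  # slower habituator
print(trials_to_criterion(decay=0.5))  # faster habituator
```

Under these assumed numbers the response to the repeated face shrinks on every trial (habituation) and returns to its starting level for the new face (dishabituation). The series with the faster decay reaches the criterion on an earlier trial, which mirrors the point that older infants habituate faster than younger infants.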
Habituation
Classical Conditioning
 An organism comes to associate one stimulus with another.
 Learning that one event predicts another.
Classical Conditioning
 Pavlov
 Unconditioned stimulus
 Unconditioned response
 Conditioned stimulus
 Conditioned response
Pavlov’s Original Experiment
 http://www.youtube.com/watch?v=hhqumfpxuzI
Classical Conditioning
Example
 You are in your dentist's office. Your dentist is looking at your
x-rays when he gets that far-off look that only dentists can get when
they are looking at x-rays of The Big One!
 He turns to you and, with a half-sadistic, half-empathetic look,
says, "My, my! I don't see cavities like this very often!"
 You hunker down, experiencing the drilling.
 Take notice of any changes in the way your body is reacting. Pay
particular attention to bodily responses (remember, Pavlovian
conditioning is about involuntary responses--those we do not have
control over!).

Example
 In this demonstration of the dentist's
drill:
 Unconditional Reflex
 UCS - drilling
 UCR - pain
 Conditional Reflex
 CS - sound of drill
 CR - pain
Summary of General Principles of
Classical Conditioning
 Any stimulus we can perceive has the potential to become a
conditioned stimulus.
 Any response we make naturally can come to be elicited by any
learned signal.
 These responses can be highly specific and simple (such as a
muscle twitch or part of a brain wave pattern) or general and
complex (such as fear).
 The conditioned response can be a response of our skeletal
muscles or visceral organs or even a "private" response (such as
thoughts and feelings). In other words, the response can be an
overt behavior or reaction or something internal that only you
know is happening.

Summary of General Principles of
Classical Conditioning
 With a powerful original UCS, conditioning may take place in only
one trial in which the UCS is paired with any CS.
 Stimuli quite different from the original CS can control the
appearance of the conditioned response through second-order
conditioning {DEF: transfer of the CR from one CS to another CS.
Once you were salivating at the sight of Pavlov, I could have presented
Pavlov and a light at the same time. Eventually, you would salivate to
just the light}.
 Depending on the strength of the CR and the nature of the
conditioning process, some learned responses resist extinction: they
don't fade away easily and may endure a lifetime. This is good
news if the response is something the person wants to continue
doing, but bad news if the behavior is destructive.
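Acquisition, one-trial conditioning, and extinction can be sketched with a simple error-correction update. The slides do not give a formula; the rule below is an assumption in the spirit of the Rescorla-Wagner model, and the `rate` values, trial counts, and function names are hypothetical.

```python
def update(strength, ucs_present, rate):
    """Nudge the CS-UCS association toward 1.0 when the UCS follows
    the CS, and toward 0.0 when the CS appears alone.
    `rate` stands in for how powerful the UCS is (assumed)."""
    target = 1.0 if ucs_present else 0.0
    return strength + rate * (target - strength)


def run_trials(strength, ucs_present, rate, n_trials):
    """Apply the same kind of trial repeatedly; return the strength
    after each trial."""
    history = []
    for _ in range(n_trials):
        strength = update(strength, ucs_present, rate)
        history.append(strength)
    return history


# Ordinary UCS: the conditioned response builds up over many pairings.
acquisition = run_trials(0.0, ucs_present=True, rate=0.3, n_trials=10)

# CS presented alone afterwards: the response extinguishes.
extinction = run_trials(acquisition[-1], ucs_present=False, rate=0.3,
                        n_trials=10)

# Powerful UCS: a single pairing is nearly enough.
one_trial = run_trials(0.0, ucs_present=True, rate=0.95, n_trials=1)

print([round(v, 2) for v in acquisition])
print([round(v, 2) for v in extinction])
print(round(one_trial[0], 2))
```

In this sketch a response that resists extinction would correspond to a much smaller `rate` on the CS-alone trials, so the strength fades slowly or barely at all.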

Operant Conditioning
 The learning process by which a given behavior is
changed by the consequences of that behavior.
 An organism learns to behave in ways that produce
reinforcement.
 Watson, B. F. Skinner, Thorndike
Watson – Little Albert Experiment
 http://www.youtube.com/watch?v=0FKZAYt77ZM&feature=related
Consequences of Behavior
Reinforcement
 When a consequence causes an increase in the performance of the
behavior on which it is contingent.
Reinforcement
 Positive reinforcement: strengthening of a response whose
consequence is a pleasant event.
 Negative reinforcement: strengthening of a response because
it is followed by removal of an unpleasant event.
Factors affecting Positive
Reinforcement
 Selection of the behavior to be increased
 Choice of the reinforcer
 Immediacy
 Instructions
 Schedules of Reinforcement
Schedules of Reinforcement
 Behavior is not necessarily going to be reinforced
every time it occurs
 In real life, behavior is not often reinforced each
time it occurs
 Intermittent reinforcement refers to
reinforcement that is not administered to each
instance of a response
Advantages of Intermittent
Reinforcement
 Economizing on time and reinforcers
 Building persistent behavior which is much
more resistant to extinction
Types of Reinforcement Schedules
* Continuous reinforcement: each instance of a
response is reinforced
* Intermittent reinforcement: reinforcement is provided after
some, but not all, occurrences of a behavior
* Fixed ratio: a fixed number of responses must occur
before the reinforcer is delivered (e.g., every 5th response)
* Variable ratio: the number of responses required varies
around some average
* Fixed interval: the first response after a fixed length of
time is reinforced (e.g., after 30 seconds)
* Variable interval: the length of time between
reinforced responses varies around an average interval
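The ratio schedules in this list can be sketched as small counters that decide, response by response, whether a reinforcer is delivered. This is an illustrative sketch only: the function names are hypothetical, and drawing the variable-ratio requirement uniformly between 1 and 2 x mean - 1 is an assumption about how "around some average" might be implemented.

```python
import random


def fixed_ratio(n):
    """Return a function that reports True on every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0  # start counting toward the next reinforcer
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """Reinforce after a random number of responses averaging `mean`.
    The uniform draw is an assumption made for this sketch."""
    count = 0
    required = rng.randint(1, 2 * mean - 1)

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


continuous = fixed_ratio(1)   # continuous reinforcement is a ratio of 1
every_fifth = fixed_ratio(5)  # the "every 5th time" example
around_five = variable_ratio(5, random.Random(0))

print([continuous() for _ in range(3)])
print([every_fifth() for _ in range(10)])
print(sum(around_five() for _ in range(1000)))  # roughly 1000 / 5
```

Treating continuous reinforcement as a fixed ratio of 1 keeps the sketch small; the fixed-ratio counter pays out on exactly every fifth response, while the variable-ratio counter pays out unpredictably but at about the same overall rate.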
Schedules of Reinforcement
 Continuous reinforcement refers to reinforcement being
administered to each instance of a response
 Intermittent reinforcement lies between continuous
reinforcement and extinction
An Example of Continuous
Reinforcement
 Each instance of a smile is reinforced
Fixed Ratio Reinforcement
 A fixed number of responses is required for each reinforcement
 Example: every fourth instance of a smile is reinforced
Graph of Fixed Ratio Responding
Fixed Interval Reinforcement
 These schedules require the passage of a specified amount of time
before reinforcement will be delivered contingent on a response
 No response during the interval is reinforced
 The first response following the end of the interval is reinforced
 This schedule usually produces a scalloped pattern of responding
in which little behavior is produced early in the interval, but as
the interval nears an end, the rate of responding increases
 This also produces an overall low rate of responding
Graph of Fixed Interval Responding
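The fixed-interval rule described above (nothing is reinforced during the interval, and the first response after it ends is reinforced) can be traced with a short sketch. The response times and the 30-second interval are made-up values, and restarting the timer at each reinforced response is an assumption; a fixed clock would be another reasonable reading.

```python
def fixed_interval(response_times, interval):
    """Return the response times that earn a reinforcer.

    Assumption: the next interval starts at the moment of the
    reinforced response rather than on a fixed clock.
    """
    reinforced = []
    available_at = interval  # reinforcement first becomes available
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval  # restart the timer
    return reinforced


# Hypothetical response times in seconds.
responses = [5, 12, 29, 31, 33, 58, 64, 70, 95]
print(fixed_interval(responses, interval=30))  # [31, 64, 95]
```

Only three of the nine responses are reinforced, and each one is the first response made after the interval has elapsed. Responses made early in an interval earn nothing, which fits the scalloped pattern in which responding picks up as the end of the interval approaches.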
Variable Schedules of Reinforcement
 Variable schedules differ from fixed schedules in that the
behavioral requirement for reinforcement varies randomly
from one reinforcement to the next
 This usually produces a more consistent pattern of responding
without post-reinforcement pauses
 Variable ratio schedules produce an overall high consistent rate
of responding
 Variable interval schedules produce an overall low consistent
rate of responding
An Example of Variable Ratio
Reinforcement
 Random instances of the behavior are reinforced
Graph of Variable Ratio
Responding
Graph of Variable Interval Responding
Punishment
 When a consequence causes a decrease in the
performance of a contingent behavior
Practice
 Giving a child a time-out: negative punishment
 Community service time – cleaning up garbage: positive punishment
 Applause after outstanding concert performance: positive reinforcement
 You are freezing and put on a jacket: negative reinforcement
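The practice items follow a two-question rule: is something added or removed, and does the behavior increase or decrease? A minimal sketch of that rule is below. The function name, the string labels, and the way each example is encoded are assumptions made for illustration.

```python
def classify(stimulus_change, behavior_change):
    """Name the consequence type.

    stimulus_change: "added" or "removed"
    behavior_change: "increases" or "decreases"
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement",
            "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Each practice item encoded as (stimulus change, behavior change).
practice = {
    "time-out": ("removed", "decreases"),           # access to fun removed
    "community service": ("added", "decreases"),    # unpleasant task added
    "applause": ("added", "increases"),             # pleasant event added
    "putting on a jacket": ("removed", "increases"),  # cold removed
}

for item, (stimulus, behavior) in practice.items():
    print(f"{item}: {classify(stimulus, behavior)}")
```

"Positive" and "negative" here describe only whether a stimulus is added or removed, not whether the outcome is pleasant.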
Shaping
 Process of teaching a complex behavior by rewarding closer
and closer approximations of the desired behavior
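Shaping can be pictured as a reward criterion that moves toward the target each time the learner meets it. The sketch below assumes the behavior can be scored on a number scale; the scores, starting criterion, step size, and function name are all hypothetical.

```python
def shape(attempts, target, start_criterion, step):
    """Reward attempts that meet the current criterion, then raise
    the criterion one step closer to the target behavior.

    Returns a list of (attempt, criterion_in_force, rewarded).
    """
    criterion = start_criterion
    log = []
    for attempt in attempts:
        rewarded = attempt >= criterion
        log.append((attempt, criterion, rewarded))
        if rewarded and criterion < target:
            criterion = min(target, criterion + step)
    return log


# Hypothetical scores: 10 is the full desired behavior.
attempts = [2, 3, 3, 5, 4, 6, 8, 10]
for attempt, criterion, rewarded in shape(attempts, target=10,
                                          start_criterion=2, step=2):
    print(attempt, criterion, "rewarded" if rewarded else "not rewarded")
```

An attempt that would have been rewarded early on (a score of 3 against a criterion of 2) goes unrewarded once the criterion has moved up, which is what pushes behavior toward closer approximations.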
Limitations of Punishment
 Only suppresses existing behaviors
 Potential serious social consequences
 Learned helplessness
 Can lead to aggression/antisocial behaviors
 Only works in presence of punisher
Using Punishment Effectively
 Punish as soon as possible
 Punish with appropriate amount of intensity
 Punish consistently
 Be otherwise warm
 Explain yourself
 Reinforce alternative behavior
 Consider alternative responses to misbehavior