Chapter 5
Conditioning
Nov 28th, 2012
Learning: Some Key Terms
 Learning: Relatively permanent change in behavior due
to experience
 Does NOT include temporary changes due to disease,
injury, maturation, or drugs, since these do
NOT qualify as learning
 Reinforcement: Any event that increases the probability
that a response will recur
 Response: Any identifiable behavior
 Internal: Faster heartbeat
 Observable: Eating, scratching
Learning: More Key Terms
 Antecedents: Events that precede a response
 Consequences: Effects that follow a response
Classical Conditioning and Ivan Pavlov
 Russian physiologist who initially was
studying digestion
 Used dogs to study salivation when
dogs were presented with meat
powder
 Also known as Pavlovian or
Respondent Conditioning
 Reflex: Automatic, nonlearned, innate
response (e.g., an eyeblink)
 Classical conditioning: A type of
learning in which a neutral stimulus
comes to bring about a response after
it is paired with a stimulus that naturally
brings about that response.
Classical Conditioning
 Neutral stimulus: A stimulus that, before conditioning,
does not naturally bring about the response of interest.
 Unconditioned stimulus (UCS): A stimulus that
naturally brings about a particular response without
having been learned.
 Unconditioned response (UCR): A response that is
natural and needs no training (e.g., salivation at the
smell of food).
 Conditioned stimulus (CS): A once neutral stimulus
that has been paired with an unconditioned stimulus to
bring about a response formerly caused only by the
unconditioned stimulus.
 Conditioned response (CR): A response that, after
conditioning, follows a previously neutral stimulus (e.g.,
salivation at the ringing of a bell).
FIGURE 6.1 In classical conditioning, a stimulus that does not produce a response is paired
with a stimulus that does elicit a response. After many such pairings, the stimulus that
previously had no effect begins to produce a response. In the example shown, a horn precedes
a puff of air to the eye. Eventually, the horn alone will produce an eye-blink. In operant
conditioning, a response that is followed by a reinforcing consequence becomes more likely to
occur on future occasions. In the example shown, a dog learns to sit up when it hears a whistle.
FIGURE 6.2 An apparatus for Pavlovian conditioning. A tube carries saliva from the dog’s
mouth to a lever that activates a recording device (far left). During conditioning, various
stimuli can be paired with a dish of food placed in front of the dog. The device pictured here is
more elaborate than the one Pavlov used in his early experiments.
FIGURE 6.3 The classical conditioning procedure.
Principles of Classical Conditioning
 Acquisition: Training period when a response is
reinforced
 Higher Order Conditioning: A conditioned stimulus is
used to reinforce further learning
 Expectancy: Expectation about how events are
interconnected
 Extinction: Weakening of a conditioned response
through removal of reinforcement
 Spontaneous Recovery: Reappearance of a learned
response following apparent extinction
FIGURE 6.4 Acquisition and extinction of a conditioned response.
FIGURE 6.5 Higher order conditioning takes place when a well-learned conditioned stimulus is
used as if it were an unconditioned stimulus. In this example, a child is first conditioned to
salivate to the sound of a bell. In time, the bell will elicit salivation. At that point, you could clap
your hands and then ring the bell. Soon, after repeating the procedure, the child would learn to
salivate when you clapped your hands.
Principles of Classical Conditioning (cont'd)
 Stimulus Generalization: A tendency to respond to
stimuli that are similar, but not identical, to a conditioned
stimulus (e.g., responding to a buzzer or a hammer
banging when the conditioned stimulus was a bell)
 Stimulus Discrimination: The learned ability to respond
differently to various stimuli (e.g., Paula will respond
differently to alarm, school, and timer bells)
Classical Conditioning in Humans
 Phobia: Intense, unrealistic, irrational fear of a specific
situation or object (e.g., arachnophobia, the fear of spiders;
see the movie!)
 Conditioned Emotional Response: Learned emotional
reaction to a previously neutral stimulus
 Desensitization: Exposing phobic people gradually to
feared stimuli while they stay calm and relaxed
 Vicarious Classical Conditioning: Learning to respond
emotionally to a stimulus by observing another’s
emotional reactions
FIGURE 6.7 Hypothetical example of a CER becoming a phobia. Child approaches dog (a) and
is frightened by it (b). Fear generalizes to other household pets (c) and later to virtually all furry
animals (d).
Operant Conditioning (Instrumental Learning)
 Definition: Learning based on the consequences of
responding; we associate responses with their
consequences
 Law of Effect (Thorndike): The probability of a response
is altered by the effect it has; responses that lead to
desired effects are repeated; those that lead to
undesired effects are not
 Operant Reinforcer: Any event that follows a response
and increases its likelihood of recurring
 Conditioning Chamber (Skinner Box): Apparatus
designed to study operant conditioning in animals
 Response-Contingent Reinforcement: Reinforcement
given only when a particular response occurs
FIGURE 6.8 Assume that a child who is learning to talk points to her favorite doll and says either
“doll,” “duh,” or “dat” when she wants it. Day 1 shows the number of times the child uses each
word to ask for the doll (each block represents one request). At first, she uses all three words
interchangeably. To hasten learning, her parents decide to give her the doll only when she names
it correctly. Notice how the child’s behavior shifts as operant reinforcement is applied. By day 20,
saying “doll” has become the most probable response.
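The shift described in Figure 6.8 can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a model taken from the chapter: the function name `simulate_doll_requests`, the equal starting strengths, the `boost` size, and the number of requests per day are all hypothetical. Each word starts out equally likely, and only saying “doll” is reinforced, which strengthens that response.

```python
import random


def simulate_doll_requests(days=20, requests_per_day=10, boost=0.2, seed=0):
    """Toy model of response-contingent reinforcement (all numbers assumed)."""
    rng = random.Random(seed)
    words = ["doll", "duh", "dat"]
    # Every response starts with the same (hypothetical) strength.
    strength = {word: 1.0 for word in words}
    for _ in range(days):
        for _ in range(requests_per_day):
            weights = [strength[word] for word in words]
            word = rng.choices(words, weights=weights)[0]
            # The parents hand over the doll only for the correct name,
            # so only that response is strengthened.
            if word == "doll":
                strength[word] += boost
    total = sum(strength.values())
    return {word: strength[word] / total for word in words}


if __name__ == "__main__":
    for word, probability in simulate_doll_requests().items():
        print(f"{word}: {probability:.2f}")
```

Under these assumptions, “doll” ends up as the most probable response, while “duh” and “dat” remain equally likely because neither is ever reinforced.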
FIGURE 6.9 The Skinner box. This simple device, invented by B. F. Skinner, allows careful study
of operant conditioning. When the rat presses the bar, a pellet of food or a drop of water is
automatically released.
Timing of Reinforcement
 Operant reinforcement is most effective when given
immediately after a correct response
 Response Chain: A linked series of actions that leads to
reinforcement
 Superstitious Behavior: Behavior that is repeated because
it seems to produce reinforcement, even though it is
actually unnecessary
 Shaping: Molding responses gradually to a desired
pattern
 Successive Approximations: Ever-closer matches to a
desired response pattern
FIGURE 6.10 Reinforcement and human behavior. The percentage of times that a severely
disturbed child said “Please” when he wanted an object was increased dramatically by reinforcing
him for making a polite request. Reinforcement produced similar improvements in saying “Thank
you” and “You’re welcome,” and the boy applied these terms in new situations as well.
FIGURE 6.11 Average number of innings pitched by major league baseball players before and
after signing long-term guaranteed contracts.
FIGURE 6.12 The effect of delay of reinforcement. Notice how rapidly the learning score drops
when reward is delayed. Animals learning to press a bar in a Skinner box showed no signs of
learning if food reward followed a bar press by more than 100 seconds.
Operant Extinction
 Definition: Learned responses that are NOT
reinforced gradually fade away
 Negative Attention Seeking: Using misbehavior to gain
attention
More Operant Conditioning Terms
 Positive Reinforcement: When a response is followed by
a reward or other positive event
 Negative Reinforcement: When a response is followed
by the removal of an unpleasant event (e.g., the bells in
Fannie’s car stop when she puts the seatbelt on) or by
an end to discomfort
 Punishment: Any event that follows a response and
decreases the likelihood of it recurring (e.g., a spanking)
 Response Cost: Removal of a positive reinforcer after a
response is made
FIGURE 6.14 In the apparatus shown in (a), the rat can press a bar to deliver mild electric
stimulation to a “pleasure center” in the brain. Humans also have been “wired” for brain
stimulation, as shown in (b). However, in humans, this has been done only as an experimental
way to restrain uncontrollable outbursts of violence. Implants have not been done merely to
produce pleasure.
Types of Operant Reinforcers
 Primary Reinforcer: Nonlearned and natural; satisfies
biological needs (e.g., food, water, sex)
 Intracranial Stimulation (ICS): Natural primary reinforcer;
involves direct electrical activation of brain’s “pleasure
centers”
 Secondary Reinforcer: Learned reinforcer (e.g., money,
grades, approval)
 Token Reinforcer: Tangible secondary reinforcer (e.g.,
money, gold stars, poker chips)
 Social Reinforcer: Learned desires for attention and
approval
FIGURE 6.16 Reinforcement in a token economy. This graph shows the effects of using tokens
to reward socially desirable behavior in a mental hospital ward. Desirable behavior was defined
as cleaning, making the bed, attending therapy sessions, and so forth. Tokens earned could be
exchanged for basic amenities such as meals, snacks, coffee, game-room privileges, or
weekend passes. The graph shows more than 24 hours per day because it represents the total
number of hours of desirable behavior performed by all patients in the ward.
Feedback and Knowledge of Results
 Definition: Information about the effect of a response
 Knowledge of Results (KR): Informational feedback
Programmed Instruction
 Information is presented in small amounts, gives
immediate practice, and provides continuous feedback.
 Computer-Assisted Instruction (CAI): Learning is aided
by computer-presented information and exercises.
FIGURE 6.17 To sample a programmed instruction format, try covering the terms on the left with
a piece of paper. As you fill in the blanks, uncover one new term for each response. In this way,
your correct (or incorrect) responses will be followed by immediate feedback.
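The programmed instruction format can be sketched in a few lines of code. This is a minimal illustration, not the format used in the figure: the function name `run_frames` and the `FRAMES` list are hypothetical, and the frames simply reuse definitions from these slides. Each frame presents a small amount of information, takes one response, and returns feedback for that response right away.

```python
def run_frames(frames, responses):
    """Check one response per frame and return immediate feedback for each.

    frames: list of (prompt, correct_term) pairs, presented in small steps.
    responses: the learner's answers, one per frame.
    """
    results = []
    for (prompt, correct_term), response in zip(frames, responses):
        is_correct = response.strip().lower() == correct_term.lower()
        # Feedback is recorded per response, mirroring "uncover one
        # new term for each response".
        results.append((prompt, is_correct, correct_term))
    return results


# Hypothetical frames built from definitions given in these slides.
FRAMES = [
    ("A reinforcer follows every correct response: ____ reinforcement",
     "continuous"),
    ("Reinforcers do NOT follow every response: ____ reinforcement",
     "partial"),
    ("Molding responses gradually to a desired pattern: ____",
     "shaping"),
]

if __name__ == "__main__":
    answers = ["continuous", "fixed", "Shaping"]
    for prompt, ok, term in run_frames(FRAMES, answers):
        print(prompt, "->", "correct" if ok else f"incorrect, answer: {term}")
```

The comparison ignores case and surrounding spaces, which is a design choice of this sketch rather than a rule of programmed instruction.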
FIGURE 6.18 Computer-assisted instruction. The screen on the left shows a typical drill-and-practice math problem, in which students must find the hypotenuse of a triangle. The center
screen presents the same problem as an instructional game to increase interest and motivation.
In the game, a child is asked to set the proper distance on a ray gun in the hovering space ship
to “vaporize” an attacker. The screen on the right depicts an educational simulation. Here,
students place a “probe” at various spots in a human brain. They then “stimulate,” “destroy,” or
“restore” areas. As each area is altered, it is named on the screen and the effects on behavior
are described. This allows students to explore basic brain functions on their own.
Partial Reinforcement
 Definition: Reinforcers do NOT follow every response
 Schedules of Reinforcement: Plans for determining
which responses will be reinforced
 Continuous Reinforcement: A reinforcer follows every
correct response
 Partial Reinforcement Effect: Responses acquired with
partial reinforcement are very resistant to extinction
Schedules of Partial Reinforcement
 Fixed Ratio Schedule (FR): A set number of correct
responses must be made to obtain a reinforcer.
 Variable Ratio Schedule (VR): Varied number of correct
responses must be made to get a reinforcer.
 Fixed Interval Schedule (FI): The first correct response
made after a certain amount of time has elapsed is
reinforced; produces moderate response rates.
 Variable Interval Schedule (VI): Reinforcement is given
for the first correct response made after a varied amount
of time.
FIGURE 6.19 Typical response patterns for reinforcement schedules.
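The four schedules of partial reinforcement defined above differ only in the rule that decides whether a given response earns a reinforcer. The sketch below expresses each rule as a small function. It is an illustration under assumptions, not a standard implementation: the function names are hypothetical, intervals are measured from the last reinforcer, and the "varied" requirements are drawn from uniform ranges chosen for simplicity.

```python
import random


def fixed_ratio(n):
    """FR: reinforce every n-th correct response."""
    count = 0

    def respond(time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: the required number of responses varies around mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)  # assumed uniform range

    def respond(time):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response after `seconds` have elapsed."""
    last_reward = 0.0

    def respond(time):
        nonlocal last_reward
        if time - last_reward >= seconds:
            last_reward = time
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait varies around mean_seconds."""
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)  # assumed uniform range

    def respond(time):
        nonlocal last_reward, wait
        if time - last_reward >= wait:
            last_reward = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


if __name__ == "__main__":
    fr3 = fixed_ratio(3)
    print([fr3(t) for t in range(1, 7)])
    # [False, False, True, False, False, True]
    fi10 = fixed_interval(10)
    print([fi10(t) for t in (2, 9, 10, 12, 19, 21)])
    # [False, False, True, False, False, True]
```

On the ratio schedules, responding faster earns reinforcers sooner, whereas on the interval schedules extra responses before the interval ends earn nothing. In this sketch that shows up as the `time` argument being ignored by the ratio functions and the response count being ignored by the interval functions.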
Stimulus Control
 Stimuli that consistently precede a rewarded response
tend to influence when and where the response will
occur
 Operant Stimulus Generalization: Tendency to respond
to stimuli similar to those that preceded operant
reinforcement
 Operant Stimulus Discrimination: Occurs when one
learns to differentiate between the stimuli that signal
either an upcoming reward or a nonreward condition
Punishment
 Punisher: Any consequence that reduces the frequency
of a target behavior
 Keys: Timing, consistency, and intensity
 Severe Punishment: Intense punishment, capable of
suppressing a response for a long period
 Mild Punishment: Weak punishment; usually slows
responses temporarily
Punishment Concepts
 Aversive Stimulus: Stimulus that is painful or
uncomfortable (e.g., a shock)
 Escape Learning: Learning to make a response to end
an aversive stimulus
 Avoidance Learning: Learning to make a response to
avoid, postpone, or prevent discomfort (e.g., not going to
a doctor or dentist)
 Punishment may also increase aggression
FIGURE 6.21 The effect of punishment on extinction. Immediately after punishment, the rate of
bar pressing is suppressed, but by the end of the second day, the effects of punishment have
disappeared.
FIGURE 6.22 Types of reinforcement and punishment. The impact of an event depends on
whether it is presented or removed after a response is made. Each square defines one possibility:
Arrows pointing upward indicate that responding is increased; downward-pointing arrows indicate
that responding is decreased.
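The four possibilities in Figure 6.22 can be written as a small lookup table. This is a sketch for study purposes: the labels "pleasant"/"unpleasant" and "presented"/"removed", along with the names `CONSEQUENCES` and `classify_consequence`, are hypothetical choices for this example, while the four terms come from the operant conditioning slides.

```python
# Keys: (kind of event, what happens to it after the response).
# Values: (name of the consequence, effect on future responding).
CONSEQUENCES = {
    ("pleasant", "presented"): ("positive reinforcement", "increases"),
    ("unpleasant", "removed"): ("negative reinforcement", "increases"),
    ("unpleasant", "presented"): ("punishment", "decreases"),
    ("pleasant", "removed"): ("response cost", "decreases"),
}


def classify_consequence(event, change):
    """Return (term, effect on responding) for one square of the grid."""
    return CONSEQUENCES[(event, change)]


if __name__ == "__main__":
    # The seatbelt bells stop when Fannie buckles up.
    print(classify_consequence("unpleasant", "removed"))
    # A spanking follows a response.
    print(classify_consequence("unpleasant", "presented"))
```

Note that both kinds of reinforcement increase responding. Negative reinforcement is not punishment, because it removes something unpleasant rather than presenting it.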
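Emotional manipulation
Covert Aggression
Psychopaths: Wolves in sheep’s clothing
Emotional manipulation (cont'd)
The Mood Wrecker: “If I’m miserable, I won’t let you be happy either.”
The Spotlight Lover: “You say you have a headache? You know what? I have
brain cancer!”
The Eager Helper: “After all the help I’ve given you, how are you going to
repay me?”
The Blameless Failure: “Whatever went wrong, it’s not my fault!”
The Put-Down Expert: “Look at you, how did you get so fat! Oh, why are you
angry? I was only joking.”
The Guilt Master: “No matter what you do, you will have let me down...”
The Queen of Hints: “If you loved me, you would know what I mean.”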
Thank you!