Making Responses for your Reinforcer
Instrumental Conditioning Procedures
• There are four basic types of instrumental conditioning procedures (see Table 5.1)
• 1. Nature of the outcome controlled by the behavior
– Appetitive stimulus – pleasant outcome (food)
– Aversive stimulus – unpleasant outcome (shock)
• 2. Relationship or contingency between the response “R” and the
outcome “O”
– a. Positive contingency – R produces O
– b. Negative contingency – R eliminates/prevents O
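• Everyday examples of the four procedures
– Positive reinforcement: mow the lawn, receive money
– Punishment: get on the table, get yelled at
– Negative reinforcement: close the window, prevent rain from coming in
– Omission training: stay out late, lose driving privileges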
The Principles of Learning and Behavior, 7e by Michael Domjan
Copyright © 2015 Wadsworth Publishing, a division of Cengage Learning. All rights reserved.
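The two dimensions listed on this slide (type of outcome and type of contingency) can be read as a two-by-two lookup. The following is a minimal Python sketch and is not part of the original slides; the names `Outcome`, `Contingency`, `PROCEDURES`, and `classify` are hypothetical, and the procedure labels are the ones used on the slides that follow.

```python
from enum import Enum


class Outcome(Enum):
    APPETITIVE = "appetitive"  # pleasant outcome, e.g. food
    AVERSIVE = "aversive"      # unpleasant outcome, e.g. shock


class Contingency(Enum):
    POSITIVE = "positive"  # R produces O
    NEGATIVE = "negative"  # R eliminates or prevents O


# (outcome, contingency) -> (procedure name, expected effect on responding)
PROCEDURES = {
    (Outcome.APPETITIVE, Contingency.POSITIVE): ("positive reinforcement", "increase"),
    (Outcome.AVERSIVE, Contingency.POSITIVE): ("punishment", "decrease"),
    (Outcome.AVERSIVE, Contingency.NEGATIVE): ("negative reinforcement", "increase"),
    (Outcome.APPETITIVE, Contingency.NEGATIVE): ("omission training", "decrease"),
}


def classify(outcome, contingency):
    """Return (procedure, effect) for one cell of the two-by-two table."""
    return PROCEDURES[(outcome, contingency)]


if __name__ == "__main__":
    # Mowing the lawn produces money: appetitive outcome, positive contingency.
    print(classify(Outcome.APPETITIVE, Contingency.POSITIVE))
    # Closing the window prevents rain: aversive outcome, negative contingency.
    print(classify(Outcome.AVERSIVE, Contingency.NEGATIVE))
```

Keeping the table as data rather than as nested conditionals makes it easy to check each cell against the slide for that procedure.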
Instrumental Conditioning Procedures
• Positive reinforcement
– response produces an appetitive outcome
– positive contingency between the response and an appetitive stimulus
– Experimental Examples
• Rat presses a bar for food
• Pigeon pecks a key for food
– Everyday examples
• Child gets a cookie for putting toys away
• Student gets praise for a good report
– Positive reinforcement will increase responses
Instrumental Conditioning Procedures
• Punishment (Positive Punishment)
– response produces an aversive outcome
– positive contingency between the response and an aversive stimulus
– if the subject performs the response, it receives the aversive outcome
– if the subject does not perform the response, it does not receive the aversive
outcome
– Experimental example
• shock a rat whenever it makes a bar press
– Everyday examples
• parents spank their child for playing in the street
• Robber is put in prison
– Punishment will decrease responses
Instrumental Conditioning Procedures
• Negative reinforcement
– negative contingency between the response and an aversive stimulus
– the response terminates or prevents the delivery of an aversive stimulus
– Experimental Example
• rat could jump over a barrier to prevent getting shocked
– Everyday examples
• students can study before an exam to avoid getting a bad grade
• Look both ways before crossing a street to prevent getting squished
– Negative reinforcement will increase responses
Instrumental Conditioning Procedures
• Negative Punishment “Omission Training”
– negative contingency between the response and an appetitive stimulus
– the response prevents the delivery of a pleasant event
– Experimental example
• negative automaintenance: start with a sign-tracking procedure
• then cancel food access if a key peck occurs while the response key is
illuminated
– Everyday examples
• a child is told to go to his room when he does something bad
• suspending a driver’s license for impaired driving
– Omission Training will decrease responses
• Differential Reinforcement of Other behaviors (DRO)
– Positive reinforcement in addition to omission training
– The subject can receive the reward for engaging in other behaviors
– Used in clinical settings to reduce self-injurious behavior
– See Figure 5.7
FIGURE 5.7
Rate of Bridget’s self-injurious behavior during baseline sessions (1–19 and 25–31) and during
sessions in which a DRO contingency was in effect (20–24 and 32–72) (based on Lindberg et
al., 1999).
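One way to make the DRO contingency concrete is a short Python sketch. It assumes an interval-based rule: a reinforcer is delivered at the end of every fixed window in which the target behavior did not occur. The function name `dro_reinforcer_times`, the window length, and the event times are hypothetical, and this is not a description of the procedure Lindberg et al. used.

```python
def dro_reinforcer_times(target_times, session_length, interval):
    """Return the times at which a reinforcer is delivered under a simple DRO rule.

    A reinforcer is delivered at the end of every `interval`-second window
    that contains no target response. Windows are fixed and non-overlapping;
    a trailing partial window earns nothing.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    deliveries = []
    start = 0
    while start + interval <= session_length:
        end = start + interval
        # The window is spoiled if any target response falls inside it.
        occurred = any(start <= t < end for t in target_times)
        if not occurred:
            deliveries.append(end)
        start = end
    return deliveries


if __name__ == "__main__":
    # Hypothetical 60-second session with target responses at 5 s and 32 s.
    print(dro_reinforcer_times([5, 32], session_length=60, interval=10))
```

The sketch uses fixed windows for simplicity; a resetting variant, in which each target response restarts the timer, is another common way to arrange the same contingency.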
Fundamental Elements of Instrumental Conditioning
• involves three key elements
– an instrumental response (R)
– an instrumental reinforcer, the outcome (O)
– a relation, or contingency, between R and O
The Instrumental Response
• Instrumental response is the behavior producing the outcome
• Most of the examples will involve positive reinforcement
• Negative reinforcement and punishment are covered in chapter 10
• Characteristics of the response determine how easy it is to modify
the response with reinforcers, as explained in the following
subsections
– Behavioral Variability versus Stereotypy
– Detrimental Effects of Rewards
– Relevance, or Belongingness
– Behavior Systems and constraints on Instrumental Conditioning
Behavioral Variability versus Stereotypy
• Behavioral stereotypy vs. behavioral variability
– Stereotyped responding
– Thorndike: "stamping in" the behavior
– Skinner: "reinforced" behavior
• Although reinforcement often leads to repetitive, even stereotyped
responding, this is not a necessary outcome.
• Does instrumental conditioning produce a stereotyped response?
– Depends on the training requirements
– Neuringer 1985, required variability as part of the instrumental response
• Draw rectangles to get points
• One group unknowingly got points for variable rectangle shapes
• Yoked group got points whenever the variable group got points
– See Figure 5.8
FIGURE 5.8
Degree of response variability along three dimensions of drawing a rectangle (size, shape, and location)
for human participants who were reinforced for varying the type of rectangles they drew (VARY) or
received reinforcement on the same trials but without any requirement to vary the nature of their
drawings (YOKED). Higher values of U indicate greater variability in responding (based on Ross &
Neuringer, 2002).
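The caption does not define U. In the response-variability literature, U is commonly computed as a normalized entropy of the response distribution, and the Python sketch below assumes that definition. The function name `u_value` and the example categories are hypothetical, and the figure itself may rest on a different binning of sizes, shapes, and locations.

```python
import math
from collections import Counter


def u_value(responses, n_categories):
    """Normalized entropy of a response sequence, between 0 and 1.

    U = sum(-p_i * log2(p_i)) / log2(n_categories), where p_i is the
    relative frequency of category i. Unused categories contribute 0.
    """
    if n_categories < 2:
        raise ValueError("need at least two possible categories")
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    if len(counts) > n_categories:
        raise ValueError("more observed categories than n_categories")
    total = len(responses)
    entropy = sum(-(c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(n_categories)


if __name__ == "__main__":
    # Hypothetical rectangle sizes binned into four categories.
    varied = ["small", "medium", "large", "huge"] * 5
    stereotyped = ["medium"] * 20
    print(u_value(varied, n_categories=4))
    print(u_value(stereotyped, n_categories=4))
```

Under this definition, a participant who uses every category equally often has U = 1, and a participant who always repeats one category has U = 0.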
Relevance, or Belongingness
• Certain responses naturally ‘belong with’ the reinforcer because of
the animal’s evolutionary history
• Thorndike used belongingness to explain why he
– could not train yawning or scratching
– could easily train latch operation or string pulling
• Just as not all CSs are equally associable with all USs, not all
responses are equally conditionable with all reinforcers
• Breland & Breland: “The Misbehavior of Organisms”
– trained animals for entertainment, for example to drop a coin in a bank
– Instinctive drift
• gradual movement away from the operant behavior that produces the
reinforcer
• behavior drifts toward more species-typical, or "instinctive," responses related to the
reinforcer or other stimuli in the experimental situation
Behavior Systems and constraints on Instrumental Conditioning
• Behavior systems covered in chapter four
– Example of wolves foraging
– Hungry animals engage in food-related behaviors
– Most instrumental conditioning is with hungry animals
– Each type of animal has a specific set of foraging behaviors
• Birds peck
• Rats handle food pieces
• Shettleworth (1975) rewarded hamsters with food for a number of
different behaviors, such as digging or face-washing.
– some responses are more relevant to food reward than others
– behaviors such as digging increase the chances of coming in contact with food
– face-washing does not increase the chances of coming in contact with food;
it may even interfere with food-related behaviors
The Instrumental Reinforcer
• Characteristics of the reinforcer influence the learning of instrumental
responses
• Increases in the quantity or quality of the reinforcer can increase
the rate of responding
• Relationship between quantity of reinforcer and rate of responding
will not be linear
– There is subjective value that mediates the relationship
– Value of reinforcers and quantity of reinforcer are not the same thing
• Quality of reinforcer is difficult to control or measure
– Sweetness of food
– Verbal praise
• Quality and quantity of reinforcers interact
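A small numeric illustration may help show why the relationship between quantity and responding need not be linear when subjective value mediates it. The power function and its exponent below are purely illustrative assumptions, not taken from the slides or the textbook; they only stand in for some diminishing-returns mapping from quantity to subjective value. The name `subjective_value` is hypothetical.

```python
def subjective_value(quantity, exponent=0.5):
    """Hypothetical concave mapping from reinforcer quantity to subjective value."""
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    if not 0 < exponent <= 1:
        raise ValueError("exponent must be in (0, 1]")
    return quantity ** exponent


if __name__ == "__main__":
    # Each doubling of the quantity adds less than a doubling of the value.
    for pellets in (1, 2, 4, 8):
        print(pellets, round(subjective_value(pellets), 2))
```

With the assumed exponent of 0.5, doubling the quantity from 4 to 8 raises the value by a factor of about 1.41 rather than 2, which is the sense in which value and quantity are not the same thing.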
Quantity and Quality of Reinforcer
• In runway experiments animals will run faster for "bigger" reward
– This outcome is as expected, but
• the increase in responding across increases in the quantity of the reinforcer is
not linear
• nor is it linear across changes in the quality of the reinforcer
• and one would expect quantity and quality to interact
• For operant procedures
– quantity and quality of reinforcer will influence amount of responding
• increase in responding with increased quantity and quality
• However, getting a large quantity of food with one response will not maintain
responding
• OR getting a very small quantity of food for many responses will not maintain
responding
Quantity and Quality of Reinforcer
• Study with autistic child using social attention as a reinforcer
– magnitude of the reinforcer varied by duration of social attention
• Either 10, 105 or 120 seconds
– Required to press a button to get attention
• number of responses required to get the reinforcer changed progressively from 1
through 40
• In a progressive ratio schedule, the lower ratio must be completed before moving
to the next higher ratio
• See Figure 5.9
• Magnitude of the reinforcer in drug abstinence programs
– drug addicts given vouchers for not using drugs
– larger voucher amounts were more effective in reducing drug use
– getting the voucher immediately after the test for $10 or more was most
effective
FIGURE 5.9
Average number of reinforcers earned by Chad per session as the response requirement was
increased from 1 to 40. (The maximum possible was two reinforcers per session at each response
requirement.) Notice that responding was maintained much more effectively in the face of increasing
response requirements when the reinforcer was 120 seconds long [from Trosclair-Lasserre et al.
(2008), Figure 3, p. 215].
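The progressive ratio procedure behind this figure can be sketched in a few lines of Python. The slides state only that the requirement rose from 1 to 40 with at most two reinforcers per requirement; the intermediate step values, the idea that a subject has a fixed maximum number of responses it will emit per reinforcer, and the name `progressive_ratio_session` are all hypothetical simplifications.

```python
def progressive_ratio_session(requirements, max_responses_per_reinforcer, per_step=2):
    """Simulate one progressive-ratio session.

    `requirements` lists the response requirements in ascending order.
    The (hypothetical) subject completes a requirement only if it does not
    exceed `max_responses_per_reinforcer`. Each requirement can be earned
    `per_step` times before the schedule moves on; the session ends at the
    first requirement the subject fails to complete.
    Returns (earned, breakpoint_): reinforcers earned per requirement and
    the last requirement completed (None if none was).
    """
    earned = {}
    breakpoint_ = None
    for req in requirements:
        if req > max_responses_per_reinforcer:
            earned[req] = 0
            break
        earned[req] = per_step
        breakpoint_ = req
    return earned, breakpoint_


if __name__ == "__main__":
    steps = [1, 2, 5, 10, 20, 40]  # hypothetical step values between 1 and 40
    # A larger reinforcer is modeled here as a higher willingness to respond.
    print(progressive_ratio_session(steps, max_responses_per_reinforcer=12))
    print(progressive_ratio_session(steps, max_responses_per_reinforcer=40))
```

Under these assumptions, a subject modeled with a threshold of 12 responses stops earning reinforcers once the requirement reaches 20, while a subject with a threshold of 40 earns both reinforcers at every step.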
Shifts in Reinforcer Quality or Quantity
• Responding to a particular reinforcer also depends on past experience
with reinforcers
• Past experience with reinforcers results in expectations with regard to
Response (R) - Outcome (O) contingencies
• Behavioral Contrast
– Positive behavioral contrast
• Shift from small to large reward
– Negative behavioral contrast
• Shift from large to small reward
• Ortega study
– Shifted from 32% sucrose to 4% sucrose
– Shifted subjects under-responded to the 4% sucrose
– See Figure 5.10
– Anticipatory negative contrast
• Drugs of abuse such as cocaine are reinforcing
– they will produce a conditioned place preference
– however, they will not produce conditioned drinking of a sweet solution
– suppressed drinking of a sweet solution that precedes cocaine is an anticipatory
negative contrast effect
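The comparison behind positive and negative behavioral contrast can be written as a short Python sketch. It assumes that over- or under-responding is judged against a control group that received the post-shift reward throughout; the slides do not spell out the comparison group, so that is an assumption here. The name `contrast_effect` and the response rates are hypothetical.

```python
def contrast_effect(shifted_rate, unshifted_rate, shift, tolerance=0.0):
    """Label a successive contrast effect.

    `shifted_rate` is the post-shift response rate of the group whose reward
    changed; `unshifted_rate` is the rate of a control group that always
    received the post-shift reward. `shift` is "up" (small to large reward)
    or "down" (large to small reward).
    """
    if shift not in ("up", "down"):
        raise ValueError("shift must be 'up' or 'down'")
    diff = shifted_rate - unshifted_rate
    if shift == "up" and diff > tolerance:
        return "positive contrast"
    if shift == "down" and diff < -tolerance:
        return "negative contrast"
    return "no contrast"


if __name__ == "__main__":
    # Hypothetical lick counts after a 32% -> 4% sucrose downshift,
    # compared with a group that received 4% sucrose all along.
    print(contrast_effect(shifted_rate=120, unshifted_rate=200, shift="down"))
```

The `tolerance` parameter is a placeholder for whatever criterion an analysis would use to decide that two rates really differ; a real study would use a statistical test instead.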