10/17/2013
Module 19: Operant Conditioning
© 2013 Worth Publishers

Operant Conditioning

Operant conditioning involves adjusting to the consequences of our behaviors.

How it works: An act of chosen behavior (a “response”) is followed by a reward or punitive feedback from the environment.

Results:
Rewarded behavior is more likely to be tried again.
Punished behavior is less likely to be chosen in the future.

Example:
Response: balancing a ball
Consequence: receiving food
Result: behavior strengthened

Thorndike’s Law of Effect

Edward Thorndike placed cats in a puzzle box; they were rewarded with food (and freedom) when they solved the puzzle. Thorndike noted that the cats took less time to escape after repeated trials and rewards.

Thorndike’s law of effect: behaviors followed by favorable consequences become more likely, and behaviors followed by unfavorable consequences become less likely.

B. F. Skinner: Behavioral Control and the Operant Chamber

B. F. Skinner extended Thorndike’s principles much more broadly. Like Ivan Pavlov, Skinner pioneered more controlled methods of studying conditioning.

The operant chamber includes:
A bar or lever that an animal presses, randomly at first, later for reward
A food/water dispenser to provide the reward

Skinner trained pigeons to play ping pong and to guide a video game missile.
http://youtu.be/vGazyH6fQQ4

Reinforcement

Reinforcement: feedback from the environment that makes a behavior more likely to be done again.

Positive (+) reinforcement: the reward is adding something desirable.
Negative (-) reinforcement: the reward is removing something unpleasant.

Example: This meerkat has just completed a task out in the cold. For the meerkat, this warm light is desirable.

Operant Effect: Punishment

Punishments have the opposite effect of reinforcement. These consequences make the target behavior less likely to occur in the future.
Positive (+) punishment: You ADD something unpleasant/aversive (e.g., spank the child).
Negative (-) punishment: You TAKE AWAY something pleasant/desired (e.g., no TV time, no attention). The MINUS is the “negative” here.

Positive does not mean “good” or “desirable,” and negative does not mean “bad” or “undesirable.”

Consequence Matrix

Stimulus type                              | Add a stimulus          | Remove a stimulus
Appetitive stimulus (something desired)    | Positive reinforcement  | Negative punishment
Aversive stimulus (something not desired)  | Positive punishment     | Negative reinforcement

Reinforcement Types

A primary reinforcer is a stimulus that meets a basic need or otherwise is intrinsically desirable, such as food, sex, fun, attention, or power.
A secondary/conditioned reinforcer is a stimulus that has become associated with a primary reinforcer.

A Human Talent: Responding to Delayed Reinforcers

Dogs learn from immediate reinforcement; a treat five minutes after a trick won’t reinforce the trick.

Delayed reinforcer: a reinforcer that arrives some time after the behavior it rewards, such as a paycheck that comes at the end of a week.

Humans have the ability to link a consequence to a behavior even when the two are separated in time. However, we may be inclined to choose small immediate reinforcers (watching TV) over large delayed reinforcers (feeling alert tomorrow).

Delaying gratification, a skill related to impulse control, enables longer-term goal setting.
http://youtu.be/jQvBrEEYS20

How often should we reinforce?

Reinforcement Schedules

In continuous reinforcement (giving a reward after the target behavior every single time), the subject acquires the desired behavior quickly.
In partial/intermittent reinforcement (giving rewards part of the time), the target behavior takes longer to be acquired/established but persists longer without reward.

Different Schedules of Partial/Intermittent Reinforcement

We may schedule our reinforcements based on an interval of time that has gone by.
Fixed interval schedule: every so often
Variable interval schedule: unpredictably often

We may plan for a certain ratio of rewards per number of instances of the desired behavior.
Fixed ratio schedule: every so many behaviors
Variable ratio schedule: after an unpredictable number of behaviors

Which Schedule of Reinforcement Is This?
Ratio or interval? Fixed or variable?
(FR = fixed ratio, VR = variable ratio, FI = fixed interval, VI = variable interval)

1. Rat gets food every third time it presses the lever: FR
2. Getting paid weekly no matter how much work is done: FI
3. Getting paid for every ten boxes you make: FR
4. Hitting a jackpot sometimes on the slot machine: VR
5. Checking cell phone all day; sometimes getting a text: VI
6. Buy eight pizzas, get the next one free: FR
7. Fundraiser averages one donation for every eight houses visited: VR
8. Kid has tantrum, parents sometimes give in: VR
9. Repeatedly checking mail until paycheck arrives: FI

When Is Punishment Effective?

Punishment works best in natural settings when we encounter punishing consequences from actions such as reaching into a fire. In that case, operant conditioning helps us to avoid dangers.

Punishment is less effective when we try to artificially create punishing consequences for others’ choices. Severity of punishment is not as helpful as making the punishment immediate and certain.

Applying Operant Conditioning to Parenting: Problems with Physical Punishment

Punished behaviors may simply be suppressed, and restart when the punishment is over.
Instead of learning behaviors, the child may learn to discriminate among situations, and avoid those in which punishment might occur.
Instead of behaviors, the child might learn an attitude of fear or hatred, which can interfere with learning. This can generalize to a fear/hatred of all adults or many settings.
Physical punishment models aggression and control as a method of dealing with problems.
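
The four partial reinforcement schedules described earlier in this module can be made concrete with a small simulation. The Python sketch below is only an illustration, not part of the original slides: the function names, the use of a per-response probability for the variable ratio schedule, and the exponentially distributed waits for the variable interval schedule are all assumptions chosen for simplicity. Each function returns a rule that answers "is this response reinforced?" when called with the time of the response.

```python
import random


def fixed_ratio(n):
    """FR: reinforce every n-th response (hypothetical helper)."""
    count = 0

    def schedule(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """VR: reinforce after an unpredictable number of responses.

    Assumption: each response pays off with probability 1/mean_n,
    so rewards average one per mean_n responses.
    """

    def schedule(now):
        return rng.random() < 1.0 / mean_n

    return schedule


def fixed_interval(period):
    """FI: reinforce the first response made once `period` has elapsed."""
    last_reward = 0.0

    def schedule(now):
        nonlocal last_reward
        if now - last_reward >= period:
            last_reward = now
            return True
        return False

    return schedule


def variable_interval(mean_period, rng):
    """VI: like FI, but the required wait changes unpredictably.

    Assumption: waits are drawn from an exponential distribution
    with mean `mean_period`.
    """
    last_reward = 0.0
    wait = rng.expovariate(1.0 / mean_period)

    def schedule(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.expovariate(1.0 / mean_period)
            return True
        return False

    return schedule


if __name__ == "__main__":
    # Quiz item 1: the rat gets food every third lever press.
    lever = fixed_ratio(3)
    print([press for press in range(1, 10) if lever(press)])  # [3, 6, 9]
```

In this sketch the ratio schedules ignore the clock and count only responses, while the interval schedules ignore how many responses are made and look only at elapsed time, which mirrors the ratio/interval distinction in the quiz.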
Summary: Types of Consequences

Adding stimuli                              | Subtracting stimuli                          | Outcome
Positive (+) reinforcement (you get candy)  | Negative (-) reinforcement (I stop yelling)  | Strengthens target behavior (you do chores)
Positive (+) punishment (you get spanked)   | Negative (-) punishment (no cell phone)      | Reduces target behavior (cursing)

Uses desirable stimuli: positive reinforcement, negative punishment
Uses unpleasant stimuli: negative reinforcement, positive punishment

Operant vs. Classical Conditioning

If the organism is learning associations between its behavior and the resulting events, it is... operant conditioning.
If the organism is learning associations between events that it does not control, it is... classical conditioning.

Operant and Classical Conditioning Are Different Forms of Associative Learning

Operant conditioning:
involves operant behavior, chosen behaviors which “operate” on the environment
these behaviors become associated with consequences which punish (decrease) or reinforce (increase) the operant behavior

Classical conditioning:
involves respondent behavior, reflexive, automatic reactions such as fear or craving
these reactions to unconditioned stimuli (US) become associated with neutral (then conditioned) stimuli

There is a contrast in the process of conditioning.
Classical: The experimental (neutral) stimulus repeatedly precedes the respondent behavior, and eventually triggers that behavior.
Operant: The experimental (consequence) stimulus repeatedly follows the operant behavior, and eventually punishes or reinforces that behavior.

Associative and Cognitive Learning

Associative learning
Classical conditioning: learning to link two stimuli in a way that helps us anticipate an event to which we have a reaction
Operant conditioning: changing behavior choices in response to consequences

Cognitive learning: acquiring new behaviors and information through observation and information, rather than by direct experience
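
As a recap of the four types of consequences, the two questions "is a stimulus added or removed?" and "is that stimulus desired or not?" can be treated as a simple lookup. The Python sketch below is an illustrative aid rather than anything from the original slides; the function name `classify_consequence` and the string labels are hypothetical.

```python
def classify_consequence(action, stimulus):
    """Name the consequence type and its effect on the target behavior.

    action:   "add" or "remove" (what happens to the stimulus)
    stimulus: "appetitive" (desired) or "aversive" (not desired)
    """
    table = {
        ("add", "appetitive"): ("positive reinforcement", "strengthens"),
        ("remove", "aversive"): ("negative reinforcement", "strengthens"),
        ("add", "aversive"): ("positive punishment", "reduces"),
        ("remove", "appetitive"): ("negative punishment", "reduces"),
    }
    return table[(action, stimulus)]


if __name__ == "__main__":
    # "No cell phone": a desired stimulus is taken away.
    print(classify_consequence("remove", "appetitive"))
```

Note that "positive" and "negative" depend only on the first argument (add vs. remove), while "reinforcement" and "punishment" depend on whether the behavior is strengthened or reduced.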