Burrhus Frederic Skinner (1904 - 1990)
Chapter 5

1. Born Mar. 20, 1904, in Susquehanna, Pennsylvania.
2. Received his PhD from Harvard (1931).
3. Wanted to become a writer but was disappointed to learn that he had nothing to write about; he became a great psychologist instead.
4. Wrote The Behavior of Organisms (1938) and Walden Two (1948), the latter titled after Thoreau's Walden.
5. Taught at the University of Minnesota (1936-45).
6. Chair at Indiana University (1945-48).
7. Returned to Harvard (1948-90).
8. Beyond Freedom and Dignity (1971).
9. About Behaviorism (1976).
10. Upon Further Reflection (1987).
11. Continued to publish to the end of his life in journals like Analysis of Behavior (1989).
12. Great contributions to learning and education.
13. Contributions to child development.
14. Project ORCON (ORganic CONtrol).
15. Died in 1990.

Comparison
Operant conditioning (Skinnerian, or Type R, conditioning): the reinforcing stimulus is contingent upon a response. Sequence: S -> R -> S (food).
Respondent conditioning (classical, Pavlovian, or Type S conditioning): the reinforcing stimulus is contingent upon a stimulus. Sequence: S -> S (food) -> R.

Comparison Continued
Operant conditioning: responses are emitted to a known reinforcer. Conditioning strength = rate of response.
Respondent conditioning: responses are elicited by a known stimulus. Conditioning strength = response magnitude.

Theoretical Differences
Functionalists (Edward Thorndike, Burrhus Skinner) concentrated on responses as they brought about consequences: S -> R -> S.
Associationists (Ivan Pavlov, Edwin Guthrie) concentrated on stimuli as they brought about responses: S -> S -> R.

Radical Behaviorism
1. Behavior cannot be explained on the basis of drive, motivation and purpose.
All of these take psychology back to its mentalistic nature.
2. Behavior has to be explained on the basis of consequences (reinforcements, punishments) and environmental factors. This, Skinner proposed, was the backbone of all scientific psychology.

Principles of Operant Learning
1. We need to know what is reinforcing for the organism. How can we find a reinforcer? Finding one is a process of selection and can be difficult. Reinforcers related to bodily conditions, like food and water, are easy to determine.
2. This reinforcement will predict the response.
3. Reinforcement increases the rate of responding.

Operant Chambers
Skinner devised operant chambers for rats and pigeons to study behavior in a controlled environment. Operant chambers provide opportunities to control reinforcements and other stimuli.

Magazine Training
1. At the beginning of this training the rat is deprived of food for 23 hours (deprivation is a procedure) and placed in the operant chamber.
2. The experimenter presses a hand-held switch, which makes a clicking sound (secondary reinforcer), and a food pellet (primary reinforcer) drops into the food magazine.
3. The rat learns to associate the clicking sound with the food pellet.
4. To train the rat to come to the food magazine and eat, the experimenter presses the switch when the rat is near the food magazine. After a few trials the rat associates the clicking sound with the arrival of food and stays close to the magazine to eat.
[Figure: operant chamber showing the lever, food pellet, and food magazine]

Shaping
1. To train the rat to press the lever and get food, the experimenter shapes the rat's behavior. Shaping involves reinforcing the rat (with the secondary reinforcer) for behaviors that approximate the target behavior, i.e., coming closer and closer to the lever and finally pressing it. This procedure is called successive approximation.
2. To shape lever-pressing behavior, differential reinforcement can also be used.
In this procedure only lever-pressing behaviors are reinforced, not others.

Cumulative Recording
[Figure: cumulative record; x-axis: time (paper movement); y-axis: cumulative responses; labels: operant level, one response, second response]

Responding Rate
A slow rate of responding produces a shallow trace; a rapid rate of responding produces a steep trace.
[Figure: cumulative responses over time for slow and rapid responding]

Cumulative Responses: Sniffy
[Figure: three cumulative records of 75 responses each]

Extinction
S (lever) -> R (lever-pressing response) -> S (food)
Remove the reinforcement (food) and the lever-pressing behavior is extinguished.
[Figure: cumulative responses over time during extinction (no food); responding falls back to the operant level]

Spontaneous Recovery
Just as we have spontaneous recovery in classical conditioning, a rest period after extinction reinstates the lever-pressing response in the animal.
[Figure: behavior (cumulative responses, 0-60) across trials (0-30), showing extinction and rest followed by spontaneous recovery]

Discrimination Learning
The organism can be conditioned to discriminate between two or more stimuli. A discriminative operant is a response that is emitted specifically to one stimulus (SD) but not the other (SΔ).
Discriminative stimulus: light ON (SD). Response: press lever. Reinforcement: food.
Discriminative stimulus: light OFF (SΔ). Response: lever not pressed. Reinforcement: no food.

Secondary Reinforcement
"Any neutral stimulus paired with a primary reinforcer (e.g., food or water) takes on reinforcing properties of its own" (Hergenhahn and Olson, 2001) and is called a secondary reinforcer. Thus, all discriminative stimuli are secondary reinforcers.

Generalized Reinforcers
1. A secondary reinforcer can become a generalized reinforcer when paired with a number of primary reinforcers. Money, then, is a generalized reinforcer, for it is associated with primary reinforcers like food, drink and mates.
2. A secondary reinforcer is similar to Allport's (1961) idea of functional autonomy.
First there is activity for reinforcement, but then the activity by itself becomes reinforcing, e.g., someone who joined the merchant navy for money but now enjoys sailing for its own sake.

Chaining
A discriminative stimulus (SD) initiates a response (R), whose outcome serves both as a secondary reinforcer (SR) and as the discriminative stimulus (SD) for the next response, and so on until the final response (R) is followed by primary reinforcement.
SD (many stimuli) -> R (orients) -> SD/SR (sight of lever) -> R (approaches) -> SD/SR (contact with lever) -> R (presses bar) -> SR (food pellet)
Similar to Guthrie's movement-produced stimuli.

Reinforcement & Punishment
If a response is followed by a reinforcer, the response increases. However, if it is followed by a punisher, the response decreases.

Reinforcement
Primary reinforcer, positive contingency. Example: doing work, getting food. Behavior: work increases.
Secondary reinforcer, positive contingency. Example: studying books, getting good grades. Behavior: studying increases.
Primary reinforcer, negative contingency. Example: heater proximity avoids cold. Behavior: heater proximity increases.
Secondary reinforcer, negative contingency. Example: waking early, avoiding traffic. Behavior: waking early increases.

Punishment
Primary punisher, positive contingency. Example: work with electricity, get a shock. Behavior: work with electricity decreases.
Secondary punisher, positive contingency. Example: insult the boss, get reprimanded. Behavior: insulting the boss decreases.
Primary punisher, negative contingency. Example: quarrelsome behavior, lose food. Behavior: quarrelsome behavior decreases.
Secondary punisher, negative contingency. Example: coming home late, no going out. Behavior: coming home late decreases.

Consequences & Contingencies
Reinforcement (consequence): behavior increases under both positive and negative contingencies.
Punishment (consequence): behavior decreases under both positive and negative contingencies.
Like Thorndike, Skinner believed that positive reinforcement strengthened behavior but punishment did not weaken behavior.

Estes's Punishment Experiment
[Figure: cumulative responses (0-500) across extinction sessions 1-3 for two conditions: no reinforcement, and no reinforcement + punishment]

Punishment
1. Unwanted emotional byproducts (generalized fears).
2. Conveys no information to the organism.
3. Justifies inflicting pain on others.
4.
Unwanted behaviors reappear in its absence.
5. Aggression towards the agent.
6. One unwanted behavior appears in place of another.

Why punishment? It reinforces the punisher!

Alternatives to Punishment
1. Do not reinforce the unwanted behavior.
2. Let the individual engage in the undesirable behavior until he is sick of it.
3. Wait for the unwanted behavior to dissolve over development.

Schedule of Reinforcement
A. When a response is always followed by reinforcement, it is called continuous reinforcement. After learning, such a response is easy to extinguish.
B. When the occurrence of reinforcement is probabilistic, it is termed partial reinforcement, and the response is difficult to extinguish. During partial reinforcement superstitious behaviors arise: an animal behaves in peculiar ways to get reinforcement when it is not being received.

Ratio Schedules
1. A fixed ratio schedule delivers reinforcement after every nth response. For example, when the rat must press the bar 5 times to get food, it is on an FR5 schedule.
2. A variable ratio schedule delivers reinforcement after an average of n responses. Sometimes the reinforcement comes after 3 bar presses and at other times after 8, but the average number of bar presses equals 5. Abbreviated as VR5.

Interval Schedules
3. A fixed interval schedule delivers reinforcement for the first response made after a specified interval of time. For example, the animal gets food once 5 seconds have passed. Abbreviated as FI5.
4. A variable interval schedule delivers reinforcement after an average interval of time. Sometimes the rat gets the food pellet after 3 seconds and sometimes after 8 seconds, but the average time interval equals 5 seconds (VI5).

Schedules of Reinforcement
Different learning curves emerge with different reinforcement schedules. For ratio schedules they are steeper than for interval schedules.
[Table: schedules classified by sequence (fixed or variable) and domain (ratio or interval)]

Concurrent Schedules
5.
Concurrent schedules provide two simultaneous schedules of reinforcement; organisms (pigeons) distribute their responses according to these schedules (Skinner, 1950).
[Figure: behavior (cumulative responses) over time (minutes, 0-30) under concurrent VI 5 and VI 10 schedules]

Herrnstein Matching Law
Herrnstein (1970; 1974) showed with a mathematical equation that relative behavior (response) equals relative reinforcement:
\[ \frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2} \]
Here B_1 and B_2 are the responses made to the two alternatives (the red and green keys) and R_1 and R_2 are the reinforcements obtained from them.
[Figure: relative behavior on the red key (0.0-1.0) plotted against relative reinforcement on the red key (0.0-1.0)]

Simple Choice Behavior
Gratification from rewards can be immediate or delayed. Our simple choice behaviors are dictated by these reinforcements accordingly.
Delayed reward: studying leads to gratification with a good grade.
Immediate reward: going to the movies leads to gratification by seeing a movie.

Concurrent Chain Schedule
6a. Concurrent chain schedules produce complex choice behaviors. Under one condition, pigeons preferred the small, sooner reinforcer (Rachlin & Green, 1972).
Smaller, sooner option: delay of 2 seconds, reinforcement of 2 sec of grain.
Larger, later option: delay of 6 seconds, reinforcement of 6 sec of grain.
Difference in delay: 4 seconds.
6b. In the other condition, pigeons preferred the large, delayed reinforcer (Rachlin & Green, 1972).
Smaller, sooner option: delay of 20 seconds, reinforcement of 2 sec of grain.
Larger, later option: delay of 24 seconds, reinforcement of 6 sec of grain.
Difference in delay: 4 seconds.

Complex Choice Behavior
Thus organisms (human and animal) behave differently to different rewards. Selection of rewards in a complex choice situation is based on a combination of reward magnitude (how large or small they are) and reward delay (length of time to reach them).

Progressive Ratio Schedule
7a. The progressive ratio schedule provides a tool to measure the efficacy of a reinforcer. To determine whether one reinforcer is more effective than another, the progressive ratio schedule requires the organism to indicate in behavioral terms the maximum it will "pay" for a particular reinforcer.
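As a rough illustration of the Herrnstein matching law stated earlier, the sketch below computes the relative reinforcement on one key and the split of responses the law would predict. The function names and the session counts (30 and 10 reinforcers, 400 responses) are hypothetical and chosen only to keep the arithmetic simple; they are not data from Herrnstein's experiments.

```python
def relative_share(first: float, second: float) -> float:
    """Return first / (first + second), e.g. R1 / (R1 + R2)."""
    total = first + second
    if total == 0:
        raise ValueError("at least one of the two counts must be non-zero")
    return first / total


def predicted_responses(total_responses: float, r1: float, r2: float) -> tuple[float, float]:
    """Split total responses (B1 + B2) so that B1 / (B1 + B2) = R1 / (R1 + R2)."""
    b1 = total_responses * relative_share(r1, r2)
    return b1, total_responses - b1


# Hypothetical session: 30 reinforcers earned on the red key, 10 on the green key.
b1, b2 = predicted_responses(400, 30, 10)
print(b1, b2)  # 300.0 100.0
```

With these made-up numbers the red key yields 75% of the reinforcers, so matching predicts 75% of the responses on the red key.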
Progressive Ratio Schedule
7b. The organism is trained on a fixed ratio schedule, say FR2, and receives, say, 5 pellets of food. The schedule is increased to FR4, so now the animal makes 4 responses before it gets 5 pellets of food. The schedule is increased to FR8, and so on. Eventually a schedule is reached (say FR64) at which the animal is no longer willing to respond to get the reinforcement.
7c. We can compare two reinforcements (food and water) and determine at which schedule the animal breaks down for each, thus comparing their efficacy. Food breaks down before water.
[Figure: mean log reinforcement rate (0-16) against log FR schedule (1-512) for reinforcement A (food) and reinforcement B (water)]

Verbal Behavior
Language (verbal behavior) is a behavior like any other and largely consists of speaking, listening, writing and reading behaviors. These behaviors are governed by antecedent conditions (stimuli) and consequences (reinforcements).

Types of Verbal Behavior
1. Mand (from demand or command): A listening or talking behavior. The individual (child) behaves appropriately to the command given by another (adult) and is reinforced. The child may also request (demand) something to relieve a need. The adult says, "Look (mand), I have a toy for you". The child looks (behaves) and is reinforced with the toy (reinforcement).
2. Echoic Behavior: A talking behavior. A word or a sentence is repeated verbatim. It can be audible or silent, as in reading. The adult says "cookies" (stimulus), the child echoes the word (behavior) and gets a smile (reinforcement).
[Figure: the word "Cookies" echoed audibly and silently]
3. Tact: A talking behavior. A verbal behavior in which an individual correctly names or identifies (tacts) objects (stimuli) and other individuals reinforce the correct match.
[Figure: the child says "Flowers"; the adult answers "Good"]
4. Autoclitic Behavior: A talking behavior.
This behavior (autoclitic) occurs when a question (stimulus) is posed. The answer to the question is followed by reinforcement (praise). Also called intraverbal behavior.
[Figure: "Which mammal lives in the sea?" "A whale!"]

ABC of Verbal Behavior
Mand. Antecedent (A): state of deprivation or aversive stimulation. Behavior (B): verbal utterance. Consequence (C): reinforcer that reduces the state of deprivation.
Echoic. Antecedent (A): verbal utterance from another individual. Behavior (B): repetition of what the speaker says. Consequence (C): conditioned reinforcement (praise) from the other person.
Tact. Antecedent (A): stimulus (usually an object) in the environment. Behavior (B): verbal utterance naming or referring to the object. Consequence (C): conditioned reinforcement from the other person.
Autoclitic. Antecedent (A): verbal utterance (often a question) from another person. Behavior (B): verbal response (answer to a question). Consequence (C): verbal feedback or reinforcement.
Based on Skinner (1957)

Programmed Learning
Skinner was interested in applying the theory of learning to education and therefore introduced teaching machines: electromechanical devices that promoted teaching and learning.
1. Teaching machines provide sustained activity.
2. Ensure a point is understood before moving on (small steps).
3. Present the learner with material he is ready for.
4. Help the learner find the right answer.
5. Provide immediate feedback.

Learning Theory & Behavior Technology
1. Skinner did not believe in formulating a theory of learning the way Hull did.
2. Behavior should be explained in terms of stimuli, not physiology.
3. Functional analysis of stimuli and behaviors should be the goal of psychology, not the "why of behaviors".
4. We need behavior technology to resolve human problems. But our culture, government and religion erode reinforcements for problem-free behaviors.

David Premack
1. Born: October 26, 1925, Aberdeen, South Dakota.
2. Started working at the Yerkes Primate Biology Laboratory (1954).
3. Intelligence in Ape and Man (1976). The Mind of an Ape (1983).
Original Intelligence: The Architecture of the Human Mind (2002).
4. Emeritus professor of psychology at the University of Pennsylvania.
5. William James Fellow Award (2005).
1925-Present

Premack Principle
Responses (behaviors) that occur at a higher frequency can be used as reinforcers for responses that occur at a low frequency. In other words, high-probability behavior (HPB) can be used to reinforce low-probability behavior (LPB).
In order to increase grooming behavior (LPB), eating behavior (HPB) was used as a reinforcer. Each time the animal groomed, it was given the opportunity to eat. Its grooming behavior increased.
[Figure: proportion of behavior in the animal; labels: eating (HPB), grooming (LPB), grooming (HPB)]

Relativity of Reinforcement
Phase I: To test his theory in humans, Premack took 31 first graders and gave them a gumball machine and a pinball machine to play with. Based on their activity he was able to classify them into eaters and manipulators.
Phase II: If the child was an eater, he was only allowed to eat if he played the pinball machine. Playing behavior increased! If the child was a manipulator, he was only allowed to play if he ate from the gumball machine. Eating behavior increased!

Transituational Nature of Reinforcement
A high-probability behavior like eating becomes a low-probability behavior once the animal has eaten. Not only does the probability of the behavior change, but the very nature of the reinforcement changes with time.
[Figure: nature of reinforcement (food) over time: rewarding, neutral, punishing (Kimble, 1993)]

Disequilibrium Hypothesis
Timberlake (1980) suggests that any activity can become a reinforcer if the activity is blocked in some way. If drinking is blocked, a state of disequilibrium is produced in the animal, and drinking can now be used as a reinforcer.
[Figure: state of disequilibrium; proportions (10%, 20%, 30%) of eating, drinking, and activity-wheel behavior]

Marian Breland Bailey
1.
Born Dec. 2, 1920, in Minneapolis, Minnesota.
2. Became the second PhD student under Skinner; moved to Hot Springs and relocated Animal Behavior Enterprises (ABE).
3. Studied functional analysis of behavior and taught at Henderson State University.
4. Died Sep. 25, 2001.
1920-2001

Instinctive Drift
When instinctive behavior comes into conflict with conditioned operant behavior, animals show a tendency to drift in the direction of instinctive behavior.
Marian Breland and Keller Breland trained raccoons to put wooden coins in a box (a commercial for a savings bank), but the raccoons had trouble putting the coins in the box, especially when there were two coins to deposit. The Brelands argued that the raccoons' instinctive behavior of washing (rubbing) food before eating came into conflict with the learnt behavior.

Questions
17. Would you use the same reinforcers to manipulate the behavior of both children and adults? If not, what would make the difference?
18. What is the partial reinforcement effect? Briefly describe the ratio and interval reinforcement schedules studied by Skinner.
19. Explain the difference between Premack's and Timberlake's views of reinforcers.
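As a study aid for question 18, the four basic schedules (FR, VR, FI, VI) described in the slides can be sketched as small functions that report which responses earn reinforcement. This is only an illustrative sketch: the function names are hypothetical, the variable schedules cycle through a fixed list of requirements instead of drawing them at random, and each interval is assumed to be timed from the last reinforced response. The list [3, 8, 4] extends the slides' "3 or 8, averaging 5" example with an assumed third value so that the mean is exactly 5.

```python
def fixed_ratio(n: int, responses: int) -> list[int]:
    """FR n: every nth response is reinforced. Returns the reinforced response numbers."""
    return [r for r in range(1, responses + 1) if r % n == 0]


def variable_ratio(requirements: list[int], responses: int) -> list[int]:
    """VR: the number of responses required changes after each reinforcement.

    The requirements are cycled in order; their mean is the VR value.
    """
    reinforced, count, i = [], 0, 0
    for r in range(1, responses + 1):
        count += 1
        if count == requirements[i % len(requirements)]:
            reinforced.append(r)
            count = 0
            i += 1
    return reinforced


def interval_schedule(intervals: list[float], response_times: list[float]) -> list[float]:
    """FI or VI: the first response after the current interval has elapsed is reinforced.

    Pass one interval for FI (e.g. [5]) or several for VI (e.g. [3, 8, 4]).
    Returns the times of the reinforced responses.
    """
    reinforced, last, i = [], 0.0, 0
    for t in response_times:
        if t - last >= intervals[i % len(intervals)]:
            reinforced.append(t)
            last = t
            i += 1
    return reinforced


print(fixed_ratio(5, 12))                                   # FR5
print(variable_ratio([3, 8, 4], 20))                        # VR5 (mean of 3, 8, 4)
print(interval_schedule([5], [1, 4, 6, 7, 12, 13]))         # FI5
print(interval_schedule([3, 8, 4], [2, 3, 5, 10, 12, 16]))  # VI5
```

Note that on the interval schedules, responses made before the interval has elapsed earn nothing, which is one way to see why ratio schedules tend to produce steeper cumulative records than interval schedules.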