LEARNING SUMMARIES – OPERANT CONDITIONING

In classical conditioning we learn to anticipate important events by recognizing that they are often predicted by other events. This helps us prepare for these crucial times. But, obviously, we need to do more than get ready for threats or the chance to eat or reproduce. We need to know how to avoid threats and how to find food and chances to reproduce. The ability to do things which make us safer, help us find things to eat and drink, and get close to other people develops through another type of learning called operant conditioning. Through this process we actively make choices and do things which result in us getting the things we need, whether food, fun, sex, or money.

The most important concept in operant conditioning is reinforcement, an event which increases the chance that the action which came before it will happen again. For example, if you get $50 every time you walk into IVCC and say hello to the first person you see, you will likely keep saying hello to the first person you see when you walk into IVCC. The $50 reinforces the action/behavior which came just before you received it: saying hello. If getting the $50 makes you feel good, but doesn’t make you say hello the next time you walk into IVCC, it isn’t reinforcement. In other words, reinforcement must have an effect on your behavior.

This concept was first described by Edward Thorndike over 100 years ago. He originated the Law of Effect, which holds that we are more likely to repeat actions which lead to (what we view as) favorable outcomes. In other words, if you are hungry and you learn that asking your Mom for a sandwich will get you one, you are very likely to ask her for one. Similarly, if you know that your coach will play you more (a favorable outcome) if you play good defense, you are more likely to play good defense so you can play more.
Remember that you are actively doing something, playing defense or asking for a sandwich, to get something you want. In classical conditioning you never got the chance to do anything: certain things were going to happen, and conditioning helped you get ready for them, but you were powerless to control, increase or stop them. In operant conditioning, though, you are doing things that change what you will experience.

While Thorndike was the first to discuss these ideas in the study of behavior, it was a psychologist named B. F. Skinner who, in the 1930s and 1940s, really described and explained how operant conditioning works, both for you and me and for birds, mammals, even fish.

B. F. SKINNER & the BASICS of OPERANT CONDITIONING

Skinner is called “the father of operant conditioning”. He invented the most famous way to study it: putting rats and pigeons in a small metallic box he called an “operant chamber”. Everyone else calls it a “Skinner box”. In this box he explored the laws of reinforcement (and punishment) by giving rats little pellets of food if they pulled down on a bar, or giving pigeons pellets if they pecked a designated circle. Since both the pigeons and the rats were quite hungry, they would do anything for the pellets. The pellets were powerful reinforcers because the animals would repeat any actions which caused the delivery of the pellets.

But how did Skinner get the rats/pigeons to press on the bar or peck the circle in the first place? Neither animal engaged in either action when first placed in the Skinner box. He did this by reinforcing actions which came close (and then closer) to the desired behavior, a process he named shaping. For example, when the rat was first placed in the box, Skinner only gave it a pellet when it was in the half of the box where the lever was located. This reinforced that choice by the rat, to stay in the portion of the box close to the lever. Soon the hungry rat spent all of its time in the half of the box near the lever.
Next Skinner only gave the rat a pellet if it was in the quarter of the box near the lever, then only when the rat was next to the lever, then only when the rat faced the lever. Soon a pellet was only given when the rat raised its paw next to the lever, then only when it touched the lever, and finally only when its paw pulled the lever down. Thus, gradually, Skinner shaped the rat’s behavior, causing it to do something it would not have done without the consistent delivery of the reinforcement.

Skinner also taught animals complex behaviors through a process he called chaining. Please look in your text for a detailed description. Through these processes Skinner could teach an animal to perform amazing tricks. These concepts are now used every day, everywhere, to change the behavior of animals and humans, whether in zoos, schools, prisons or homes.

Another way to change behavior is through punishment: anything which will lessen the chance that the action/behavior which came before it will happen again. For example, imagine that Larisha makes a bad pass and Coach Crick pulls her out of the game, providing a consequence. If this decreases the odds that Larisha will throw a bad pass in the future, the consequence, pulling her out of the game, is punishment. If Larisha continues to make bad passes even if Coach Crick takes her out, then pulling her out of the game is not punishment, even if Larisha doesn’t like it. Remember, consequences have to change future behavior patterns to earn the labels punishment or reinforcement.

Punishment can also occur when a painful consequence follows an action or behavior, as long as the painful consequence stops us from engaging in the behavior in the future. For example, walking off trails and through weeds in the summer can cause me to suffer from the severe skin rashes and itching caused by poison ivy. These consequences (itching and rashes) keep me from walking off the trails.
Since they make this behavior less likely to occur, they are punishment for me.

Also, reinforcement can arise from removing a condition we don’t like. For instance, if taking a pill makes pain or sadness go away, we will probably take the pill the next time we are in pain or sad. If we do take the pill when in pain or sad, then we are looking at reinforcement, because the frequency of the behavior has increased as a result of its consequence: taking the pain or sadness away.

Picking Reinforcers

To use the principles of operant conditioning to change behavior we need to find effective reinforcers. This is not as easy as it sounds. Different people are influenced by different things at different times. We will do something for food at noon, but maybe not at 7 a.m. We usually like to play basketball, but maybe not as much after a three-hour practice. How can we select the right reinforcer? There are two ways.

David Premack thought that we can pick an appropriate reinforcer by looking to see what people like to do. For example, he would observe what children did whenever they had free time, and count how much time they spent doing various things. Perhaps a child spent 90% of his time playing video games and just 10% reading. If we want to increase the time the child spends reading, we should only let the child play video games if he has already spent some time reading. This strategy is called the Premack Principle: using the chance to engage in more common behaviors (in this case, playing video games) as reinforcement for performing less common behaviors (reading).

Another strategy claims that allowing someone the chance to return to typical routines will be reinforcing. In other words, if we prevented the child from reading (part of his typical routine) for long enough, he would clean his room for the chance to read. This is called the Disequilibrium Principle.

Also, some reinforcers, such as food, water, and affection, will affect our behavior from the moment we are born.
These are called unconditional reinforcers because we don’t have to learn their value. For some reinforcers, though, we do have to learn their value. These are called conditional reinforcers, and we eventually learn that they can help us get access to unconditional reinforcers. The best example of such a reinforcer is, of course, money, since it can get us food, water, etc.

MORE OPERANT CONDITIONING CONCEPTS

Generalization – if a behavior works for us (we receive reinforcement or avoid punishment) in a certain situation, we are likely to repeat the action in a similar situation. For example, if you scored a lot of points with your jump hook when you played in Turkey, you are likely to use it here at IVCC. This response or behavior (your jump hook) has generalized to a new situation.

Discrimination – if you can use a crossover dribble to get to the basket and score points in Rockford, you are likely to try your crossover here at IVCC. However, if you discover that opposing players are consistently stealing the ball from you when you try your crossover, you will probably stop using it here. You have learned to discriminate between the two different situations.

The same two processes apply off the court. For example, if you tell a joke to some of your teammates and they laugh, you will probably be encouraged to tell the same joke to the rest of your teammates. That behavior (telling the joke) has spread or generalized. However, you might know that your parents, minister, imam or mullah would not think that joke was funny, so you would not share it with them. You have learned to discriminate between the differing situations based upon the likely consequences of telling the joke.

Extinction – if we stop receiving reinforcement after performing actions which were formerly reinforced, we will probably eventually stop performing the actions. When behaviors stop because they are no longer reinforced, we say that the behavior has gone through extinction.
But behaviors, depending on how they were originally reinforced, go through extinction in different ways. To understand these important differences we must carefully describe how reinforcement is administered through various reinforcement schedules.

Reinforcement Schedules

Continuous reinforcement – if reinforcement follows every time the correct behavior is performed, we have a continuous reinforcement schedule. For example, every time we place money in a soda machine we expect to get a soda in return. (Getting paid every week, every two weeks, or twice a month is not continuous reinforcement, since we are not paid for each thing we do at work; it is a fixed interval schedule, described below.)

Intermittent reinforcement – most of the time, we don’t get reinforcement every time we do something. People don’t always say “Thank you” when we help them, or smile back every time we smile at them. Often we know that we will have to make a number of responses (a ratio) or wait some amount of time (an interval) before we get our reinforcement. There are four types of intermittent reinforcement schedules:

1) Fixed ratio: a certain, constant number of responses have to be made before reinforcement is received. Migrant workers must pick a bushel full of peaches before they get paid, not just one peach. A college student must complete a semester of class work before getting credit for the class, not some small amount of credit for every class he/she attends.
2) Variable ratio: an unknown, varying number of responses have to be made before reinforcement follows. When we play a slot machine we don’t know how many times we’ll have to play to hit the jackpot. It might be 5, it might be 500.
3) Fixed interval: we have to wait a certain, consistent amount of time before we get reinforced. In most jobs, we don’t get paid every time we work; we have to wait till Friday, or every other Friday, or maybe twice a month.
4) Variable interval: we have to wait an unknown amount of time before reinforcement follows.
We’re never really sure how long we’ll have to wait before our favorite band comes out with their next CD or our favorite team wins their next championship, so we just wait, while still going to their games or rooting for them while we watch on TV.

Extinction under Intermittent Reinforcement

If we have been continuously reinforced for performing an action in the past, we will quickly stop making the response if reinforcement ends. If the soda machine doesn’t give you a Pepsi or Coke after you put in your 65 cents, you won’t put in another 65 cents. However, if you have been reinforced through an intermittent schedule, you won’t stop responding right away when reinforcement ends. Behaviors reinforced intermittently go through extinction much more slowly, if they go extinct at all. How many people do we know who bet on one of the lotteries every week (or more) though they rarely, if ever, win (receive reinforcement)?
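
For readers who find it easier to see the schedules side by side, here is a minimal Python sketch of the rules described above. It is only an illustration, not something from the course text: the names make_schedule and pattern, the one-response-per-time-unit setup, and the way the variable schedules pick their random requirement are all assumptions made for this example.

```python
import random


def make_schedule(kind, value, rng=None):
    """Return a function respond(t) -> bool for one reinforcement schedule.

    kind  : "continuous", "fixed_ratio", "variable_ratio",
            "fixed_interval", or "variable_interval"
    value : responses required (ratio) or time units to wait (interval);
            for the variable schedules it is only the average
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable

    def draw():
        # Continuous reinforcement always requires exactly one response.
        if kind == "continuous":
            return 1
        # Variable schedules vary around `value` (assumed uniform spread).
        if kind.startswith("variable"):
            return rng.randint(1, 2 * value - 1)
        # Fixed schedules always require exactly `value`.
        return value

    state = {"count": 0, "last": 0, "need": draw()}

    def respond(t):
        """Record one response at time t; return True if it is reinforced."""
        if kind.endswith("interval"):
            progress = t - state["last"]   # time since last reinforcement
        else:
            state["count"] += 1
            progress = state["count"]      # responses since last reinforcement
        if progress < state["need"]:
            return False
        # Requirement met: reinforce, then start counting again.
        state["count"], state["last"] = 0, t
        state["need"] = draw()
        return True

    return respond


def pattern(kind, value, n_responses):
    """Make one response per time unit; 'R' marks a reinforced response."""
    respond = make_schedule(kind, value)
    return "".join("R" if respond(t) else "." for t in range(1, n_responses + 1))


if __name__ == "__main__":
    for kind in ("continuous", "fixed_ratio", "variable_ratio",
                 "fixed_interval", "variable_interval"):
        print(f"{kind:18s} {pattern(kind, 3, 24)}")
```

In the printed patterns, the fixed schedules show the reinforcer arriving at regular, predictable points, while the variable schedules show it arriving at irregular ones, which is the unpredictability the slot machine and lottery examples rely on.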