NSCI 5702 PRINCIPLES OF TRAINING

LECTURE OUTLINE
1. What is training?
2. Why is it useful to train animals?
3. How do animals learn?
4. Training techniques used with animals

WHAT IS TRAINING?
“The shaping of an animal so that it behaves in a way that humans desire.” UFAW (1992)

WHY IS IT USEFUL TO TRAIN ANIMALS?
Husbandry and health purposes
Safety of handler
Treatment of problem behaviour
Human assistance
Entertainment/Education
Enrichment

TRAINING TECHNIQUES USED WITH ANIMALS
1. Classical Conditioning
2. Desensitisation
3. Counter Conditioning
4. Operant Conditioning
   I. Shaping
   II. Positive Reinforcement
   III. Negative Reinforcement
   IV. Combinations of positive and negative reinforcement
5. Flooding
6. Punishment
To understand how to train animals we must first understand how they learn: Learning Theory.

HOW ANIMALS LEARN
There are a number of different forms of learning. We will discuss the following four:
1. Sensitisation
2. Habituation
3. Imprinting
4. Associative Learning
   Classical Conditioning
   Operant Conditioning

SENSITISATION
“The increasing of a response to a repeated stimulus” (Broom and Fraser: Domestic Animal Behaviour and Welfare, 2007)
The animal learns to respond to a stimulus.
Adaptive for survival.
E.g. Gazelles reacting to the sound of a twig breaking (signals approaching predator?)
E.g. A rat that has just experienced an aversive stimulus, such as a bright light, will immediately afterwards be extra sensitive to other cues, such as noises or lights, that it would not normally respond to.

DESENSITISATION
A decrement in response that is produced by gradual exposure to a stimulus that elicits the response.
Commonly used in training.
E.g. Using a tape recording of a particular sound which a dog is fearful of. The tape is played very softly at first and the volume is only gradually increased, in increments designed to elicit no response.
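
The tape-recording example can be written out as a simple session plan. This is only an illustrative sketch, not a training protocol: the volume units are arbitrary, `reacts` is a hypothetical stand-in for the trainer's own observation of the animal, and stepping back one increment after a fear response is an assumption added here to keep each increment below the level that elicits a response.

```python
def desensitisation_schedule(start, target, step, reacts, max_sessions=50):
    """Return the volume played in each session (arbitrary units).

    reacts(volume) is a hypothetical callback: True if the animal shows
    a fear response at that volume, False otherwise.
    """
    volume = start
    played = []
    for _ in range(max_sessions):
        played.append(volume)
        fearful = reacts(volume)
        if fearful:
            # Assumption: the increment was too large, so drop back one step.
            volume = max(start, volume - step)
        elif volume >= target:
            # Target volume reached with no response: stop.
            break
        else:
            # No response: raise the volume by one small increment.
            volume = min(target, volume + step)
    return played


# A calm animal progresses straight to the target volume.
calm = desensitisation_schedule(10, 40, 10, reacts=lambda v: False)

# An animal that reacts above volume 25 is held below that level.
wary = desensitisation_schedule(10, 40, 10, reacts=lambda v: v > 25, max_sessions=5)
```

In the second call the volume never settles above the level that provokes a response, which is the difference between desensitisation and simply playing the sound louder.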

HABITUATION
The animal learns not to respond to irrelevant stimuli.
This decline in response is specific to a given stimulus.
Animals will not habituate to relevant stimuli, e.g. those associated with predators, food or mates.
Advantageous in that it saves energy that would be wasted on repeated responses to trivial stimuli.
E.g. Zoo animals habituate to the presence of visitors.
E.g. Sheep habituate to the sound of passing traffic.

IMPRINTING
Phase-sensitive learning that is rapid and apparently independent of the consequences of behaviour.
Phase-sensitive learning = learning occurring at a particular age or a particular life stage.
E.g. Chicks hatch with an innate tendency to approach and follow their mother. They have already imprinted on her vocalisations. After hatching (24-36 hrs) they imprint on her visual appearance.
Other young animals imprint on olfactory cues from their mothers.
Also has an impact upon the animal's future choice of a sexual partner.
http://en.wikipedia.org/wiki/File:Anas_platyrhynchos_ Boston_Harbor,_Massachusetts,_USA -_parent_and_chicks -8.ogv

ASSOCIATIVE LEARNING
There are 2 types of associative learning:
1. Classical (Pavlovian) Conditioning
2.
Operant (Instrumental) Conditioning

CLASSICAL (PAVLOVIAN) CONDITIONING
An animal learns to associate a conditioned stimulus (bell ringing) with an unconditioned stimulus (food), so that the conditioned stimulus eventually elicits a conditioned response (salivation).
Definitions:
Primary or Unconditioned Stimulus (US) = a stimulus that animals react to without training
Secondary or Conditioned Stimulus (CS) = a stimulus that has been associated with a primary (unconditioned) stimulus
Examples (cue and response: what the cue predicts):
Knock on the door and bark: people
Keys and run to you: leaving/car
Tin opening and cat meowing: food
Hay basket and guinea pig (GP) squeaking: food
http://www.youtube.com/watch?v=WfZfMIHwSkU
Application
In training, a conditioned stimulus is used as a signal given prior to a reinforcer.
E.g. a bridge:
Clicker
Whistle

CLASSICAL CONDITIONING: EXTINCTION
If the food is no longer presented with the bell, the dog salivates less in response to the bell.

OPERANT CONDITIONING
Thorndike (1898) put hungry cats into a ‘puzzle box’ with a lever mechanism that opened a door which led to a food reward.
Thorndike concluded that it was merely a process of trial and error.
The box invoked a series of trial and error voluntary actions.
The cat learnt to press the lever to escape (a rewarding experience).
“A response that is followed by a reward is more likely to recur, whereas one that is followed by an unpleasant experience is less likely to occur again” (Law of Effect).
Learning is the result of associations forming between stimuli and responses.
Associations are weakened or strengthened by the nature and frequency of stimulus-response (S-R) pairings.
The animal learns to associate its own behaviour with a particular outcome. If the outcome is rewarding, e.g. access to food, the animal learns to repeat the behaviour that resulted in food access previously.
Operant conditioning is “the type of learning in which the probability of a behaviour recurring is increased or decreased by the consequences that follow”.
This includes positive/negative reinforcement and positive/negative punishment.
Forms an association between a behaviour (voluntary) and a consequence.

REINFORCEMENT AND PUNISHMENT
Definitions:
Anything that increases a behaviour = Reinforcer
Anything that decreases a behaviour = Punisher
Consequences:
1. Something good can START or be presented = behaviour increases (Positive Reinforcement)
2. Something bad can END or be taken away = behaviour increases (Negative Reinforcement)
3. Something bad can START or be presented = behaviour decreases (Positive Punishment)
4. Something good can END or be taken away = behaviour decreases (Negative Punishment)
Examples
1. Positive Reinforcement (R+): adding something good to increase a behaviour. E.g. food.
2. Negative Reinforcement (R-): removing something bad to increase a behaviour. E.g. elephant training, horse reins.
3. Positive Punishment (P+): adding something bad to decrease a behaviour. E.g. shock collar, physical punishment.
4. Negative Punishment (P-): removing something good to decrease a behaviour. E.g. time out.
Definitions are based on their actual effect on the behaviour in question. They must reduce or strengthen the behaviour (to be defined as a punishment or reinforcer).
Pleasures meant as rewards but that do not strengthen the behaviour are indulgences, not reinforcement.
Aversives meant as behaviour weakeners but which do not weaken the behaviour are abuses, not punishment.

REINFORCERS AND MOTIVATION TO LEARN
Rewards mean different things to different animals (e.g. food, toys, affection, other animals).
You must first establish what motivates the animal. This could be:
Food
Social contact
Toys
Praise
Clicker

REINFORCERS
Primary Reinforcer (e.g. food): a stimulus or event that is inherently rewarding to the animal.
Secondary Reinforcer (e.g. clicker): an initially meaningless stimulus or event that becomes rewarding after repeated association with a primary reinforcer.
http://www.youtube.com/watch?v=hgDHWLyztCI&feature=related

REWARDS
Timing is everything! The reward must occur within 1-2 seconds of the behaviour.
The frequency of the rewards is also important. During training, begin by rewarding every correct behaviour, then gradually switch to intermittent variable rates.
Don’t phase out rewards too quickly.
Rewards help to make training sessions enjoyable and to strengthen the human-animal bond.
http://www.youtube.com/watch?v=bDZCyObMfkA

PUNISHMENT
What is punishment?
“An aversive action or unpleasant sensation (not necessarily physical) applied either during or within one second of a particular behaviour that reduces the likelihood of that behaviour being repeated in the future.”
Differs from negative reinforcement (where the aversive stimulus is applied before the behavioural response).
E.g. Hitting a zebra a few seconds after it has bitten you.

WHY PUNISHMENT MIGHT NOT WORK
1. It can cause pain, fear, anxiety, learned helplessness and stress, which are all welfare concerns.
2. Pain, fear, anxiety, learned helplessness and stress also interfere with the animal's ability to learn and focus.
3. It can intensify the occurrence and severity of behaviour problems.
4.
Difficulty in getting the timing right, causing association with the wrong things.
5. It becomes meaningless (desensitisation).
6. Breakdown in the trainer-animal relationship.

SHAPING
A specific behaviour may be “shaped”.
Involves teaching the desired behaviour pattern one step at a time through operant conditioning.
The animal needs to be rewarded for behaviour that resembles the eventual behavioural goal.
Initially reinforcement is given to an approximation of the behavioural goal.
Reinforcement continues as the animal's behavioural approximations develop to resemble the final behaviour more closely.
Eventually, only the more precise behaviour is rewarded.
http://www.youtube.com/watch?v=g6F0bRTurPk&feature=related

COUNTER CONDITIONING
The animal is encouraged to engage in another behaviour that is more pleasurable and which cannot be performed simultaneously with fear responses in the presence of the triggering stimulus.
E.g. Feeding a vet-phobic giraffe whilst wearing a vet's uniform.

FLOODING
Prolonged exposure to a negatively perceived stimulus at a level that provokes the response, so that the animal eventually gives up.
VERY STRESSFUL AND POTENTIALLY DAMAGING
E.g. Confining a dog in an area and playing the tape at a louder than appropriate level until the dog no longer reacts fearfully.
Used as a last resort and executed in the most humane way.

HOW ANIMAL TRAINERS USE LEARNING THEORY
Trainers should use a variety of different learning methods.
The animal must be motivated to learn.
Minimal use of negative reinforcement because of stress and fear, which are not a good learning state for the animal.
Shaping behaviours: helping the animals to learn.
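
As a recap, the four consequences listed under REINFORCEMENT AND PUNISHMENT reduce to a two-by-two lookup: whether the stimulus is good or bad for the animal, and whether it is added or removed. The sketch below is only a memory aid. The function name and the string labels are illustrative choices, and it returns the intended category; as the lecture notes, a consequence only counts as a reinforcer or punisher if it actually changes the behaviour.

```python
def classify_consequence(stimulus, change):
    """Name the operant-conditioning quadrant for a consequence.

    stimulus: "good" or "bad" (from the animal's point of view)
    change:   "added" (starts / is presented) or "removed" (ends / is taken away)
    Returns (quadrant name, short code, intended effect on the behaviour).
    """
    quadrants = {
        ("good", "added"): ("Positive Reinforcement", "R+", "increases"),
        ("bad", "removed"): ("Negative Reinforcement", "R-", "increases"),
        ("bad", "added"): ("Positive Punishment", "P+", "decreases"),
        ("good", "removed"): ("Negative Punishment", "P-", "decreases"),
    }
    return quadrants[(stimulus, change)]


# Food given after a correct behaviour: something good is added.
food_reward = classify_consequence("good", "added")

# Time out: something good (attention, access) is removed.
time_out = classify_consequence("good", "removed")
```

"Positive" and "negative" describe adding or removing a stimulus, not whether the experience is pleasant, which is why negative reinforcement still increases the behaviour.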