Prof. Stephan Anagnostaras
Neurobiology of Learning and Memory
Lecture 2: Learning Theory

Classical (Pavlovian) conditioning

Twitmyer (1902)
• Paired a bell with a patellar tendon tap
• The previously neutral bell could now elicit the knee jerk

Ivan Pavlov
• Studied digestion, and noticed that after he had worked with a particular dog for a while, the dog salivated when it first saw him
• Paired a metronome with food
• The previously neutral metronome elicited salivation
• Called this conditioning

A conditional relationship emerged between the meaningful stimulus and the previously neutral stimulus.

US - unconditional stimulus - biologically significant stimulus (food)
UR - unconditional response (salivation)
CS - conditional stimulus - previously neutral stimulus (bell)
CR - conditional response (salivation)

After pairing, how do you know you have a CR?
1) Present the CS alone (without the US)
2) Measure the response at the beginning of the CS (metronome) before the US is presented (food)

• One theory is that the purpose of CSs is to predict USs and the CR is a preparatory response.
• The UR and CR can be different, but the CR bears some relationship to the UR.

Basic phenomena

[Figure: acquisition curve, with the asymptote and the growth rate labeled]
• Negatively accelerating growth curve
• The stronger the US, the stronger the CR (same growth rate)

1. Acquisition from CS–US pairings
• the CR gets stronger with repeated trials
• the curve is negatively accelerating
• the stronger US produces a higher asymptote

2. Extinction
• the CS is presented alone after conditioning (CS–)
• same curve as acquisition
• not unlearning or erasing memory

3. Generalization
• if you present a similar CS you will get a similar reaction
• generalization decrement

4. Discrimination
• Train a CS+ and a CS– that are similar
• Inhibition
• Inhibition is a weaker process than excitation
• Spontaneous recovery in extinction
• Disinhibition in extinction

Associative learning theory
• Tries to explain what is going on and relies on 3 processes to explain everything
1. Excitation (excitatory association)
2. Inhibition (inhibitory association)
3. Generalization
• The excitatory association is not lost; it is only the buildup of inhibition that suppresses excitation
• Discrimination explained using learning theory
• Extinction explained
• Law of parsimony

Power of a theory = (# of things explained) / (# of explanatory principles)

Procedure, Process, & Behavior
Procedure = what we do (e.g., pair CS and US)
Process = what intervenes between procedure and behavior (e.g., excitation, inhibition)
Behavioral result = what we observe (e.g., after extinction we see a reduction of the CR)
• Our explanation involves all three
• Must be aware of this distinction: the procedure is not what is learned by the animal
• Skinner argued that we should talk only about procedure-result laws (radical behaviorism)

Control procedures
In order to study associative learning, we must show that the change in behavior is due to the pairing of the CS and US.
• Presentation of the US alone increases the CR: sensitization
  Control: present the US alone
• Presentation of the CS alone increases the CR: pseudoconditioning
  Control: present the CS alone

How could we combine the two control groups?
• The unpaired group receives both the US (sensitization) and the CS (pseudoconditioning), but not together.
• An alternative is the truly random control.
• The main point is that the subject has the same experience with the CS and US as the conditioning group.

Several acquisition procedures
• Forward works best.
• Interestingly, this is a test of contiguity theory.
• Delay conditioning is another term for forward conditioning.

Higher order conditioning
Second-order conditioning
Phase I      Phase II     Test
CS1-US       CS2-CS1      CS2 --> CR
tone-food    light-tone   light

Sensory pre-conditioning
Phase I      Phase II     Test
CS2-CS1      CS1-US       CS2 --> CR
light-tone   tone-food    light

Note on acquisition procedures: trace conditioning is quite special in terms of mechanistic models of animal learning.

Generality of conditioning
Coke (CS) --> Sugar (US) --> UR (insulin release)
...after a few pairings...
Coke (CS) --> CR (insulin release)
• An abrupt switch to Diet Coke can cause hypoglycemia
• Pavlovian conditioning prepares the body for impending URs

Generality of conditioning
Hollis (1989): blue gourami mating behavior
• If a male enters the territory, the territory holder drives it away. But he also drives away the female.

Exp 1:
• Males were subjects
• Training: Paired: light (CS) paired with access to males (US). Unpaired: light unpaired with access to males.
• Testing: the light was turned on and the barrier removed.
• Result: the paired male always won against the unpaired male.

Exp 2:
• Paired: light (CS) paired with access to females (US)
• Unpaired: light unpaired with access to females
• Testing: get the light, then access to a female
• Result: when the light turned on, the paired group started mating much more rapidly than the unpaired group.

Exp 3:
• Design the same as Exp 2, except the female is now in between the paired and unpaired males
• Result: the female always picks the paired male

Hollis (1997) Exp 4: Reproductive success
• Training: the paired group got the light with access to a female for 2 h; the unpaired group got the light unpaired with access.
• Testing: present the light, then give access to a female for 2 h for both groups.
• Six days later, count the baby gouramis.

Generality of conditioning
• Conditioning permeates everything you do
• You can condition the pancreas and most glands, voluntary and involuntary muscles, and the immune system

What is learned? Emotional learning
• Little Albert study
• Conditioned emotional response (CER) (Pavlovian fear conditioning)

Why not just measure fear?
• No attention to evolution.
• Why do rats stop bar-pressing? They freeze.
• Nowadays people just measure freezing or another defensive CR.
• E.g., Fanselow & Bolles (1979): did fear conditioning with a backward (unpaired) group
• Evolution heavily influences what is learned, and even what can be learned

Estes & Skinner (1941): Conditioned suppression
• Trained to bar-press for food
• Paired a tone with shock
• When the tone came on, fear suppressed bar-pressing
• Suppression became the dominant way to measure the CR

What is learned? S-S vs S-R
Two views on learning:
S-S: CS ---> US ---> R
S-R: CS ---> R (the US serves to stamp in this association)

Strong evidence for S-S learning:

Rescorla (1973): Devaluation experiment
• Conditioned suppression
1. Light (CS) paired with loud noise (US)
2. US alone - habituate (control = no habituation)
3. Test to CS - habituation group shows much less fear

Rescorla (1974): Inflation experiment
1. Tone-shock (0.5 mA)
2. US-alone groups:
   - 3 mA
   - 1 mA
   - 0.5 mA
   - no shock
3. Test CS alone
   - little devaluation in the 0.5 mA group
   - massive inflation in the 1 and 3 mA groups
   - The memory of the shock changed and the CR changed

What causes conditioning?
Contiguity theory: things have to occur together; that is necessary and sufficient.
Challenges:
• Simultaneous conditioning doesn't work well
• Garcia & Koelling (1966) Conditioned Taste Aversion (CTA): good conditioning with a CS-US delay of up to 75 min
• So contiguity is not necessary

Typical CTA procedure
CS (taste) ---> US (illness-inducing agent)
UR (illness)
CR (disgust)
Avoidance

What is learned? Is contiguity sufficient?

Kamin (1968): Blocking effect
A = CS
+ = US
AB+ = two different CSs with the US

Train AB+ (light-tone-shock), then test B alone (light) = good conditioning
However...
Phase I    Phase II    Test
A+         AB+         B alone = no conditioning!!

Unblocking effect
Phase I    Phase II    Test
A+         AB++        B alone = conditioning!!

The big US was SURPRISING. The US must be SURPRISING.
• Note that contiguity is the same in both experiments.

It is also surprising if you don't get the US:
Conditioned inhibition procedure:
Phase I    Phase II    Test
A+         AB–         B = conditioned inhibitor
The US was expected but didn't occur!

What is learned? Relationship between cue and consequence
• Garcia & Koelling (1966) "Bright Noisy Water Experiment"
• taste associated with illness
• audio/visual stimuli associated with shock
• Biological constraints on learning

[Figures: Garcia & Koelling (1966). Labels: lithium chloride, shock, salty water (taste), light and noise (A/V), licks]

Modern learning theory
Wagner, Logan, Haberlandt & Price (1968): Relative validity theory
• Two compound CSs: AX (tone, light) and BX (buzzer, light)
• The animal sometimes gets AX, sometimes BX
• In group 1 (the correlated conditioning group), AX is reinforced 100% of the time (AX+) and BX is never reinforced (BX–)
• In group 2 (the uncorrelated group), AX is reinforced 50% of the time, and BX is reinforced 50% of the time

In training:
        Correlated group                  Uncorrelated group
AX      100% reinforced                   50%
BX      0%                                50%
        A predicts US, B predicts no US   neither A nor B perfectly predicts US

Both groups get 50% reinforcement overall. But what is happening to X? X is reinforced 50% of the time in both groups. According to contiguity theory, X should show the same conditioning in both. What happens?

In the test phase:
           Correlated group    Uncorrelated group
A alone    Strong cond         No cond
B alone    No cond             No cond
X alone    No cond             Strong cond

• X has the same number of pairings in both groups, so contiguity theory is in trouble.
• Wagner says the cue must be the most valid predictor of the US in the situation in order to get associated: validity relative to the other CSs.
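The reinforcement percentages in the Wagner et al. design can be tallied directly. The sketch below is only an illustration: the trial counts and the `reinforcement_rate` helper are made up here, and only the percentages (AX 100% / BX 0% versus AX 50% / BX 50%) come from the notes. It shows that X is followed by the US on half of its trials in both groups, even though A's rate differs.

```python
def reinforcement_rate(trials, cue):
    """Fraction of the trials containing `cue` on which the US occurred."""
    outcomes = [us for cues, us in trials if cue in cues]
    return sum(outcomes) / len(outcomes)

# Hypothetical schedules: each trial is (compound CS, US present? 1 or 0).
correlated = [("AX", 1), ("BX", 0)] * 50                          # AX+ always, BX- never
uncorrelated = [("AX", 1), ("AX", 0), ("BX", 1), ("BX", 0)] * 25  # AX 50%, BX 50%

for name, trials in [("correlated", correlated), ("uncorrelated", uncorrelated)]:
    rates = {cue: reinforcement_rate(trials, cue) for cue in "ABX"}
    print(name, rates)
```

Because X's own rate is identical across groups, any difference in responding to X has to come from what the other cues predict.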
Correlated group: A perfectly predicts shock, and X predicts shock only half the time.
Uncorrelated group: A predicts shock half the time when it's on, and the same holds for B. But X predicts shock half the time whether A or B is on or not. So X is the most valid cue in this situation.

Modern learning theory
Rescorla (1968): Contingency experiment
• CS = tone, US = shock
• For all groups, P(US|CS) = 0.8 (80% of the time you get the CS you will get the US also).
• Rescorla varied P(US|no CS) across the groups.

Rescorla called this contingency theory:
If P(US|CS) > P(US|no CS) then excitatory conditioning
If P(US|CS) < P(US|no CS) then inhibitory conditioning (e.g., safety signal)
If P(US|CS) = P(US|no CS) then no conditioning occurs (truly random control)

Rescorla-Wagner Model (1972)
Key assumptions:
1. Emphasize CS-US pairings as critical for conditioning
2. Formalize the notion of Kamin's surprise
3. Assume that any US can only support a limited amount of conditioning/reinforcement
4. All the CSs compete with each other for the limited amount of conditioning/reinforcement
5. Competition occurs through summation of all the CSs present on a given trial
• The US has a certain amount it can condition, meaning this is a US-limiting model.
• Stimuli compete for the ability to predict the US.

Rescorla-Wagner Model (1972)
Can explain a number of phenomena:
• Acquisition, extinction
• Blocking (A+, AB+, ... B)
• Unblocking (A+, AB++, ... B)
• Conditioned inhibition (A+, AB–, ... B)
• Contingency

• Can deal with a number of phenomena and makes several new predictions which were testable
• Cannot deal with latent inhibition (CS pre-exposure)
• Can deal with the US pre-exposure effect
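The key assumptions above can be sketched as a short simulation. This is a minimal illustration, not the published parameterization: it folds the model's CS and US rate parameters into a single assumed rate `alpha`, uses an arbitrary 50 trials per phase, and the function and variable names are invented here. On each trial the summed strength of all CSs present is compared with the US magnitude `lam`, and every CS present is changed by the same share of that surprise.

```python
def rescorla_wagner(trials, alpha=0.3, strengths=None):
    """Apply a Rescorla-Wagner style update over a sequence of trials.

    trials    -- list of (cues, lam); cues is a string of one-letter CS
                 labels present on the trial ("AB" = A and B in compound),
                 lam is the US magnitude (0.0 = no US).
    alpha     -- single learning rate (an assumption of this sketch).
    strengths -- optional starting associative strengths, e.g. {"A": 1.0}.
    Returns a dict mapping each cue to its associative strength V.
    """
    v = dict(strengths or {})
    for cues, lam in trials:
        expected = sum(v.get(c, 0.0) for c in cues)  # summation over CSs present
        surprise = lam - expected                     # what the US can still support
        for c in cues:
            v[c] = v.get(c, 0.0) + alpha * surprise   # CSs share the same error
    return v

N = 50  # trials per phase (arbitrary)
acquisition = rescorla_wagner([("A", 1.0)] * N)
extinction = rescorla_wagner([("A", 0.0)] * N, strengths=acquisition)
blocking = rescorla_wagner([("A", 1.0)] * N + [("AB", 1.0)] * N)
unblocking = rescorla_wagner([("A", 1.0)] * N + [("AB", 2.0)] * N)
inhibition = rescorla_wagner([("A", 1.0)] * N + [("AB", 0.0)] * N)

print("acquisition  V_A =", round(acquisition["A"], 3))  # near lam
print("extinction   V_A =", round(extinction["A"], 3))   # near zero
print("blocking     V_B =", round(blocking["B"], 3))     # near zero: A already predicts the US
print("unblocking   V_B =", round(unblocking["B"], 3))   # positive: the bigger US is surprising
print("inhibition   V_B =", round(inhibition["B"], 3))   # negative: expected US did not occur
```

In this sketch extinction simply drives V back toward zero, whereas the notes above describe extinction as not unlearning or erasing memory, so the sketch should not be read as capturing spontaneous recovery.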