Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Can System Dynamics Models Learn? George P. Richardson Rockefeller College of Public Affairs and Policy University at Albany - State University of New York [email protected] G.P.Richardson, Rockefeller College, University at Albany 1 Can System Dynamics Models Learn and Adapt? u u u u u Of course not! Models don’t learn; people do. Learning and adaptation require internal changes, And system dynamics models have fixed structure (equations), So system dynamics models can’t learn. But how close can they come? G.P.Richardson, Rockefeller College, University at Albany 2 1 Can System Dynamics Models Learn and Adapt? u System dynamics models can change dominant structure: u u So the question becomes u u Nonlinearities are our source of endogenous system change How close can endogenous shifts in loop dominance, generated by nonlinearities, come to resemble learning? And the answer is: Pretty close! G.P.Richardson, Rockefeller College, University at Albany 3 What is Required? u u u Model structure for the self-perception of model behavior Model structure for adaptation and ‘learning’ And perhaps, model structure for endogenously generated experimentation leading to ‘learning’ G.P.Richardson, Rockefeller College, University at Albany 4 2 An example: Inventory / Workforce Instabilities Classic structure of Inventory / Workforce oscillations in response to randomness in customer orders Oscillations stem from delays in adjusting the workforce to changes in the desired production schedule Two policies dampen the oscillations: undertime/overtime and re-engineering hiring u u u G.P.Richardson, Rockefeller College, University at Albany 5 Inventory, Production, & Shipments + Inventory Coverage - Inventory + + Production Rate + Order Fulfillment Relative Workweek + B - Order Fulfillment Ratio + Maximum Shipment Rate Inventory Adjustment Time - Table for Workweek + Shipment Rate B Production Adjustment from Inventory - + - + Table for Order Fulfillment Minimum Order Processing Time + Desired Shipment Rate Customer Order Rate Overtime Schedule Pressure + Desired + Inventory + Safety Stock Coverage Desired Inventory Coverage + + Normal Production Desired Production + + + + Expected Order Rate Change in Exp Orders - <Labor Force> <Normal Labor Productivity> G.P.Richardson, Rockefeller College, University at Albany - + Time to Average Order Rate 6 3 Hiring Delays <Desired Production> Normal Labor Productivity Factory Labor Request Authorized Labor Force Labor Force Adjustment Time <Pressure to decentralize hiring process> Change in Auth Labor Average Duration of Employment Adjustment to Labor Force Labor Authorization Time Labor Force Decentralizaed labor authorization time Hiring Rate Attrition Rate Traditional labor authorization time G.P.Richardson, Rockefeller College, University at Albany 7 Persistent oscillations in Inventories Inventory 60,000 50,000 40,000 30,000 20,000 0 100 200 300 400 500 600 Time (Week) Inventory : learning3 wo Desired Inventory : learning3 wo G.P.Richardson, Rockefeller College, University at Albany 700 800 900 1000 Widgets Widgets 8 4 …and the Laborforce Labor and Orders 600 People 20,000 Widgets/Week 500 People 14,000 Widgets/Week 400 People 8,000 Widgets/Week 0 200 400 600 Time (Week) 800 Labor Force : learning3 wo Customer Order Rate : learning3 wo 1000 People Widgets/Week G.P.Richardson, Rockefeller College, University at Albany 9 Learning and forgetting Inventory Learning that overtime is necessary 60,000 50,000 40,000 30,000 Forgetting 20,000 0 100 200 300 Inventory : learning3 Desired Inventory : learning3 G.P.Richardson, Rockefeller College, University at Albany 400 500 600 Time (Week) 700 800 900 1000 Widgets Widgets 10 5 Inventory, with and without learning Inventory comparison 60,000 50,000 40,000 30,000 20,000 0 200 400 600 Time (Week) 800 Inventory : learning3 Inventory : learning3 wo 1000 Widgets Widgets G.P.Richardson, Rockefeller College, University at Albany 11 Labor, with and without learning Labor Force 600 550 500 450 400 0 200 400 600 Time (Week) Labor Force : learning3 Labor Force : learning3 wo G.P.Richardson, Rockefeller College, University at Albany 800 1000 People People 12 6 How does the model ‘learn’? u It ‘perceives’ that it, itself, is oscillating, u u u u In particular, that the Laborforce is oscillating. It ‘perceives’ the period and amplitude of its own oscillations, much as a person would. It comes to perceive that the amplitude of its oscillations is above its tolerance, So ‘pressures’ mount to phase in undertime & overtime and decentralize hiring G.P.Richardson, Rockefeller College, University at Albany 13 How does the model ‘forget’? u u u The model ‘perceives’ that its Laborforce variations are within its acceptable tolerance, So the pressures for undertime & overtime and removing hiring delays subside, And the damping policies are phased out -- the model ‘forgets’ that it ever had a problem. G.P.Richardson, Rockefeller College, University at Albany 14 7 The dynamics of pressures to adjust Pressures rise (and fall)toto implement damping policies Pressure adjust 1 1 600 People 0.5 0.5 500 People 0 0 400 People 0 100 200 300 400 500 600 Time (Week) Pressure to decentralize hiring process : learning3 "Pressure to use overtime/undertime" : learning3 Labor Force : learning3 700 800 900 1000 People G.P.Richardson, Rockefeller College, University at Albany 15 Modeling pressures to adjust u u u u u Perceive peaks and valleys of the oscillations Estimate period and amplitude Pressure to adjust = f(Amplitude/Tolerance) Pressures to adjust are then applied to hiring delays and the undertime/overtime policy Relative workweek = Pressure to use overtime/undertime*Table for Workweek(Schedule Pressure) + (1-Pressure to use overtime/undertime)*1 u Labor authorization time = Pressure to decentralize hiring process*Decentralizaed labor authorization time + (1-Pressure to decentralize hiring process)*Traditional labor authorization time G.P.Richardson, Rockefeller College, University at Albany 16 8 Modeling pressures to adjust Tolerable amplitude <Perceived amplitude> Pressure to decentralize f Pressure to use OT/UT f Amplitude ratio 1 1 0.75 0.75 Pressure to decentralize hiring process 0.5 0.25 0 0.90 1.35 0.5 Pressure to use 0.25 overtime/undertime 0 0.90 1.80 1.35 1.80 Time to react to pressure to reduce amplitude G.P.Richardson, Rockefeller College, University at Albany 17 Perceiving peaks, valleys, period, and amplitude Labor Force 600 550 500 450 400 0 200 400 600 Time (Week) Labor Force : learning3 wo G.P.Richardson, Rockefeller College, University at Albany 800 1000 People 18 9 The hard part is modeling the perception of peaks and valleys... Indicated period Time of recent peak <Perceived change in labor force> <Labor Force> <Past prcvd chng in labor> storing time of recent peak <TIME STEP> <Time> Time of recent valley Time for past smoothed labor storing time of recent valley Time to adjust perception of oscillations Past smoothed labor force Smoothed labor force Perceived change in labor force Recent peak in smthd Labor storing peak in labor Time to smooth labor <Time> Time for past prcvd chng in labor Perceived period Perceived amplitude Time to chng perception adj time <TIME STEP> Past prcvd chng in labor storing valley in labor Recent valley in smthd Labor <Past smoothed labor force> G.P.Richardson, Rockefeller College, University at Albany 19 Perception of period and amplitude Period and amplitude 100 75 50 25 0 0 200 400 600 Time (Week) 800 1000 Perceived amplitude : learning3 wo Perceived period : learning3 wo G.P.Richardson, Rockefeller College, University at Albany 20 10 Learning to adjust the workweek policy Workweek 1.3 1.15 1 0.85 0.7 0 100 200 300 400 500 600 Time (Week) 700 800 Relative Workweek : learning3 Schedule Pressure : learning3 900 1000 Dimensionless Dimensionless G.P.Richardson, Rockefeller College, University at Albany 21 Learning to re-engineer the hiring policy Labor authorization time 4 3 2 1 0 0 200 400 600 Time (Week) Labor Authorization Time : learning3 Labor Authorization Time : learning3 wo G.P.Richardson, Rockefeller College, University at Albany 800 1000 Weeks Weeks 22 11 Long run dynamics Pressure to adjust 1 1 600 People Labor force 0.5 0.5 500 People 0 0 400 People Pressure to adjust 0 300 600 900 1200 Time (Week) Pressure to decentralize hiring process : long run "Pressure to use overtime/undertime" : long run Labor Force : long run G.P.Richardson, Rockefeller College, University at Albany 1500 1800 People 23 What have we learned? u u u u u If the model didn’t forget, the problem would be solved. Nonlinearities enable continuous models to adapt and change over time. Modeling the model’s perception of its own dynamics is tricky. We are thinking of ‘learning’ as purposeful adaptation in response to system behavior to come closer to goals. So far, the model can ‘learn’ and ‘forget’ only what we tell it to. G.P.Richardson, Rockefeller College, University at Albany 24 12 Can a model explore multiple policies and select on its own the most advantageous? u u Why not? In addition to what we’ve seen, the model would require: u u u an Exploration sector that sets out the structure for explorations; A Selection sector that contains criteria for evaluation of the model’s own dynamic behavior. None of that seems impossible, but it could be daunting... G.P.Richardson, Rockefeller College, University at Albany Interpretation (information processing) 25 Attention (scanning, selection, and measurement) Judged adequacy of assessments and forecasts Changes in cue selection and interpretation Subjectively interpreted cues (perceptions) Mental model Cognitive model of system function (means–ends) Modification of the cognitive model Strategies and tactics (means) Modification of strategies and tactics Cues State of the system Assessment and prediction of system state Desired state (ends) Planned action Internal dynamics Action System fuction (reality) Modification of goals Mental activity Judged adequacy of planned action G.P.Richardson, Rockefeller College, University at Albany 26 13 Why build models that learn? u u To achieve real-time adaptive control at the policy level To compress human learning time u u u To prove that we can do it Because we’ve solved all the easier dynamic problems u u Ask models to show us what we can’t learn without them Not bloody likely! Because it’s New Year’s Eve and we’re looking for something to do... G.P.Richardson, Rockefeller College, University at Albany 27 Further reading u u u u u Self-learning policies in Urban Dynamics, Readings in Urban Dynamics II (1975). DeJong, Learning to plan in continuous domains. Artificial Intelligence 65 (1994). Ram & Santamaria, Continous case-based reasoning. Artifical Intelligence 90 (1997) Richardson, Andersen, Maxwell & Stewart, Foundations of Mental Model Research (1994) Powers, Behavior, the Control of Perception (1973) G.P.Richardson, Rockefeller College, University at Albany 28 14