Download Can System Dynamics Models Learn?

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Can System Dynamics Models Learn?
George P. Richardson
Rockefeller College of Public Affairs and Policy
University at Albany - State University of New York
[email protected]
G.P.Richardson, Rockefeller College, University at Albany
1
Can System Dynamics Models
Learn and Adapt?
u
u
u
u
u
Of course not! Models don’t learn; people do.
Learning and adaptation require internal changes,
And system dynamics models have fixed structure
(equations),
So system dynamics models can’t learn.
But how close can they come?
G.P.Richardson, Rockefeller College, University at Albany
2
1
Can System Dynamics Models
Learn and Adapt?
u
System dynamics models can change dominant
structure:
u
u
So the question becomes
u
u
Nonlinearities are our source of endogenous system change
How close can endogenous shifts in loop dominance, generated by
nonlinearities, come to resemble learning?
And the answer is: Pretty close!
G.P.Richardson, Rockefeller College, University at Albany
3
What is Required?
u
u
u
Model structure for the self-perception of model
behavior
Model structure for adaptation and ‘learning’
And perhaps, model structure for endogenously
generated experimentation leading to ‘learning’
G.P.Richardson, Rockefeller College, University at Albany
4
2
An example:
Inventory / Workforce Instabilities
Classic structure of Inventory / Workforce oscillations in
response to randomness in customer orders
Oscillations stem from delays in adjusting the workforce
to changes in the desired production schedule
Two policies dampen the oscillations:
undertime/overtime and re-engineering hiring
u
u
u
G.P.Richardson, Rockefeller College, University at Albany
5
Inventory, Production, & Shipments
+
Inventory
Coverage
-
Inventory
+
+
Production
Rate
+
Order
Fulfillment
Relative
Workweek
+
B
-
Order
Fulfillment
Ratio
+
Maximum
Shipment
Rate
Inventory
Adjustment
Time
-
Table for
Workweek
+
Shipment
Rate
B
Production
Adjustment
from Inventory
-
+
-
+
Table for
Order
Fulfillment
Minimum Order
Processing
Time
+
Desired
Shipment
Rate
Customer
Order Rate
Overtime
Schedule
Pressure
+
Desired +
Inventory
+
Safety
Stock
Coverage
Desired
Inventory
Coverage
+
+
Normal
Production
Desired
Production
+
+
+
+
Expected
Order Rate
Change in
Exp Orders
-
<Labor Force>
<Normal Labor
Productivity>
G.P.Richardson, Rockefeller College, University at Albany
-
+
Time to Average
Order Rate
6
3
Hiring Delays
<Desired Production>
Normal Labor
Productivity
Factory Labor
Request
Authorized
Labor Force
Labor Force
Adjustment
Time
<Pressure to
decentralize hiring
process>
Change in
Auth Labor
Average
Duration of
Employment
Adjustment to
Labor Force
Labor
Authorization
Time
Labor Force
Decentralizaed
labor
authorization time
Hiring Rate
Attrition Rate
Traditional labor
authorization
time
G.P.Richardson, Rockefeller College, University at Albany
7
Persistent oscillations in Inventories
Inventory
60,000
50,000
40,000
30,000
20,000
0
100
200
300
400 500 600
Time (Week)
Inventory : learning3 wo
Desired Inventory : learning3 wo
G.P.Richardson, Rockefeller College, University at Albany
700
800
900
1000
Widgets
Widgets
8
4
…and the Laborforce
Labor and Orders
600 People
20,000 Widgets/Week
500 People
14,000 Widgets/Week
400 People
8,000 Widgets/Week
0
200
400
600
Time (Week)
800
Labor Force : learning3 wo
Customer Order Rate : learning3 wo
1000
People
Widgets/Week
G.P.Richardson, Rockefeller College, University at Albany
9
Learning and forgetting
Inventory
Learning
that
overtime is necessary
60,000
50,000
40,000
30,000
Forgetting
20,000
0
100
200
300
Inventory : learning3
Desired Inventory : learning3
G.P.Richardson, Rockefeller College, University at Albany
400 500 600
Time (Week)
700
800
900
1000
Widgets
Widgets
10
5
Inventory, with and without learning
Inventory comparison
60,000
50,000
40,000
30,000
20,000
0
200
400
600
Time (Week)
800
Inventory : learning3
Inventory : learning3 wo
1000
Widgets
Widgets
G.P.Richardson, Rockefeller College, University at Albany
11
Labor, with and without learning
Labor Force
600
550
500
450
400
0
200
400
600
Time (Week)
Labor Force : learning3
Labor Force : learning3 wo
G.P.Richardson, Rockefeller College, University at Albany
800
1000
People
People
12
6
How does the model ‘learn’?
u
It ‘perceives’ that it, itself, is oscillating,
u
u
u
u
In particular, that the Laborforce is oscillating.
It ‘perceives’ the period and amplitude of its own
oscillations, much as a person would.
It comes to perceive that the amplitude of its oscillations
is above its tolerance,
So ‘pressures’ mount to phase in undertime & overtime
and decentralize hiring
G.P.Richardson, Rockefeller College, University at Albany
13
How does the model ‘forget’?
u
u
u
The model ‘perceives’ that its Laborforce variations are
within its acceptable tolerance,
So the pressures for undertime & overtime and removing
hiring delays subside,
And the damping policies are phased out -- the model
‘forgets’ that it ever had a problem.
G.P.Richardson, Rockefeller College, University at Albany
14
7
The dynamics of pressures to adjust
Pressures rise (and
fall)toto
implement damping policies
Pressure
adjust
1
1
600 People
0.5
0.5
500 People
0
0
400 People
0
100
200
300
400 500 600
Time (Week)
Pressure to decentralize hiring process : learning3
"Pressure to use overtime/undertime" : learning3
Labor Force : learning3
700
800
900
1000
People
G.P.Richardson, Rockefeller College, University at Albany
15
Modeling pressures to adjust
u
u
u
u
u
Perceive peaks and valleys of the oscillations
Estimate period and amplitude
Pressure to adjust = f(Amplitude/Tolerance)
Pressures to adjust are then applied to hiring delays and
the undertime/overtime policy
Relative workweek =
Pressure to use overtime/undertime*Table for Workweek(Schedule Pressure) +
(1-Pressure to use overtime/undertime)*1
u
Labor authorization time =
Pressure to decentralize hiring process*Decentralizaed labor authorization time +
(1-Pressure to decentralize hiring process)*Traditional labor authorization time
G.P.Richardson, Rockefeller College, University at Albany
16
8
Modeling pressures to adjust
Tolerable
amplitude
<Perceived
amplitude>
Pressure to
decentralize f
Pressure to
use OT/UT f
Amplitude ratio
1
1
0.75
0.75
Pressure to
decentralize hiring
process
0.5
0.25
0
0.90
1.35
0.5
Pressure to use
0.25
overtime/undertime
0
0.90
1.80
1.35
1.80
Time to react to
pressure to reduce
amplitude
G.P.Richardson, Rockefeller College, University at Albany
17
Perceiving peaks, valleys, period, and amplitude
Labor Force
600
550
500
450
400
0
200
400
600
Time (Week)
Labor Force : learning3 wo
G.P.Richardson, Rockefeller College, University at Albany
800
1000
People
18
9
The hard part is modeling the perception of peaks
and valleys...
Indicated period
Time of
recent
peak
<Perceived
change in
labor force>
<Labor
Force>
<Past prcvd
chng in labor>
storing time of
recent peak
<TIME
STEP>
<Time>
Time of
recent
valley
Time for past
smoothed labor
storing time of
recent valley
Time to adjust
perception of
oscillations
Past
smoothed
labor force
Smoothed
labor force
Perceived
change in
labor force
Recent peak
in smthd
Labor
storing peak in
labor
Time to
smooth labor
<Time>
Time for past
prcvd chng in
labor
Perceived
period
Perceived
amplitude
Time to chng
perception
adj time
<TIME
STEP>
Past prcvd
chng in
labor
storing valley
in labor
Recent valley
in smthd
Labor
<Past
smoothed
labor force>
G.P.Richardson, Rockefeller College, University at Albany
19
Perception of period and amplitude
Period and amplitude
100
75
50
25
0
0
200
400
600
Time (Week)
800
1000
Perceived amplitude : learning3 wo
Perceived period : learning3 wo
G.P.Richardson, Rockefeller College, University at Albany
20
10
Learning to adjust the workweek policy
Workweek
1.3
1.15
1
0.85
0.7
0
100
200
300
400
500 600
Time (Week)
700
800
Relative Workweek : learning3
Schedule Pressure : learning3
900
1000
Dimensionless
Dimensionless
G.P.Richardson, Rockefeller College, University at Albany
21
Learning to re-engineer the hiring policy
Labor authorization time
4
3
2
1
0
0
200
400
600
Time (Week)
Labor Authorization Time : learning3
Labor Authorization Time : learning3 wo
G.P.Richardson, Rockefeller College, University at Albany
800
1000
Weeks
Weeks
22
11
Long run dynamics
Pressure to adjust
1
1
600 People
Labor force
0.5
0.5
500 People
0
0
400 People
Pressure to adjust
0
300
600
900
1200
Time (Week)
Pressure to decentralize hiring process : long run
"Pressure to use overtime/undertime" : long run
Labor Force : long run
G.P.Richardson, Rockefeller College, University at Albany
1500
1800
People
23
What have we learned?
u
u
u
u
u
If the model didn’t forget, the problem would be solved.
Nonlinearities enable continuous models to adapt and
change over time.
Modeling the model’s perception of its own dynamics is
tricky.
We are thinking of ‘learning’ as purposeful adaptation
in response to system behavior to come closer to goals.
So far, the model can ‘learn’ and ‘forget’ only what we
tell it to.
G.P.Richardson, Rockefeller College, University at Albany
24
12
Can a model explore multiple policies and select
on its own the most advantageous?
u
u
Why not?
In addition to what we’ve seen, the model would
require:
u
u
u
an Exploration sector that sets out the structure for
explorations;
A Selection sector that contains criteria for evaluation of the
model’s own dynamic behavior.
None of that seems impossible, but it could be
daunting...
G.P.Richardson, Rockefeller College, University at Albany
Interpretation
(information
processing)
25
Attention
(scanning, selection,
and measurement)
Judged adequacy of
assessments and
forecasts
Changes in cue selection
and interpretation
Subjectively
interpreted cues
(perceptions)
Mental model
Cognitive model
of system function
(means–ends)
Modification of the
cognitive model
Strategies and tactics
(means)
Modification of
strategies and tactics
Cues
State of
the system
Assessment and
prediction of
system state
Desired state
(ends)
Planned
action
Internal
dynamics
Action
System fuction (reality)
Modification
of goals
Mental activity
Judged adequacy of
planned action
G.P.Richardson, Rockefeller College, University at Albany
26
13
Why build models that learn?
u
u
To achieve real-time adaptive control at the policy level
To compress human learning time
u
u
u
To prove that we can do it
Because we’ve solved all the easier dynamic problems
u
u
Ask models to show us what we can’t learn without them
Not bloody likely!
Because it’s New Year’s Eve and we’re looking for
something to do...
G.P.Richardson, Rockefeller College, University at Albany
27
Further reading
u
u
u
u
u
Self-learning policies in Urban Dynamics, Readings in
Urban Dynamics II (1975).
DeJong, Learning to plan in continuous domains.
Artificial Intelligence 65 (1994).
Ram & Santamaria, Continous case-based reasoning.
Artifical Intelligence 90 (1997)
Richardson, Andersen, Maxwell & Stewart, Foundations
of Mental Model Research (1994)
Powers, Behavior, the Control of Perception (1973)
G.P.Richardson, Rockefeller College, University at Albany
28
14