Uncertainty assessment of the IMAGE/TIMER B1 CO2 emissions scenario using the NUSAP method

Jeroen P. van der Sluijs (1), Jose Potting (1), James Risbey (1), Penny Kloprogge (1), David Nuijten (1), Detlef van Vuuren (2), Bert de Vries (2), Arthur Beusen (2), Peter Heuberger (2), Serafin Corral Quintana (3), Silvio Funtowicz (3), Jerry Ravetz (4), Arthur Petersen (5)

(1) UU-STS, (2) RIVM, (3) EC-JRC, (4) RMC, (5) VUA
Contact: [email protected]
Objective
• Develop a framework for uncertainty assessment and management that addresses both quantitative and qualitative dimensions
• Test and demonstrate its usefulness in integrated assessment (IA) models
[Figure: IMAGE 2 framework of models and linkages. Population (POPHER) and World Economy (WorldScan) set the scenario assumptions (changes in GDP, population and others), which drive Land demand, use & cover and Energy demand & supply (TIMER). These produce land-use emissions and energy & industry emissions. Emissions and land-use changes feed the Carbon cycle and Atmospheric chemistry models; the resulting concentration changes drive the Climate model (Zonal Climate Model or ECBilt); climatic changes drive Impacts (natural systems, agricultural impacts, water impacts, land degradation, sea level rise). Feedbacks run back through the chain.]
TIMER model: five submodels

[Figure: TIMER model structure. Population (POPHER) and Economy (WorldScan) drive the Energy Demand (ED) submodel, which splits into electricity demand and fuel demand. Demand is met by the Electric Power Generation (EPG), Solid Fuel supply (SF), Liquid Fuel supply (LF) and Gaseous Fuel supply (GF) submodels, with prices feeding back to demand.]

Inputs: population, GDP per capita, activity in energy sectors, assumptions regarding technological development, depletion and others.
Outputs: end-use energy consumption, primary energy consumption.
Main objectives of TIMER
• To analyse the long-term dynamics of the energy system, in particular changes in energy demand and the transition to non-fossil fuels, within an integrated modelling framework;
• To construct and simulate greenhouse gas emission scenarios that are used in other submodels of IMAGE 2.2 or in meta-models of IMAGE.
Key questions
• What are the key uncertainties in TIMER?
• What is the role of model structure uncertainties in TIMER?
• Uncertainty in which input variables and parameters dominates uncertainty in the model outcome?
• What is the strength of the sensitive parameters (pedigree)?
Location of uncertainty
• Input data
• Parameters
• Technical model structure
• Conceptual model structure / assumptions
• Indicators
• Problem framing
• System boundary
• Socio-political and institutional context
Sorts of uncertainty
• Inexactness
• Unreliability
• Value loading
• Ignorance
Inexactness
• Variability / heterogeneity
• Lack of knowledge
• Definitional vagueness
• Resolution error
• Aggregation error
Unreliability
• Limited internal strength in:
  – Use of proxies
  – Empirical basis
  – Theoretical understanding
  – Methodological rigour (incl. management of anomalies)
  – Validation
• Limited external strength in:
  – Exploration of rival problem framings
  – Management of dissent
  – Extended peer acceptance / stakeholder involvement
  – Transparency
  – Accessibility
• Future scope
• Linguistic imprecision
Value loading
• Bias
  – In knowledge production: motivational bias (interests, incentives); disciplinary bias; cultural bias; choice of (modelling) approach (e.g. bottom-up, top-down); subjective judgement
  – In knowledge utilization: strategic/selective knowledge use
• Disagreement
  – about knowledge
  – about values
Ignorance
• System indeterminacy
  – open-endedness
  – chaotic behaviour
• Active ignorance
  – model fixes for reasons understood
  – limited domains of applicability of functional relations
  – surprise A
• Passive ignorance
  – bugs (numerical / software / hardware error)
  – model fixes for reasons not understood
  – surprise B
Method
• Checklist for model quality assistance
• Meta-level analysis of SRES scenarios to explore model structure uncertainties
• Global sensitivity analysis (Morris)
• NUSAP expert elicitation workshop to assess the pedigree of sensitive model components
• Diagnostic diagram to prioritise uncertainties by combining criticality (Morris) and strength (pedigree); a sketch of this step follows below
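To make the last step concrete, here is a minimal sketch in Python of how a diagnostic diagram combines the two rankings. The component names and all numbers are illustrative placeholders, not the actual TIMER results: criticality comes from the Morris sensitivity measure, strength from the average pedigree score, and components that are both weak and critical land in the quadrant that deserves priority.

```python
# Illustrative only: hypothetical components with (mean pedigree score 0-4,
# Morris sensitivity mu*); neither the names nor the numbers are TIMER results.
components = {
    "population growth":        (3.0, 0.9),
    "progress ratio":           (2.0, 0.8),
    "fossil fuel supply curve": (1.5, 0.7),
    "efficiency improvement":   (2.5, 0.3),
}

max_mu = max(mu for _, mu in components.values())
for name, (pedigree, mu) in sorted(components.items()):
    strength = pedigree / 4.0           # x-axis: 0 = weakest, 1 = strongest
    criticality = mu / max_mu           # y-axis: relative sensitivity
    # "danger zone": sensitive model components with a weak pedigree
    flag = "  <- priority" if strength < 0.5 and criticality > 0.5 else ""
    print(f"{name:26s} strength={strength:.2f} criticality={criticality:.2f}{flag}")
```

Plotting strength against criticality gives the diagram itself; the printout just lists the coordinates and flags the weak-and-sensitive quadrant.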
Checklist
• Assists in quality control for complex models
• Not whether models are good or bad, but 'better' and 'worse' forms of modelling practice
• Quality relates to fitness for function
• Helps guard against poor practice
• Flags pitfalls
Checklist structure
• Screening questions
• Model & problem domain
• Internal strength
• Interface with users
• Use in policy process
• Overall assessment
[Figure: Global CO2 emission from fossil fuels in GtC (y-axis 0-40), 1990-2100: SRES B1 scenarios reported to IPCC (2000) by six different modelling groups (MARIA, MESSAGE, AIM, MiniCAM, ASF, IMAGE), with the B1-marker scenario highlighted. (Van Vuuren et al. 2000)]
Morris (1991)
• facilitates global sensitivity analysis in a minimal number of model runs
• covers the entire range of possible values for each variable
• parameters are varied one step at a time, in such a way that if the sensitivity of one parameter is contingent on the values that other parameters may take, Morris captures such dependencies (see the sketch below)
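The following is a minimal sketch in Python of the Morris elementary-effects scheme described above, applied to a toy model; the model and all values are illustrative, not part of TIMER. Each of r trajectories perturbs the k parameters one step at a time from a random base point; the mean absolute effect ranks importance, and the spread across trajectories picks up the dependence of one parameter's sensitivity on the others.

```python
import numpy as np

def morris_elementary_effects(model, k, r=10, levels=4, seed=0):
    """Morris (1991) screening: r one-at-a-time trajectories over a
    k-dimensional unit cube discretised into `levels` levels."""
    rng = np.random.default_rng(seed)
    delta = levels / (2.0 * (levels - 1))      # standard Morris step size
    grid = np.arange(levels) / (levels - 1)    # admissible levels in [0, 1]
    effects = np.zeros((r, k))
    for t in range(r):
        x = rng.choice(grid[grid <= 1.0 - delta], size=k)  # random base point
        y = model(x)
        for i in rng.permutation(k):           # vary one parameter at a time
            x_next = x.copy()
            x_next[i] += delta
            y_next = model(x_next)
            effects[t, i] = (y_next - y) / delta  # elementary effect of parameter i
            x, y = x_next, y_next              # continue trajectory from new point
    mu_star = np.abs(effects).mean(axis=0)     # importance of each parameter
    sigma = effects.std(axis=0)                # nonlinearity / interactions
    return mu_star, sigma

# Toy model: parameter 0 dominates; 1 and 2 interact; 3 is nearly inert.
toy = lambda x: 10 * x[0] + 5 * x[1] * x[2] + 0.1 * x[3]
mu_star, sigma = morris_elementary_effects(toy, k=4)
for i, (m, s) in enumerate(zip(mu_star, sigma)):
    print(f"x{i}: mu* = {m:.2f}, sigma = {s:.2f}")
```

Each trajectory costs k + 1 model runs, so this screening takes r(k + 1) = 50 runs in total, which is what makes the method attractive for expensive models.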
Most sensitive model components:
• Population levels and economic activity
• Intra-sectoral structural change
• Progress ratios for technological improvements
• Size and cost of supply curves of fossil fuel resources
• Autonomous and price-induced energy efficiency improvement
• Initial costs and depletion of renewables
NUSAP: Pedigree
Evaluates the strength of a number by looking at:
• the background history by which the number was produced
• the underpinning and scientific status of the number
Parameter Pedigree
• Proxy
• Empirical basis
• Theoretical understanding
• Methodological rigour
• Validation
Proxy
Sometimes it is not possible to obtain direct measurements or estimates of the parameter, and so some form of proxy measure is used. Proxy refers to how good or close the measure of the quantity that we model is to the actual quantity we intend to represent. An exact measure of the quantity would score four; if the measured quantity is not clearly related to the desired quantity, the score would be zero.
Empirical basis
Empirical quality typically refers to the degree to which
direct observations are used to estimate the
parameter.
When the parameter is based upon good quality
observational data, the pedigree score will be high.
Sometimes directly observed data are not available
and the parameter is estimated based on partial
measurements or calculated from other quantities.
Parameters determined by such indirect methods
have a weaker empirical basis and will generally
score lower than those based on direct observations.
Theoretical understanding
The parameter will have some basis in theoretical understanding of the phenomenon it represents. This criterion refers to the extent and partiality of that theoretical understanding. Parameters based on well-established theory will score high on this metric, while parameters whose theoretical basis has the status of crude speculation will score low.
Methodological rigour
Some method will be used to collect, check,
and revise the data used for making
parameter estimates. Methodological quality
refers to the norms for methodological rigour
in this process applied by peers in the
relevant disciplines.
Well established and respected methods for
measuring and processing the data would
score high on this metric, while untested or
unreliable methods would tend to score
lower.
Validation
This metric refers to the degree to which one has been able
to cross-check the data against independent sources.
When the parameter has been compared with appropriate
sets of independent data to assess its reliability it will
score high on this metric. In many cases, independent
data for the same parameter over the same time period
are not available and other datasets must be used for
validation. This may require a compromise in the length
or overlap of the datasets, or may require use of a related,
but different, proxy variable, or perhaps use of data that
has been aggregated on different scales. The more
indirect or incomplete the validation, the lower it will score
on this metric.
Pedigree matrix (scores from 4 = strongest to 0 = weakest):

Score 4 – Proxy: exact measure; Empirical basis: large sample, direct measurements; Theoretical basis: well-established theory; Method: best available practice; Validation: compared with independent measurements of the same variable.
Score 3 – Proxy: good fit or measure; Empirical basis: small sample, direct measurements; Theoretical basis: accepted theory, partial in nature; Method: reliable method, commonly accepted; Validation: compared with independent measurements of a closely related variable.
Score 2 – Proxy: well correlated; Empirical basis: modeled/derived data; Theoretical basis: partial theory, limited consensus on reliability; Method: acceptable method, limited consensus on reliability; Validation: compared with measurements that are not independent.
Score 1 – Proxy: weak correlation; Empirical basis: educated guesses / rule-of-thumb estimates; Theoretical basis: preliminary theory; Method: preliminary methods, unknown reliability; Validation: weak / indirect validation.
Score 0 – Proxy: not clearly related; Empirical basis: crude speculation; Theoretical basis: crude speculation; Method: no discernible rigour; Validation: no validation.
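As an illustration of how the matrix is applied, the following sketch in Python scores one hypothetical parameter against the five criteria and condenses the result into a normalized strength value of the kind used later in the diagnostic diagram; the parameter, its scores, and the mean-based normalization are all assumptions of this sketch.

```python
# Pedigree matrix condensed to criterion names and the 0-4 score range.
CRITERIA = ("proxy", "empirical basis", "theoretical basis",
            "method", "validation")

def pedigree_strength(scores):
    """Condense per-criterion pedigree scores (0-4) into one strength
    value in [0, 1]: the mean score divided by the maximum score 4."""
    assert set(scores) == set(CRITERIA), "score every criterion exactly once"
    return sum(scores.values()) / (4.0 * len(CRITERIA))

# Hypothetical parameter: good proxy, modeled data, partial theory,
# accepted method, weak validation -> rows 3, 2, 2, 3, 1 of the matrix.
example = {"proxy": 3, "empirical basis": 2, "theoretical basis": 2,
           "method": 3, "validation": 1}
print(f"strength = {pedigree_strength(example):.2f}")   # prints 0.55
```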
Elicitation workshop
• 18 experts (in 3 parallel groups of 6)
discussed parameters, one by one,
using information & scoring cards
• Individual expert judgements, informed
by group discussion
Example result: gas depletion multiplier
Radar diagram: each coloured line represents the scores given by one expert.
Same data represented as a kite diagram: green = minimum scores, amber = maximum scores, light green = minimum scores if outliers are omitted (traffic-light analogy).
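To illustrate how the kite bands are derived from the individual expert judgements, here is a minimal sketch in Python; the score matrix below is invented for illustration and is not the workshop data, and the outlier rule (more than 1.5 points from the median) is an assumption of this sketch.

```python
import numpy as np

criteria = ["proxy", "empirical", "theory", "method", "validation"]
# Hypothetical scores: rows = experts, columns = criteria, on the 0-4 scale.
scores = np.array([
    [3, 2, 2, 2, 1],
    [2, 2, 3, 2, 1],
    [3, 3, 2, 2, 2],
    [2, 1, 2, 1, 0],
    [3, 2, 2, 2, 1],
    [0, 2, 2, 3, 1],   # one expert with an outlying proxy score
])

def is_outlier(col):
    # simple illustrative rule: more than 1.5 points from the median
    return np.abs(col - np.median(col)) > 1.5

for j, name in enumerate(criteria):
    col = scores[:, j]
    kept = col[~is_outlier(col)]
    print(f"{name:10s} green(min)={col.min()}  amber(max)={col.max()}  "
          f"light-green(min, no outliers)={kept.min()}")
```

The three printed values per criterion are exactly the three kite outlines: the amber kite bounds what any expert would grant, the green kite what all experts grant, and the light-green kite the same floor once outlying judgements are set aside.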
Average scores (0-4):
• proxy: 2½ ± ½
• empirical basis: 2 ± ½
• theoretical understanding: 2 ± ½
• methodological rigour: 2 ± ½
• validation: 1 ± ½
• value-ladenness: 2½ ± 1
• competence: 2 ± ½
Conclusions (1)
• The model quality assurance checklist provides a quick scan to flag major areas of concern and associated pitfalls in the complex mass of uncertainties.
• Meta-level intercomparison of TIMER with the other SRES models gave us some insight into the potential roles of model structure uncertainties.
Conclusions (2)
• Global sensitivity analysis supplemented with expert elicitation constitutes an efficient selection mechanism to further focus the diagnosis of key uncertainties.
• Our pedigree elicitation procedure yields a differentiated insight into parameter strength.
Conclusions (3)
• The diagnostic diagram puts spread and strength together to provide guidance in the prioritisation of key uncertainties.
Conclusions (4)
NUSAP method:
• can be applied to complex models in a
meaningful way
• helps to focus research efforts on the
potentially most problematic model
components
• pinpoints specific weaknesses in these
components
More information:
www.nusap.net