Uncertainty assessment of the IMAGE/TIMER B1 CO2 emissions scenario using the NUSAP method

Jeroen P. van der Sluijs (1), Jose Potting (1), James Risbey (1), Detlef van Vuuren (2), Bert de Vries (2), Arthur Beusen (2), Peter Heuberger (2), Serafin Corral Quintana (3), Silvio Funtowicz (3), Penny Kloprogge (1), David Nuijten (1), Arthur Petersen (5), Jerry Ravetz (4)
(1) UU-STS, (2) RIVM, (3) EC-JRC, (4) RMC, (5) VUA
[email protected]

Objective
• Develop a framework for uncertainty assessment and management addressing quantitative & qualitative dimensions
• Test & demonstrate its usefulness in IA models

IMAGE 2: Framework of models and linkages
[Diagram: Population (Popher) and Economy (WorldScan) provide scenario assumptions (changes in GDP, population & others), which drive land demand, use & cover and energy demand & supply (TIMER). Land-use emissions and energy & industry emissions feed, via the carbon cycle and atmospheric chemistry, into concentration changes and climate (Zonal Climate Model or ECBilt). Climatic changes drive impacts on natural systems, agriculture, water, land degradation and sea level rise, with feedbacks to emissions & land-use changes.]

TIMER model: five submodels
• Energy Demand (ED), Electric Power Generation (EPG), Solid Fuel supply (SF), Liquid Fuel supply (LF) and Gaseous Fuel supply (GF), linked through prices
• Driven by Population (POPHER) and Economy (WorldScan) via electricity demand and fuel demand
• Inputs: population, GDP per capita, activity in energy sectors, assumptions regarding technological development, depletion and others
• Outputs: end-use energy consumption, primary energy consumption

Main objectives TIMER
• To analyse the long-term dynamics of the energy system, in particular changes in energy demand and the transition to non-fossil fuels, within an integrated modelling framework
• To construct and simulate greenhouse gas emission scenarios that are used in other submodels of IMAGE 2.2 or in meta-models of IMAGE

Key questions
• What are the key uncertainties in TIMER?
• What is the role of model structure uncertainties in TIMER?
• Uncertainty in which input variables and parameters dominates uncertainty in model outcome?
• What is the strength of the sensitive parameters (pedigree)?

Location of uncertainty
• Input data
• Parameters
• Technical model structure
• Conceptual model structure / assumptions
• Indicators
• Problem framing
• System boundary
• Socio-political and institutional context

Sorts of uncertainty
• Inexactness
• Unreliability
• Value loading
• Ignorance

Inexactness
• Variability / heterogeneity
• Lack of knowledge
• Definitional vagueness
• Resolution error
• Aggregation error

Unreliability
• Limited internal strength in:
– Use of proxies
– Empirical basis
– Theoretical understanding
– Methodological rigour (incl. management of anomalies)
– Validation
• Limited external strength in:
– Exploration of rival problem framings
– Management of dissent
– Extended peer acceptance / stakeholder involvement
– Transparency
– Accessibility
• Future scope
• Linguistic imprecision

Value loading
• Bias
– In knowledge production: motivational bias (interests, incentives), disciplinary bias, cultural bias, choice of (modelling) approach (e.g. bottom-up, top-down), subjective judgement
– In knowledge utilisation: strategic/selective knowledge use
• Disagreement
– about knowledge
– about values

Ignorance
• System indeterminacy
– open-endedness
– chaotic behaviour
• Active ignorance
– model fixes for reasons understood
– limited domains of applicability of functional relations
– surprise A
• Passive ignorance
– bugs (numerical / software / hardware error)
– model fixes for reasons not understood
– surprise B
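To make the typology above concrete, here is a minimal, purely illustrative Python sketch (not part of the NUSAP toolkit or the TIMER study) that encodes the four sorts of uncertainty so that entries of an uncertainty inventory can be tagged by location and sort; every inventory entry below is an invented example.

```python
from enum import Enum

# Purely illustrative encoding: the four sorts of uncertainty from the
# slides above, used to tag hypothetical entries of an uncertainty
# inventory by location and sort.
class Sort(Enum):
    INEXACTNESS = "inexactness"      # e.g. resolution or aggregation error
    UNRELIABILITY = "unreliability"  # e.g. weak empirical basis
    VALUE_LOADING = "value loading"  # e.g. motivational or disciplinary bias
    IGNORANCE = "ignorance"          # e.g. bugs, surprises, indeterminacy

inventory = [  # (location, sort, note) -- all entries invented examples
    ("input data", Sort.INEXACTNESS, "aggregation error in regional data"),
    ("parameters", Sort.UNRELIABILITY, "proxy used for a cost parameter"),
    ("problem framing", Sort.VALUE_LOADING, "choice of modelling approach"),
    ("model structure", Sort.IGNORANCE, "model fix for reasons not understood"),
]

for location, sort, note in inventory:
    print(f"{location:<16} {sort.value:<14} {note}")
```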
Method
• Checklist for model quality assistance
• Meta-level analysis of SRES scenarios to explore model structure uncertainties
• Global sensitivity analysis (Morris)
• NUSAP expert elicitation workshop to assess the pedigree of sensitive model components
• Diagnostic diagram to prioritise uncertainties by combining criticality (Morris) and strength (pedigree)

Checklist
• Assists in quality control in complex models
• Not whether models are good or bad, but 'better' and 'worse' forms of modelling practice
• Quality relates to fitness for function
• Helps guard against poor practice
• Flags pitfalls

Checklist structure
• Screening questions
• Model & problem domain
• Internal strength
• Interface with users
• Use in policy process
• Overall assessment

Global CO2 emissions from fossil fuels
[Figure: Global CO2 emissions from fossil fuels (GtC), 1990-2100, in the B1-marker SRES scenarios reported to IPCC (2000) by six different modelling groups: Maria, Message, Aim, Minicam, ASF and Image (Van Vuuren et al. 2000).]

Morris (1991)
• Facilitates global sensitivity analysis in a minimum number of model runs
• Covers the entire range of possible values for each variable
• Parameters are varied one step at a time, in such a way that if the sensitivity of one parameter is contingent on the values other parameters may take, Morris captures such dependencies
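The Morris screening idea is straightforward to sketch. The following minimal Python implementation is our own illustration, not the code used in the TIMER study: it builds random one-at-a-time trajectories on a level grid and estimates the usual Morris statistics mu* (mean absolute elementary effect, overall importance) and sigma (spread of effects, a signal of non-linearity or interactions). The toy model standing in for a TIMER run is invented for demonstration; in practice a library such as SALib offers ready-made Morris sampling and analysis.

```python
import numpy as np

def morris_trajectory(k, levels=4, rng=None):
    """One Morris one-at-a-time trajectory in the unit hypercube:
    k+1 points in which each of the k factors moves exactly once by
    a fixed step delta (here always upward, for simplicity)."""
    rng = rng if rng is not None else np.random.default_rng()
    delta = levels / (2 * (levels - 1))            # standard Morris step
    grid = np.arange(0, 1 - delta + 1e-9, 1 / (levels - 1))
    x = rng.choice(grid, size=k)                   # random base point
    order = rng.permutation(k)                     # order in which factors move
    points = [x.copy()]
    for i in order:
        x = x.copy()
        x[i] += delta
        points.append(x)
    return np.array(points), order, delta

def morris_screening(model, k, r=10, levels=4, seed=0):
    """Estimate mu* (mean |elementary effect|) and sigma (std of effects)
    from r random trajectories, i.e. r * (k + 1) model runs in total."""
    rng = np.random.default_rng(seed)
    ee = np.empty((r, k))
    for t in range(r):
        points, order, delta = morris_trajectory(k, levels, rng)
        y = np.array([model(p) for p in points])
        for step, i in enumerate(order):
            ee[t, i] = (y[step + 1] - y[step]) / delta
    return np.abs(ee).mean(axis=0), ee.std(axis=0)

# Toy stand-in for a model run (invented): y = x0 + 2*x1 + x2*x3, x4 inert.
toy = lambda x: x[0] + 2 * x[1] + x[2] * x[3]
mu_star, sigma = morris_screening(toy, k=5)
print("mu*  :", mu_star.round(2))   # x1 largest, x4 ~ 0
print("sigma:", sigma.round(2))     # nonzero for the interacting x2, x3
```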
Most sensitive model components
• Population levels and economic activity
• Intra-sectoral structural change
• Progress ratios for technological improvements
• Size and cost supply curves of fossil fuel resources
• Autonomous and price-induced energy efficiency improvement
• Initial costs and depletion of renewables

NUSAP: Pedigree
Evaluates the strength of a number by looking at:
• The background history by which the number was produced
• The underpinning and scientific status of the number

Parameter pedigree criteria
• Proxy
• Empirical basis
• Theoretical understanding
• Methodological rigour
• Validation

Proxy
Sometimes it is not possible to obtain direct measurements or estimates of a parameter, so some form of proxy measure is used. Proxy refers to how good or close the measure of the quantity we model is to the actual quantity we represent. An exact measure of the quantity would score four; if the measured quantity is not clearly related to the desired quantity, the score would be zero.

Empirical basis
Empirical quality typically refers to the degree to which direct observations are used to estimate the parameter. When the parameter is based on good-quality observational data, the pedigree score will be high. Sometimes directly observed data are not available and the parameter is estimated from partial measurements or calculated from other quantities. Parameters determined by such indirect methods have a weaker empirical basis and will generally score lower than those based on direct observations.

Theoretical understanding
The parameter will have some basis in theoretical understanding of the phenomenon it represents. This criterion refers to the extent and partiality of that theoretical understanding. Parameters based on well-established theory will score high on this metric, while parameters whose theoretical basis has the status of crude speculation will score low.

Methodological rigour
Some method will be used to collect, check, and revise the data used for making parameter estimates. Methodological quality refers to the norms for methodological rigour in this process applied by peers in the relevant disciplines. Well-established and respected methods for measuring and processing the data would score high on this metric, while untested or unreliable methods would tend to score lower.

Validation
This metric refers to the degree to which one has been able to cross-check the data against independent sources. When the parameter has been compared with appropriate sets of independent data to assess its reliability, it will score high on this metric. In many cases, independent data for the same parameter over the same time period are not available and other datasets must be used for validation. This may require a compromise in the length or overlap of the datasets, or may require use of a related but different proxy variable, or perhaps use of data that have been aggregated on different scales. The more indirect or incomplete the validation, the lower it will score on this metric.

Pedigree matrix (scores 0-4 per criterion)
• 4: Proxy = exact measure; Empirical basis = large sample, direct measurements; Theory = well-established theory; Method = best available practice; Validation = compared with independent measurements of the same variable
• 3: Proxy = good fit or measure; Empirical basis = small sample, direct measurements; Theory = accepted theory, partial in nature; Method = reliable method, commonly accepted; Validation = compared with independent measurements of a closely related variable
• 2: Proxy = well correlated; Empirical basis = modelled / derived data; Theory = partial theory, limited consensus on reliability; Method = acceptable method, limited consensus on reliability; Validation = compared with measurements that are not independent
• 1: Proxy = weak correlation; Empirical basis = educated guesses / rule-of-thumb estimates; Theory = preliminary theory; Method = preliminary methods, unknown reliability; Validation = weak / indirect validation
• 0: Proxy = not clearly related; Empirical basis = crude speculation; Theory = crude speculation; Method = no discernible rigour; Validation = no validation

Elicitation workshop
• 18 experts (in 3 parallel groups of 6) discussed the parameters one by one, using information & scoring cards
• Individual expert judgements, informed by group discussion

Example result: gas depletion multiplier
[Radar diagram: each coloured line represents the scores given by one expert. The same data are also shown as a kite diagram: green = minimum scores, amber = maximum scores, light green = minimum scores with outliers omitted (traffic-light analogy).]

Average scores (0-4) for the gas depletion multiplier:
• proxy: 2½ ± ½
• empirical basis: 2 ± ½
• theoretical understanding: 2 ± ½
• methodological rigour: 2 ± ½
• validation: 1 ± ½
• value-ladenness: 2½ ± 1
• competence: 2 ± ½
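The arithmetic behind the radar/kite diagrams and the average scores is simple to illustrate. Below is a minimal sketch assuming hypothetical expert scores (the numbers are invented, not the workshop data): it computes group means per criterion, the min/max contours of the kite diagram, and one simple overall strength index (the mean score normalised to [0, 1], which is our illustrative choice, not a prescribed NUSAP formula).

```python
import numpy as np

# Hypothetical scores (0-4), invented for illustration: rows = 6 experts in
# one group, columns = the five pedigree criteria for one parameter.
criteria = ["proxy", "empirical", "theory", "method", "validation"]
scores = np.array([
    [3, 2, 2, 2, 1],
    [2, 2, 3, 2, 1],
    [3, 2, 2, 1, 0],
    [2, 3, 2, 2, 1],
    [3, 1, 2, 2, 2],
    [2, 2, 2, 3, 1],
])

mean = scores.mean(axis=0)     # group average per criterion
kite_min = scores.min(axis=0)  # inner (green) contour of the kite diagram
kite_max = scores.max(axis=0)  # outer (amber) contour of the kite diagram
strength = mean.mean() / 4     # one illustrative overall index on [0, 1]

for name, m, lo, hi in zip(criteria, mean, kite_min, kite_max):
    print(f"{name:<10} mean = {m:.1f}   expert range = {lo}-{hi}")
print(f"overall pedigree strength ~ {strength:.2f}")
```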
Conclusions (1)
• The model quality assurance checklist provides a quick scan to flag major areas of concern and associated pitfalls in the complex mass of uncertainties.
• Meta-level intercomparison of TIMER with the other SRES models gave us some insight into the potential roles of model structure uncertainties.

Conclusions (2)
• Global sensitivity analysis supplemented with expert elicitation constitutes an efficient selection mechanism to further focus the diagnosis of key uncertainties.
• Our pedigree elicitation procedure yields a differentiated insight into parameter strength.

Conclusions (3)
• The diagnostic diagram puts spread and strength together to provide guidance in the prioritisation of key uncertainties (see the sketch at the end).

Conclusions (4)
NUSAP method:
• can be applied to complex models in a meaningful way
• helps to focus research efforts on the potentially most problematic model components
• pinpoints specific weaknesses in these components

More information: www.nusap.net
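As a closing illustration of the diagnostic diagram referred to in Conclusions (3), here is a sketch that combines spread (Morris mu*) with pedigree strength into a single priority ranking. The component names, numbers and the scalar priority index are all illustrative assumptions, not results from the study; the actual diagnostic diagram plots spread against strength rather than collapsing them into one number.

```python
import numpy as np

# Illustrative inputs (invented numbers, not study results): Morris mu* as
# the spread/criticality axis and pedigree strength on [0, 1] as the
# strength axis, for a few TIMER-like model components.
components = ["progress ratios", "structural change", "fossil supply curves",
              "energy efficiency improvement", "renewables costs"]
mu_star  = np.array([2.1, 1.7, 1.4, 1.1, 0.9])
strength = np.array([0.45, 0.60, 0.35, 0.55, 0.40])

# The diagnostic diagram places components with high spread and low strength
# in the highest-priority corner. As a one-number proxy for that corner we
# rank by normalised spread times weakness (our own illustrative choice).
priority = (mu_star / mu_star.max()) * (1 - strength)

for rank, i in enumerate(np.argsort(priority)[::-1], start=1):
    print(f"{rank}. {components[i]:<30} spread={mu_star[i]:.1f}  "
          f"strength={strength[i]:.2f}  priority={priority[i]:.2f}")
```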