Preserving Cloud Information
Bruce R. Barkstrom & John J. Bates
NCDC

Outline
► Fundamental Preservation Commandments
► Questions
    Variability Quantification
    Error Analysis and Physics
► Costs
► What Can We Do Now?

Four Commandments for Preserving Information
1. Thou shalt not be forced to preserve information before it is ready
2. Thou shalt not lose information – if possible
3. Thou shalt not cost more than necessary
4. Thou must make data accessible and valuable
    To current users
    To future users

When is Data Ready for Preservation?
► When we have a good model of the underlying “natural variability” and “expected climate change” of the fields being measured
    Not just mean and standard deviation – current applications need a description of extreme events
    Need regional time variations
► When we have a physical basis for estimating errors and their impact on climate change detectability
    Need more than just measurement statistics
    Must include probability distribution of possible biases

Quantification of Field Variability
► The “variability Turing test”: Can you generate an ensemble of computer-generated fields with statistics that are indistinguishable from those of the real field?
► The “climate Turing test”: Can you generate a model of “trends” whose statistics are indistinguishable from those of the expected climate changes?

Current State
► Measurement “Requirements” for Climate are usually stated as global values of means and standard deviations
► Corresponding statistics can be generated by appropriate white noise
► Is this adequate?
    Probably not – cloud variations are more complex than a global mean and simple latitudinal variations
► Can we come up with a common basis for stating variability across Earth science?
    Regional?
    Regional with moving systems?
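
The point about white noise can be illustrated with a minimal sketch. It is not taken from the talk: the "real" field here is a hypothetical stand-in (a synthetic AR(1) series with an assumed persistence of 0.9), and the function names are made up for illustration. White noise is tuned to the same mean and standard deviation, and lag-1 autocorrelation is then used as one simple extra statistic that a "variability Turing test" might check.

```python
import random
import statistics


def ar1_series(n, phi, sigma, seed):
    """Hypothetical stand-in for a persistent field: x[t] = phi * x[t-1] + noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def white_noise_like(series, seed):
    """White noise matched to the sample mean and standard deviation of `series`."""
    rng = random.Random(seed)
    mu = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [rng.gauss(mu, sd) for _ in series]


def lag1_autocorr(series):
    """Lag-1 autocorrelation, a simple measure of persistence in time."""
    mu = statistics.fmean(series)
    num = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:]))
    den = sum((a - mu) ** 2 for a in series)
    return num / den


def summarize(series):
    """Collect the statistics compared between the two series."""
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "lag1": lag1_autocorr(series),
    }


# Assumed parameters, chosen only for illustration.
real = ar1_series(n=5000, phi=0.9, sigma=1.0, seed=1)
fake = white_noise_like(real, seed=2)

real_stats = summarize(real)
fake_stats = summarize(fake)
print("stand-in field:", real_stats)
print("white noise:   ", fake_stats)
```

In this sketch the two series agree closely in mean and standard deviation, yet the white noise shows almost no lag-1 autocorrelation while the stand-in field is strongly persistent. A requirement stated only as a mean and standard deviation would not tell them apart.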

No Preservation Without Understandable Error Assessments
► Error assessments for climate data records are difficult
    Need physical basis for estimating uncertainties, not just internally consistent measurement statistics
    Error assessments must be tied to algorithm code – data editing is as important as coefficients or outlines of algorithms
    Errors are not believable if entire data production process is not publicly understandable

Current State
► Algorithm Theoretical Basis Documents do not necessarily represent the “as-built” algorithms with their data editing
► EOS data production systems are “overwhelmingly complex”
    May need new documentation tools to provide understanding – 100,000 lines of code is not readable in a Sunday afternoon
► As Science Teams disperse, community knowledge will be lost unless we take steps to prevent it
    May need to develop “data scholars”

Action Items
1. Can this workshop produce an understandable, quantitative description of cloud variability – and of expected cloud property changes?
2. Is it possible to develop a community-accepted standard checklist of errors for cloud properties?

Sample Error Checklist
► Are the “as-built” instrument drawings available?
► Is the ground calibration data available?
► Is there a computational math model of the instrument that includes all of the physics of the measurement?
► How was the gain determined?
► How was the spectral response determined?
► How was the Point Spread Function measured?
► …

Models for Preservation Funding
► The Cemetery Model: Pay when the body is deposited; live off the interest
► The Advanced Cemetery Model: Pay for the previous bodies, as well as the one you’re depositing; make sure to add new bodies (the Cemetery as Pyramid)
► The Cemetery as Theme Park: Make the cemetery interesting to visit; charge admission
► The Public Broadcasting Approach: Beg for support annually – and ask for volunteers

Actions That Can Reduce Preservation Costs and Risk
► Arrange a “Submission Agreement” (data will) with your designated archive
► Gather required original documents and make sure your archive can accept them
    Drawings
    Calibration plans and procedures
    Science Team minutes
    Source Code
► Arrange peer review of documentation

Summary
► Our data will not survive without careful thought to ensure
    Physical insight into the measured variables and the measurement process
    Adequate public access to the measurement process
    Cost-effective archival
► Archives know less than you do about your data; if you don’t act to preserve that information, archives can’t preserve it!