NR 422 Quality Control
Jim Graham
Spring 2009

Staircase of Knowledge
[Figure: staircase rising from Observation and Measurement through Data, Information, and Knowledge to Wisdom. Other labels: Verification; Selection, Testing; Organization, Interpretation; Comprehension, Integration; Understanding; Judgment. Axes: Human value added; Increasing subjectivity.]
Source: Environmental Monitoring and Characterization, Artiola, Pepper, and Brusseau

Error
• Data does not match reality (ever)
• Gross errors
• Accuracy (bias): distance from truth
  – |Measurement mean - Truth|
• Precision: variance within the data
  – Standard Deviation (stddev)
• Measurement Limits

Accuracy and Precision
[Figure: two panels, "High Accuracy, Low Precision" and "Low Accuracy, High Precision"]
Source: http://en.wikipedia.org/wiki/Accuracy_and_precision

Bias (Accuracy)
• Bias = Distance from truth
[Figure labels: Bias, Truth, Mean]

Standard Deviation (Precision)
[Figure: each band represents one standard deviation. Source: Wikipedia]

Other Approaches
• Confidence Intervals
• ± some range (suspect)

Sources of Error
• Measurement Error
  – Protocol
  – User
  – Instrument
• Processing Errors
  – Procedure
  – User
  – Instrument
• Data Errors
  – Age
  – Metadata/Documentation

Protocol
• Rule #1: Have one!
• Step-by-step instructions on how to collect the data
  – Calibration
  – Equipment required
  – Training required
  – Steps
  – QAQC
• See GLOBE Protocols:
  – http://www.globe.gov/sda/tg00/aerosol.pdf

Protocol Error
• Is there a protocol?
• What is being measured?
• Is it complete: How large? How small?
• Unexpected circumstances (illness, weather, accidents, equipment failures, changing ecosystems)

User Measurement Errors
• Wrong Datum
• Data in wrong field/attribute
• Missing data
• Gross errors
• Precision and Accuracy
• Observer error: expertise and “drift”

Instrument Errors
• Calibration
• Drift
• Humans as instruments:
  – DBH
  – Weight
  – Humans are almost always involved!
  – Fortunately we can be calibrated and have our drift measured

Calibration
• Sample a portion of the study area repeatedly and/or with higher precision
  – GPS: benchmarks, higher resolution
  – Measurements: lasers, known distances
  – Identifications: experts, known samples
• Use bias and stddev throughout study
• Also provides an estimate for min/max

Flow of Error
• Capture error during data collection
• Determine error of other datasets
  – If unavailable, estimate the error
• Maintain error throughout processing
  – Error will increase
• Document final error in reports and metadata

Processing Error
• Error changes with processing
• The change depends on the operation and the type of error:
  – Min/Max
  – Average Error
  – Standard Error of the Mean
  – Standard Deviation
  – Confidence Intervals

Combining Bias
• Add/Subtract (T = truth):
  – Bias(Bias1+Bias2) = T - (Mean1*Num1 + Mean2*Num2)/(Num1 + Num2)
  – Simplified: (|Bias1| + |Bias2|)/2
• Multiply/Divide:
  – Bias(Bias1*Bias2) = T - (Mean1*Mean2)
  – Simplified: |Bias1|*|Bias2|
Derived by Jim Graham

Combining Standard Deviation
• Add/Subtract:
  – StdDev = sqrt(StdDev1^2 + StdDev2^2)
• Multiply/Divide (relative standard deviation of the result):
  – StdDev/|Mean| = sqrt((StdDev1/Mean1)^2 + (StdDev2/Mean2)^2)
Source: http://www.rit.edu/cos/uphysics/uncertainties/Uncertaintiespart2.html

Exact Numbers
• Adding/Subtracting:
  – Error does not change
• Multiplying:
  – Multiply the error by the same number

Significant Digits (Figures)
• How many significant digits are in:
  – 12
  – 12.00
  – 12.001
  – 12000
  – 0.0001
  – 0.00012
  – 123456789
• Only applies to measured values, not exact values (e.g., 2 oranges)

Significant Digits
• Cannot create precision:
  – 1.0 * 2.0 = 2.0
  – 12 * 11 = 130 (not 132)
  – 12.0 * 11 = 130 (still not 132)
  – 12.0 * 11.0 = 132
• Can keep digits for calculations, report with appropriate significant digits

Rounding
• If you have 2 significant digits:
  – 1.11 -> ?
  – 1.19 -> ?
  – 1.14 -> ?
  – 1.16 -> ?
  – 1.15 -> ?
  – 1.99 -> ?
  – 1.155 -> ?
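
The Calibration and Combining Standard Deviation slides can be illustrated with a short Python sketch. It is a minimal example rather than a definitive method: the function names and the sample readings are hypothetical, the two inputs are assumed to have independent errors, and the multiply/divide expression is read as a relative standard deviation that is scaled back by the size of the product.

```python
import math
import statistics


def calibration_stats(measurements, truth):
    """Return (bias, stddev) for repeated measurements of a known value."""
    mean = statistics.mean(measurements)
    bias = abs(mean - truth)                 # accuracy: |measurement mean - truth|
    stddev = statistics.stdev(measurements)  # precision: sample standard deviation
    return bias, stddev


def stddev_add(sd1, sd2):
    """Standard deviation of a sum or difference of two independent values."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2)


def stddev_multiply(mean1, sd1, mean2, sd2):
    """Standard deviation of a product of two independent values.

    The relative standard deviations combine in quadrature; the result is
    converted back to an absolute value using the magnitude of the product.
    """
    relative = math.sqrt((sd1 / mean1) ** 2 + (sd2 / mean2) ** 2)
    return abs(mean1 * mean2) * relative


# Hypothetical calibration: five tape readings of a known 10.00 m distance.
readings = [10.1, 10.3, 9.9, 10.2, 10.0]
bias, sd = calibration_stats(readings, truth=10.00)

# Hypothetical plot sides in metres: 20.0 (stddev 0.3) and 10.0 (stddev 0.4).
sum_sd = stddev_add(0.3, 0.4)                     # error of 20.0 + 10.0
area_sd = stddev_multiply(20.0, 0.3, 10.0, 0.4)   # error of 20.0 * 10.0
```

The bias and standard deviation measured during calibration are the values that would then be carried through every later processing step and reported in the metadata.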

Quality Control/Assurance
• Calibrate “Instruments”
• Perform random checks on data
• Watch for “drift”
• Document all errors in Metadata!

Design of Sampling
• Random
• Stratified random
• Clustered
• Systematic
• Iterative

Number of Samples
• 30?
• Figure 2.7 from Environmental Monitoring and Characterization

Statistical Studies
• Is the sampling really random or uniform?
  – Bias
  – “Most data is collected near a road, a porta-potty, and a restaurant!” (Tom Stohlgren)

[Figure: Plots in RMNP]
[Figure: Plots in GSENM]

Spatial Autocorrelation
• Used to determine type of sampling

Rounding
• If you have 2 significant digits:
  – 1.11 -> 1.1
  – 1.19 -> 1.2
  – 1.14 -> 1.1
  – 1.16 -> 1.2
  – 1.15 -> 1.2
  – 1.99 -> 2.0
  – 1.155 -> 1.2
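
Rounding to a fixed number of significant digits, as in the Rounding slide, can be sketched in Python. This is a minimal illustration under stated assumptions: the helper name `round_sig` is hypothetical, values are passed as decimal strings so that they are not distorted by binary floating point, and ties are resolved with round-half-up, which is only one of several conventions.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_sig(value, digits, rounding=ROUND_HALF_UP):
    """Round a decimal string (or Decimal) to `digits` significant digits."""
    d = Decimal(value)
    if d.is_zero():
        return d
    # adjusted() is the exponent of the most significant digit,
    # e.g. 1 for 12.0 and -4 for 0.00012.
    exponent = d.adjusted() - digits + 1
    return d.quantize(Decimal(1).scaleb(exponent), rounding=rounding)


# Measured values keep their digits during the calculation...
product = Decimal("12") * Decimal("11")   # exactly 132

# ...and only the reported result is cut back to 2 significant digits.
reported = round_sig(product, 2)          # equals 130

print(round_sig("1.19", 2))  # 1.2
print(round_sig("1.99", 2))  # 2.0
```

Tie cases such as a trailing 5 depend on the rounding mode passed in, so the convention used should be documented along with the data.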