NR 422
Quality Control
Jim Graham
Spring 2009
Staircase of Knowledge
[Figure: staircase of knowledge. Steps, top to bottom: Wisdom, Judgment, Knowledge, Information, Data. Process labels: Observation and Measurement; Comprehension; Integration; Organization; Interpretation; Selection; Testing; Verification. Other labels: Human value added; Understanding; Increasing Subjectivity.]
Environmental Monitoring and Characterization, Artiola, Pepper, and Brusseau
Error
• Data does not match reality (ever)
• Gross errors
• Accuracy (bias): distance from truth
– | Measurement mean – Truth |
• Precision: variance within the data
– Standard Deviation (stddev)
• Measurement Limits
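The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name and the readings are hypothetical, and it assumes the true value is known (for example, from a benchmark).

```python
import statistics

def bias_and_precision(measurements, truth):
    """Return (bias, stddev) for repeated measurements of a known value."""
    mean = statistics.mean(measurements)
    bias = abs(mean - truth)                  # accuracy: | measurement mean - truth |
    stddev = statistics.stdev(measurements)   # precision: sample standard deviation
    return bias, stddev

# Hypothetical data: five tape readings (m) of a baseline assumed to be 10.00 m.
readings = [10.2, 10.1, 10.3, 10.2, 10.2]
bias, stddev = bias_and_precision(readings, truth=10.00)
print(f"bias = {bias:.2f} m, stddev = {stddev:.2f} m")
```

The readings cluster tightly (small stddev) but sit about 0.2 m from the truth, which is the "low accuracy, high precision" case.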
Accuracy and Precision
[Figure: two panels labeled "High Accuracy, Low Precision" and "Low Accuracy, High Precision"]
http://en.wikipedia.org/wiki/Accuracy_and_precision
Bias (Accuracy)
• Bias = Distance from truth
[Figure labels: Bias, Truth, Mean]
Standard Deviation (Precision)
Each band represents one standard deviation
Source: Wikipedia
Other Approaches
• Confidence Intervals
• ± some range (suspect)
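A confidence interval for a mean can be sketched as below. The slides do not say how the interval is computed, so this assumes the normal approximation (mean ± z × standard error); for small samples a t-distribution would usually give a wider, more appropriate interval. The function name and data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)               # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return mean - z * sem, mean + z * sem

# Hypothetical readings (m); the interval describes the mean, not single readings.
low, high = mean_confidence_interval([10.2, 10.1, 10.3, 10.2, 10.2])
print(f"95% CI for the mean: {low:.3f} to {high:.3f} m")
```

Unlike a bare "± some range", the interval states both a range and the confidence level attached to it.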
Sources of Error
• Measurement Error
– Protocol
– User
– Instrument
• Processing Errors
– Procedure
– User
– Instrument
• Data Errors
– Age
– Metadata/Documentation
Protocol
• Rule #1: Have one!
• Step-by-step instructions on how to collect the data
– Calibration
– Equipment required
– Training required
– Steps
– QAQC
• See Globe Protocols:
– http://www.globe.gov/sda/tg00/aerosol.pdf
Protocol Error
• Is there a protocol?
• What is being measured?
• Is it complete: How large? How small?
• Unexpected circumstances (illness, weather, accidents, equipment failures, changing ecosystems)
User Measurement Errors
• Wrong Datum
• Data in wrong field/attribute
• Missing data
• Gross errors
• Precision and Accuracy
• Observer error: expertise and “drift”
Instrument Errors
• Calibration
• Drift
• Humans as instruments:
– DBH
– Weight
– Humans are almost always involved!
– Fortunately we can be calibrated and have our drift measured
Calibration
• Sample a portion of the study area repeatedly and/or with higher precision
– GPS: benchmarks, higher resolution
– Measurements: lasers, known distances
– Identifications: experts, known samples
• Use bias and stddev throughout study
• Also provides an estimate for min/max
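The calibration step can be sketched as a comparison of paired readings: the field instrument against a higher-precision reference. This is an illustration under stated assumptions: the names and data are hypothetical, the bias is kept signed (positive means the instrument reads high), and subtracting the bias from later readings is one possible way to "use" it, not something the slides prescribe.

```python
import statistics

def calibrate(measured, reference):
    """Summarize instrument error from paired measured/reference values."""
    errors = [m - r for m, r in zip(measured, reference)]
    return {
        "bias": statistics.mean(errors),     # signed: positive means reading high
        "stddev": statistics.stdev(errors),  # spread of the errors
        "min_error": min(errors),            # estimate of the smallest error
        "max_error": max(errors),            # estimate of the largest error
    }

# Hypothetical check: tape readings compared with laser-measured distances (m).
tape = [5.1, 10.2, 15.1, 20.3]
laser = [5.0, 10.0, 15.0, 20.0]
cal = calibrate(tape, laser)

# Assumption: apply the bias as a correction to readings from the same instrument.
corrected = [reading - cal["bias"] for reading in tape]
print(cal)
```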
Flow of error
• Capture error during data collection
• Determine error of other datasets
– If unavailable, estimate the error
• Maintain error throughout processing
– Error will increase
• Document final error in reports and metadata
Processing Error
• Error changes with processing
• The change depends on the operation and the type of error:
– Min/Max
– Average Error
– Standard Error of the Mean
– Standard Deviation
– Confidence Intervals
Combining Bias
• Add/Subtract:
– Bias (Bias1+Bias2)=
• T - (Mean1*Num1+Mean2*Num2)/(Num1+Num2), where T is the truth
• Simplified: (|Bias1|+|Bias2|)/2
• Multiply/Divide:
– Bias (Bias1*Bias2)=
• T - (Mean1*Mean2)
• Simplified: |Bias1|*|Bias2|
Derived by Jim Graham
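The bias formulas above can be written out as small functions. This sketch assumes the pooled mean in the add/subtract case is the sample-size-weighted average, that is, divided by Num1 + Num2, which is the form that agrees with the simplified expression when both datasets have the same number of samples. Function names and numbers are hypothetical.

```python
def combined_bias_add(truth, mean1, num1, mean2, num2):
    """Truth minus the sample-size-weighted mean of two datasets."""
    pooled_mean = (mean1 * num1 + mean2 * num2) / (num1 + num2)
    return truth - pooled_mean

def simplified_bias_add(bias1, bias2):
    """Simplified form: average of the absolute biases."""
    return (abs(bias1) + abs(bias2)) / 2

def combined_bias_multiply(truth_product, mean1, mean2):
    """True product minus the product of the two means."""
    return truth_product - mean1 * mean2

def simplified_bias_multiply(bias1, bias2):
    """Simplified form: product of the absolute biases."""
    return abs(bias1) * abs(bias2)

# Hypothetical: truth is 10.0; two datasets of 5 readings with means 10.2 and 10.4.
full = combined_bias_add(10.0, 10.2, 5, 10.4, 5)
simple = simplified_bias_add(10.2 - 10.0, 10.4 - 10.0)
print(full, simple)
```

With equal sample sizes and biases of the same sign, the magnitude of the full form matches the simplified form.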
Combining Standard Deviation
• Add/Subtract:
– StdDev=sqrt(StdDev1^2+StdDev2^2)
• Multiply/Divide:
– StdDev/Mean= (relative standard deviation of the result)
• sqrt((StdDev1/Mean1)^2+(StdDev2/Mean2)^2)
http://www.rit.edu/cos/uphysics/uncertainties/Uncertaintiespart2.html
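A short worked example of both rules, assuming the two errors are independent. For multiply/divide, the square-root expression is treated here as the relative standard deviation (StdDev divided by the mean of the result), so it is scaled by the result to get an absolute value. Function names and the plot dimensions are hypothetical.

```python
import math

def stddev_add(sd1, sd2):
    """Standard deviation of a sum or difference of two independent values."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2)

def relative_stddev_multiply(sd1, mean1, sd2, mean2):
    """Relative standard deviation of a product or quotient."""
    return math.sqrt((sd1 / mean1) ** 2 + (sd2 / mean2) ** 2)

# Hypothetical plot: length 20.0 m (stddev 0.3), width 10.0 m (stddev 0.4).
length, length_sd = 20.0, 0.3
width, width_sd = 10.0, 0.4

sum_sd = stddev_add(length_sd, width_sd)   # error of length + width
area = length * width
area_sd = relative_stddev_multiply(length_sd, length, width_sd, width) * area

print(f"length + width = {length + width:.1f} m, stddev {sum_sd:.2f} m")
print(f"area = {area:.0f} m^2, stddev {area_sd:.1f} m^2")
```

In both cases the combined error is larger than either input error, which is why error increases through processing.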
Exact numbers
• Adding/Subtracting:
– Error does not change
• Multiplying:
– Multiply the error by the same number
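The exact-number rules can be illustrated with a unit conversion, since a conversion factor such as 0.3048 m per foot is exact by definition. The function names and the measurement are hypothetical.

```python
def add_exact(value, error, constant):
    """Adding an exact number shifts the value; the error does not change."""
    return value + constant, error

def multiply_exact(value, error, constant):
    """Multiplying by an exact number scales the error by the same number."""
    return value * constant, abs(constant) * error

# Hypothetical measurement: 12.0 ft with an error of 0.5 ft, converted to meters.
meters, meters_error = multiply_exact(12.0, 0.5, 0.3048)
print(f"{meters:.2f} m, error {meters_error:.2f} m")
```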
Significant Digits (Figures)
• How many significant digits are in:
– 12
– 12.00
– 12.001
– 12000
– 0.0001
– 0.00012
– 123456789
• Only applies to measured values, not exact values (e.g., 2 oranges)
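One way to check answers to the list above is to count digits in the written form of the number. This sketch follows a common convention and makes one loud assumption: trailing zeros without a decimal point (as in 12000) are treated as placeholders, although such numbers are genuinely ambiguous. The function name is hypothetical.

```python
def count_significant_digits(text):
    """Count significant digits in a plain decimal string such as '12.00'."""
    digits = text.strip().lstrip("+-")
    has_point = "." in digits
    # Leading zeros are never significant.
    digits = digits.replace(".", "").lstrip("0")
    if not has_point:
        # Assumption: trailing zeros with no decimal point are placeholders.
        digits = digits.rstrip("0")
    return len(digits)

for number in ["12", "12.00", "12.001", "12000", "0.0001", "0.00012", "123456789"]:
    print(number, count_significant_digits(number))
```

The input must be a string: converting to a float first would lose the trailing zeros that make 12.00 different from 12.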
Significant Digits
• Cannot create precision:
– 1.0 * 2.0 = 2.0
– 12 * 11 = 130 (not 132)
– 12.0 * 11 = 130 (still not 132)
– 12.0 * 11.0 = 132
• Can keep digits for calculations, report
with appropriate significant digits
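The "keep digits for calculations, report with appropriate significant digits" rule can be sketched as follows. The exact product 12 * 11 is 132; the reported value keeps only as many significant digits as the least precise input. The function names are hypothetical, and Python's built-in round is used, which works on binary floats and rounds exact ties to the even digit.

```python
import math

def round_to_significant(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

def multiply_measured(a, a_digits, b, b_digits):
    """Multiply at full precision, then report with the fewer significant digits."""
    return round_to_significant(a * b, min(a_digits, b_digits))

print(multiply_measured(12, 2, 11, 2))        # both inputs have 2 significant digits
print(multiply_measured(12.0, 3, 11, 2))      # limited by the 2-digit input
print(multiply_measured(12.0, 3, 11.0, 3))    # both inputs have 3 significant digits
```

The significant-digit counts are passed in explicitly because a float cannot tell 12 from 12.0.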
Rounding
• If you have 2 significant digits:
– 1.11 -> ?
– 1.19 -> ?
– 1.14 -> ?
– 1.16 -> ?
– 1.15 -> ?
– 1.99 -> ?
– 1.155 -> ?
Quality Control/Assurance
• Calibrate “Instruments”
• Perform random checks on data
• Watch for “drift”
• Document all errors in Metadata!
Design of Sampling
• Random
• Stratified random
• Clustered
• Systematic
• Iterative
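Three of the designs above can be sketched for a rectangular study area. This is an illustration only: the function names are hypothetical, and the strata are assumed to be equal grid cells, whereas real strata are often defined by vegetation type, elevation band, or similar. Clustered and iterative designs are not shown.

```python
import random

def simple_random(n, width, height, rng):
    """n points placed uniformly at random in the study area."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def systematic(nx, ny, width, height):
    """Regular grid: one point at the center of each of nx * ny cells."""
    dx, dy = width / nx, height / ny
    return [((i + 0.5) * dx, (j + 0.5) * dy) for i in range(nx) for j in range(ny)]

def stratified_random(nx, ny, width, height, rng):
    """One random point inside each of nx * ny equal cells (the assumed strata)."""
    dx, dy = width / nx, height / ny
    return [((i + rng.random()) * dx, (j + rng.random()) * dy)
            for i in range(nx) for j in range(ny)]

rng = random.Random(42)  # fixed seed so the layout can be reproduced
print(simple_random(3, 100.0, 50.0, rng))
print(systematic(2, 2, 100.0, 50.0))
print(stratified_random(2, 2, 100.0, 50.0, rng))
```

Recording the seed along with the design is one way to document how the sample locations were chosen.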
Number of Samples
• 30?
• Figure 2.7 from Environmental Monitoring
and Characterization
Statistical Studies
• Is the sampling really random or uniform?
– Bias
– “Most data is collected near a road, a porta-potty, and a restaurant!” – Tom Stohlgren
Plots in RMNP
Plots in GSENM
Spatial Autocorrelation
• Used to determine type of sampling
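The slide does not name a statistic, so as an assumption this sketch uses Moran's I, one common measure of spatial autocorrelation, on the simplest possible case: values measured at evenly spaced points along a transect, where only adjacent points count as neighbors. The function name and data are hypothetical.

```python
def morans_i_transect(values):
    """Moran's I for a 1-D transect with adjacent points as neighbors (weight 1)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Each adjacent pair is counted twice (i to j and j to i).
    cross = 2 * sum(dev[i] * dev[i + 1] for i in range(n - 1))
    total_weight = 2 * (n - 1)
    variance_sum = sum(d * d for d in dev)   # zero if all values are identical
    return (n / total_weight) * (cross / variance_sum)

print(morans_i_transect([1, 2, 3, 4, 5]))    # smooth gradient: positive
print(morans_i_transect([1, -1, 1, -1]))     # alternating values: negative
```

Values near +1 suggest that nearby samples are similar, values near -1 that neighbors differ, and values near 0 little spatial pattern; strongly autocorrelated data means closely spaced samples add less independent information.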
Rounding
• If you have 2 significant digits:
– 1.11 -> 1.1
– 1.19 -> 1.2
– 1.14 -> 1.1
– 1.16 -> 1.2
– 1.15 -> 1.2
– 1.99 -> 2.0
– 1.155 -> 1.2
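The rounding cases can be checked with Python's decimal module, which works on the written digits rather than on binary floats. This is a sketch under stated assumptions: the function name is hypothetical, it only handles values from 1 up to (but not including) 10, where two significant digits means one decimal place, and it compares two common tie-breaking rules (round half up and round half to even).

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_two_sig(text, mode=ROUND_HALF_UP):
    """Round a decimal string with 1 <= value < 10 to two significant digits."""
    return str(Decimal(text).quantize(Decimal("0.1"), rounding=mode))

for value in ["1.11", "1.19", "1.14", "1.16", "1.15", "1.99", "1.155"]:
    half_up = round_two_sig(value, ROUND_HALF_UP)
    half_even = round_two_sig(value, ROUND_HALF_EVEN)
    print(f"{value} -> {half_up} (half up), {half_even} (half even)")
```

Only 1.15 is an exact tie; the built-in round on floats can behave differently for such values because numbers like 1.15 are not stored exactly in binary.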