Agronomic
Spatial Variability
and Resolution
What is it?
How do we describe it?
What does it imply for
precision management?
Agronomic Variability
• Fundamental assumption of precision
farming
• Agronomic factors vary spatially within a
field
• If these factors can be measured then
crop yield and/or net economic returns
can be optimized
Agronomic Variables
• Soils
– Classification
– Texture
– Organic matter
– Water holding capacity
• Topography
– Slope
– Aspect
• Fertility
– pH
– Nitrogen
– Phosphorus
– Potassium
– Other nutrients
• Plant available water
• Crop Cultivar
Agronomic Variables
• Temperature
• Rainfall
• Weeds
– Species
– Population
• Insects
– Species
– Feeding patterns
• Tillage Practices
• Soil Compaction
• Diseases
– Macro and micro
environment
• Crop Stand
• Method and
Uniformity of
Application
– Fertilizers
– Crop protectants
What is variability?
• Variability - difference in the magnitude of
measurements of a variable
– Values can change randomly because of error
in the sensor
– Systematic error or bias
– Values can change because of changes in the
underlying factor
• As time changes (Temporal)
• As location changes (Spatial)
Why statistically describe
measurements?
• Raw data sets are too large to understand
or interpret
• Statistics provide a means of summarizing
data and can be readily interpreted for
making management decisions
• Statistics can define relationships among
variables
Statistical Analyses Commonly
Used In Precision Agriculture
• Descriptive Statistics
– Measures of Central Tendency
• Mean
• Median
– Measures of Dispersion
• Range
• Standard Deviation
• Coefficient of Variation
• Normal Distributions
• Regression
• Geostatistics - Semivariance Analysis
Measures of Central Tendency
• When a factor, such as crop yield, is
measured at different locations within a
field, values may vary greatly
• This variation can appear to be random
• The set of these measurements is a
population
• A value exists that is the central or usual
value of the population
Measures of Central Tendency
• This is important because dimensions
representing Biological Material are
generally reported as single “expected”
values.
Mean or Average Value
• Most common measure of central
tendency
• Definition:
For n measurements X1, X2, X3, …, Xn
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Mean or Average
• The mean or average value is useful if the
measured value is normally distributed
(Bell Curve)
– Most biological processes are normally
distributed
– Spatially distributed measurements are often
not normally distributed
• To calculate the mean in Excel
= Average (Col Row:Col Row)
Definition of (Col Row : Col Row)
• First "Col": column letter of the upper left cell
of an array of data
• First "Row": row number of the upper left cell of
an array of data
• Second "Col": column letter of the lower right cell
of an array of data
• Second "Row": row number of the lower right cell
of an array of data
• The “:” instructs Excel to include all
data between the two corner cells
The Median Value
• For skewed distributions, the median is the better
predictor of the expected or central value
• Calculated by ranking the values from
high to low
– For an odd number of measurements, the
median is the middle value
– For an even number of measurements, the
median is the average of the two middle values
• In Excel, the median is calculated using
the following formula:
= Median (Col Row : Col Row)
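The difference between the mean and the median on skewed data can be sketched in Python. The yield values below are hypothetical, chosen only so that one unusually high reading pulls the mean upward; `statistics.mean` and `statistics.median` play the role of Excel's Average and Median.

```python
import statistics

# Hypothetical yield readings from one field (illustrative values only).
# The single very high reading skews the distribution to the right.
yields = [38.0, 41.0, 42.0, 44.0, 45.0, 47.0, 95.0]

mean_yield = statistics.mean(yields)      # counterpart of Excel's Average
median_yield = statistics.median(yields)  # counterpart of Excel's Median

print(f"mean   = {mean_yield:.2f}")
print(f"median = {median_yield:.2f}")
```

With data like these the median stays near the bulk of the readings while the mean is dragged toward the outlier, which is why the median is preferred for skewed distributions.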
Normal vs. Skewed
Distribution
[Figure: normal and skewed distribution curves, with the mean and median labeled]
Normality
• Physical measurements of biological materials are
generally normally distributed about the mean.
There are several tests of normality which will be
discussed in your statistics courses. However,
three “quick and dirty” tests can be accessed
easily from Excel
• The first is simply comparing the mean and
median values. If the values are nearly the same
the measurement is likely distributed normally.
• Excel has function calls to calculate Skewness
and Kurtosis. These statistics can be used to test
for normality
Normality
• Kurtosis measures how peaked or flat a
distribution is relative to a normal
distribution. A value of ‘0’ indicates that there is
no deviation from a normal distribution. A
positive value indicates a sharper peak with
heavier tails: more values are clustered near
the mean or far from it.
A negative value means a “flat” top of the
curve.
• = Kurt (Col Row : Col Row)
Normality
• Skewness is a measure of the asymmetry of the
tails of the distribution. A positive value indicates
that the longer tail extends toward higher
(more positive) values. A
negative value indicates that the longer tail
extends toward lower (more negative) values.
• =Skew (Col Row : Col Row)
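As a rough illustration of what the skewness and kurtosis statistics measure, the sketch below implements the bias-corrected sample formulas that, to my understanding, Excel documents for Skew and Kurt (Kurt reports excess kurtosis, so a normal distribution scores about 0). The function names and data are hypothetical.

```python
import math

def _mean_and_stdev(values):
    """Return the mean and the sample (n - 1) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return mean, s

def sample_skewness(values):
    """Bias-corrected sample skewness; needs at least 3 values."""
    n = len(values)
    mean, s = _mean_and_stdev(values)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis; needs at least 4 values."""
    n = len(values)
    mean, s = _mean_and_stdev(values)
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # evenly spread, no tail
right_tailed = [1.0, 2.0, 3.0, 4.0, 20.0]  # one large value forms a positive tail

print("skew (symmetric):   ", sample_skewness(symmetric))
print("skew (right-tailed):", sample_skewness(right_tailed))
print("kurt (symmetric):   ", sample_excess_kurtosis(symmetric))
```

The evenly spread set has skewness of essentially zero and a negative kurtosis (a flat top), while the set with one large value has positive skewness.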
Measures of Dispersion
• Measures of dispersion describe the
distribution of the set of measurements
Maximum and Minimum Values
• The maximum value is the highest value in
the data set
• In Excel the maximum value is calculated
by:
= Max(Col Row:Col Row)
• The minimum value is the lowest value in
the data set and is calculated by:
= Min(Col Row:Col Row)
Range of the Sample Set
• Difference between the maximum and
minimum values of the measurement
• Calculated in Excel by the following
formula:
= Max (Col Row:Col Row)
- Min (Col Row:Col Row)
Standard Deviation
• For a normally distributed sample set, the
standard deviation is 1/2 of the width of the
interval, centered on the mean, that contains
≈68% of the values in the population
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Standard Deviation
• For a normal distribution (Bell Curve)
≈ 95% of the samples from a population will lie in
the interval
$$\bar{X} - 1.96s \le Z \le \bar{X} + 1.96s$$
Where: $\bar{X}$ is the mean (average) value
Z is a value (measurement)
s is the standard deviation
• The standard deviation is calculated in Excel
using the following formula:
= Stdev (Col Row : Col Row)
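The measures of dispersion can be sketched together in Python. The plant spacing values are hypothetical; `statistics.stdev` uses the n - 1 denominator, matching the sample standard deviation formula given earlier.

```python
import statistics

# Hypothetical plant spacings (illustrative values only).
spacings = [6.0, 7.0, 7.5, 8.0, 8.0, 8.5, 9.0, 10.0]

minimum = min(spacings)
maximum = max(spacings)
value_range = maximum - minimum   # Max minus Min

mean = statistics.mean(spacings)
s = statistics.stdev(spacings)    # sample standard deviation (n - 1)

# Interval expected to hold about 95% of values if the data are normal.
lower = mean - 1.96 * s
upper = mean + 1.96 * s

print(f"range = {value_range}")
print(f"mean = {mean:.3f}, s = {s:.3f}")
print(f"approx. 95% interval: {lower:.3f} to {upper:.3f}")
```

The 95% interval is only meaningful when the measurements are close to normally distributed, so it is worth checking skewness and kurtosis first.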
Coefficient of Variation
• The magnitudes of the differences between large
values and their means tend to be large. The
differences between small values and their
means tend to be small.
• Consequently, a high yielding field is likely to
have a higher standard deviation than a low
yielding field, even if its variability relative to
the mean is the same as, or lower than, that of
the low yielding field.
Coefficient of Variation
• Thus, variation about two means of
different magnitudes cannot easily be
compared.
• Comparisons can be made by calculating
the relative variation, or the normalized
standard deviation.
• This measurement is called the Coefficient
of Variation.
Coefficient of Variation
• The Coefficient of Variation or C.V. is
calculated by dividing the standard
deviation of the data set by its mean.
Often that value is multiplied by 100 and
the C.V. is expressed as a percentage.
• Experience with similar data sets is
required to determine if the C.V. is
unusually large.
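Mean, Standard Deviation and
Coefficient of Variation
Population = Y
• Mean plant spacing = $\bar{X}$
• Std. Dev. = $s$
• $CV = \frac{s}{\bar{X}}$
Population = ½ Y
• Mean plant spacing = $2\bar{X}$
• Std. Dev. = $\sqrt{\frac{\sum 2^2 (X - \bar{X})^2}{n - 1}} = 2s$
• $CV = \frac{2s}{2\bar{X}} = \frac{s}{\bar{X}}$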
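The plant spacing comparison can be checked numerically: halving the population doubles every spacing, which doubles both the mean and the standard deviation and leaves the C.V. unchanged. This is a minimal sketch with hypothetical spacing values and a hypothetical helper name.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (as a fraction)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical plant spacings at the full population (illustrative only).
spacing_full = [6.0, 7.0, 7.5, 8.0, 8.0, 8.5, 9.0, 10.0]

# At half the population, assume every spacing simply doubles.
spacing_half = [2.0 * x for x in spacing_full]

cv_full = coefficient_of_variation(spacing_full)
cv_half = coefficient_of_variation(spacing_half)

print(f"std dev: {statistics.stdev(spacing_full):.3f} vs "
      f"{statistics.stdev(spacing_half):.3f}")
print(f"C.V.:    {100 * cv_full:.1f}% vs {100 * cv_half:.1f}%")
```

The standard deviations differ by a factor of two, yet the two C.V. values agree, which is what makes the C.V. suitable for comparing fields with different means.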
Correlation
• One objective of Biosystems engineering and
Agronomy is to alter the level of one variable (e.g.
soil nitrate) to change the response of another
variable (e.g. grain yield).
• There are other confounding factors affecting
grain yield, such as soil pH, which cannot always
be accounted for.
Correlation
• The engineer still needs to determine the
degree to which the two variables vary
together.
• The correlation coefficient or r is that
measure.
• The correlation coefficient, r, lies between
-1 and 1. Positive values indicate that X
and Y tend to increase or decrease
together.
Correlation
• Values of r near 0 indicate that there is little or no
relationship between the two variables.
• The coefficient of determination or r² is important
in precision farming because, when the samples
are collected by location in the field, it indicates
the percentage of the variability in the dependent
variable (e.g. yield) explained by the independent
variable (e.g. N fertilizer).
Correlation
• For example, if the r² of soil nitrate and grain yield is
0.90, then 90% of the yield variability across the field
can be explained by soil nitrate. Spatially varying
the N fertilizer rate based on the nitrate level in
the soil should have a large effect on grain yield.
• In Excel, correlation r is calculated by the
following:
= Correl (Col Row : Col Row, Col Row : Col Row)
To calculate r², simply square the value of r.
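For readers who want to see what the correlation calculation does, here is a minimal Python sketch of the Pearson correlation coefficient. The function name and the paired soil nitrate and yield values are hypothetical, not measurements from this course.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired samples taken at the same field locations.
soil_nitrate = [5.0, 8.0, 12.0, 15.0, 20.0, 24.0]
grain_yield = [31.0, 36.0, 41.0, 47.0, 50.0, 58.0]

r = pearson_r(soil_nitrate, grain_yield)
r_squared = r ** 2  # coefficient of determination

print(f"r  = {r:.3f}")
print(f"r2 = {r_squared:.3f}")
```

Pairing matters here: each nitrate value must come from the same location as the yield value in the same position of the other list.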
Regression
• Excel has the capability of fitting
mathematical models (linear and nonlinear curves) to data which relate
dependent to independent variables.
Regression (curve fitting) can be
performed using the Charting GUI in
Excel. You can also directly calculate the
slope and intercept for a linear model
using the commands
Regression
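• = Intercept (Col Row : Col Row, Col Row : Col Row)
and
• = Slope (Col Row : Col Row, Col Row : Col Row)
• In both functions the first range holds the
dependent (Y) values and the second range holds
the independent (X) values
• Regression R² is a measure, expressed as a
decimal fraction, of how well the model fits the
data. For linear regression, the regression
R² can be directly calculated by squaring
the correlation coefficient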
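A least-squares straight-line fit can be sketched in Python as follows. The function names and the fertilizer rate and yield values are hypothetical; the sketch only illustrates how a slope, an intercept, and R² relate for a linear model.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def regression_r_squared(xs, ys, slope, intercept):
    """Fraction of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical N fertilizer rates (independent) and grain yields (dependent).
n_rate = [0.0, 40.0, 80.0, 120.0, 160.0]
grain_yield = [30.0, 42.0, 50.0, 61.0, 67.0]

slope, intercept = linear_fit(n_rate, grain_yield)
fit_r2 = regression_r_squared(n_rate, grain_yield, slope, intercept)

print(f"yield = {intercept:.2f} + {slope:.4f} * N rate")
print(f"R2 = {fit_r2:.3f}")
```

For a straight-line model this R² equals the square of the correlation coefficient between the two variables; that shortcut does not carry over to nonlinear models.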
High Resolution Variability
Study – 1 ft x 1ft Experiments