Answer Key to Problem Set #7
Geog 3000: Advanced Geographic Statistics
Instructor: Dr. Paul C. Sutton
Problem Set Number 7 focuses on descriptive spatial statistics and spatial interpolation methods.
These problems draw from information in Chapters 3 and 4 of the McGrew & Monroe text and
from Chapter 16 of the Chang Intro GIS text I have included in the zip file for this problem set.
Kriging and other spatial interpolation techniques are mathematically complex (for most of us
anyway), so these exercises focus on conceptual rather than computational understanding. You
could teach a whole graduate course on kriging and spatial interpolation; we are only skimming
some of the important basic concepts. For most of us, these computations take place with the
click of a button. What I am trying to get across with these exercises is an understanding of what
those computations are doing and how to interpret their results. Good Luck!
Read all about Pirates and Global Warming at the Church of the Flying Spaghetti Monster
(http://www.venganza.org/about/open-letter/). Is correlation causality?
#1) What is a Coefficient of Variation anyway?
Suppose the length of Pine Beetles is distributed normally N(1 cm, 0.1 cm). [That’s mean and
standard deviation BTW.] Also assume that garter snake lengths are distributed normally,
N(60 cm, 3 cm). Which of these two creatures has a greater variability of length? How can you
appropriately compare apples and oranges? What statistics would you use to demonstrate this?
The standard deviation of the Pine Beetle is 10% (0.1 cm / 1 cm) of the average length of a
Pine Beetle. The standard deviation of the Garter snake is only 5% (3 cm / 60 cm) of its average
length. By this measure (the coefficient of variation) the Pine Beetle has greater variability
of length.
Wikipedia gives a pretty good definition:
In probability theory and statistics, the coefficient of variation (CV) is a normalized measure
of dispersion of a probability distribution. It is defined as the ratio of the standard deviation
to the mean: CV = σ / μ.
This is only defined for non-zero mean, and is most useful for variables that are always
positive. It is also known as unitized risk or the variation coefficient.
The coefficient of variation should only be computed for data measured on a ratio scale. As an
example, if a group of temperatures are analyzed, the standard deviation does not depend on
whether the Kelvin or Celsius scale is used. However the mean temperature of the data set
would differ in each scale and thus the coefficient of variation would differ. So the coefficient
of variation does not have any meaning for data on an interval scale.[1]
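For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the
beetle/snake comparison above (the function name is my own, not from the text):

    # Coefficient of variation: standard deviation as a fraction of the mean.
    def coefficient_of_variation(mean, std_dev):
        """Return the CV, i.e. std_dev / mean (mean must be non-zero)."""
        return std_dev / mean

    beetle_cv = coefficient_of_variation(1.0, 0.1)   # pine beetle: N(1 cm, 0.1 cm)
    snake_cv = coefficient_of_variation(60.0, 3.0)   # garter snake: N(60 cm, 3 cm)
    print(f"Pine beetle CV:  {beetle_cv:.0%}")       # 10%
    print(f"Garter snake CV: {snake_cv:.0%}")        # 5%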
#2) Descriptive Spatial Statistics
Consider the following point locations in the scatterplot below. Note: the coordinates of these
points are provided in the table on the right.
A) What is the mean center of these points?
Xmc = 3.86 Ymc = 3.5
B) What is the weighted mean center of these points if you use their Z value as the weight?
Xwmc = 3.493 Ywmc = 3.145
C) How can the mean center be interpreted?
The mean center is the location that minimizes the sum of squared distances to each point.
D) How can the weighted mean center be interpreted?
The weighted mean center minimizes the sum of (squared distance to each point times that
point’s respective z-value). The effect is to pull the mean center toward the more heavily
weighted points.
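Here is a minimal Python sketch of both calculations. The coordinates and weights below are
placeholders, since the actual table from question #2 is only shown in the figure:

    # Mean center and weighted mean center of a set of points.
    points = [(1.0, 1.0), (3.0, 7.0), (9.0, 11.0)]   # hypothetical (x, y) pairs
    weights = [2.0, 1.0, 3.0]                        # hypothetical z-values

    n = len(points)
    x_mc = sum(x for x, _ in points) / n             # mean center: plain averages
    y_mc = sum(y for _, y in points) / n

    w_total = sum(weights)                           # weighted mean center:
    x_wmc = sum(w * x for (x, _), w in zip(points, weights)) / w_total
    y_wmc = sum(w * y for (_, y), w in zip(points, weights)) / w_total

    print(f"Mean center: ({x_mc:.3f}, {y_mc:.3f})")
    print(f"Weighted mean center: ({x_wmc:.3f}, {y_wmc:.3f})")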
#3) Emergency Services for Prairie Dogs
Suppose you have been charged with providing ambulance service to seven prairie dogs
scattered on a prairie called isotropia. The locations of these prairie dogs are given by the
following table of coordinates. Where should you locate your prairie dog ambulance if you
wanted to minimize the average distance you would have to drive to get to these prairie dog
locations? (NOTE: provide a verbal descriptive answer here, not a numerical one.) Assume all
seven prairie dogs have equal probability of needing ambulance services and that on the prairie
of ‘isotropia’ the shortest distance (e.g. fastest prairie dog ambulance ride) between two points is
a straight line. How does one go about obtaining a numerical answer to this question? How is
this problem different from the spatial mean problems you explored in question #2?

Prairie Dog    X    Y
Fred           1    1
Barney         3    7
Wilma          9   11
Betty          4    2
Sam            8    8
Elma           2   10
Bob            9    4
You should locate your prairie dog ambulance at the Euclidean median of these prairie dog
locations. The Euclidean median differs from the spatial mean in that it minimizes
unsquared rather than squared distances to each of the seven prairie dogs. See this section
of the McGrew and Monroe text (pg 55):
"For many geographic applications, another measure of "center" is more useful than the mean
center. Often, it is more practical to determine the central location that minimizes the sum of
unsquared, rather than squared, distances. This location, which minimizes the sum of Euclidean
distances from all other points in a spatial distribution to that central location, is called the
Euclidean median, (Xe, Ye), or median center. Mathematically, this location minimizes the sum:
(see equation 4.6) Unfortunately, determining coordinates of the Euclidean median is
methodologically complex. Computer-based iterative algorithms (step by step procedures) must
be used to reach a solution. These algorithms evaluate a sequence of possible coordinates and
gradually converge on the best location for the Euclidean median. "
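The quoted passage does not name the algorithm, but the standard iterative procedure is
Weiszfeld’s algorithm: repeatedly take a distance-weighted average of the points until the
estimate stops moving. The quantity being minimized (equation 4.6) is the sum over all seven
points of sqrt((Xi - Xe)^2 + (Yi - Ye)^2). A minimal Python sketch, my own illustration rather
than anything from the answer key, applied to the prairie dog coordinates:

    import math

    dogs = [(1, 1), (3, 7), (9, 11), (4, 2), (8, 8), (2, 10), (9, 4)]

    # Start from the mean center, then iterate Weiszfeld updates.
    x = sum(p[0] for p in dogs) / len(dogs)
    y = sum(p[1] for p in dogs) / len(dogs)

    for _ in range(100):
        num_x = num_y = denom = 0.0
        for px, py in dogs:
            d = math.hypot(px - x, py - y)
            if d == 0:                 # estimate landed exactly on a point
                continue
            num_x += px / d            # each point weighted by 1/distance
            num_y += py / d
            denom += 1.0 / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < 1e-9:   # converged
            break
        x, y = new_x, new_y

    print(f"Euclidean median: ({x:.3f}, {y:.3f})")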
#4) The First Law of Geography
Waldo Tobler is often credited with positing the First Law of Geography (aka Tobler’s Law),
which states: “Everything is related to everything else, but near things are more related than
distant things.” It can be argued that the first law is merely a qualitative statement concerning
the nature of spatial autocorrelation. Autocorrelation is typically described as the self-similarity
of phenomena in spatial or temporal domains. The Dow Jones Industrial Average (DJIA) can be
used to describe temporal autocorrelation. If the DJIA closed at 10,000 yesterday it is more
likely to close at or near 10,000 today than it is a year from now. Rainfall is a good way to
exemplify spatial autocorrelation. If it is raining at my house it is more likely to be raining at my
neighbor’s house than it is to be raining across town. Human beings seem to have a natural or
innate understanding of spatial and temporal autocorrelation (to some extent at least). If I show
my 10 year old son the graph on the right and tell him to guess the temperature at the location
marked with an ‘X’ based on the other temperature measurements, he will do some sort of
mental spatial interpolation in which nearer values will have more ‘weight’ than distant values
(he guessed 71). Chapter 16 from the Chang text that accompanies this problem set does a good
job of describing numerous formal mathematical means for performing spatial interpolation,
including: Inverse Distance Weighting (IDW), splines, polynomial trend surface curve fitting,
and kriging. Kriging actually involves mathematically characterizing spatial autocorrelation by
fitting a curve to a variogram. A variogram characterizes variance (or its inverse – correlation)
as a function of distance. For details on these methods read the file:
Ch16introGISkangtsungChang.pdf. Answer the questions on the following pages that are
associated with spatial interpolation.

[Figure: ‘Bivariate Fit of Y By x’ – a scatterplot of temperature readings (65, 89, and 110) on a
10 x 10 coordinate grid, with the location to be guessed marked ‘X’.]
A) Inverse Distance weighting is a simple spatial interpolation method. Global methods use all
the available control points to estimate a value at an unknown location. Local methods use all the
points within a fixed distance or the nearest ‘N’ neighbors or the nearest ‘N’ neighbors within a
given distance. Given the following points, provide an estimate of the value of Z0 for the
location denoted by ‘X’ using:
1) Global IDW with exp = 1 and exp = 2;
2) Local IDW with exp = 1 using an inclusive radius of 5.2;
3) Local IDW with exp = 1 using the three nearest neighbors.
[Figure: ‘Bivariate Fit of Y By x’ – the scatterplot showing the five control points and the
location ‘X’; the point coordinates are not reproduced here.]
Global uses all 5 points. Local with radius 5.2 uses the 2nd and 4th points only. Local with
three nearest neighbors uses the 1st, 2nd, and 4th points. If exp = 1, weight each point by
inverse distance; if exp = 2, weight by inverse distance squared.
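Since the figure with the five control points is not reproduced above, the coordinates in this
minimal Python sketch are placeholders; only the estimator logic is meant to match the answer.
It assumes the location being estimated is not itself a control point (a zero distance would
divide by zero):

    import math

    def idw_estimate(known, x0, y0, exp=1, radius=None, k=None):
        """IDW estimate of z at (x0, y0).

        known  -- list of (x, y, z) control points
        exp    -- distance exponent (1 = inverse distance, 2 = inverse
                  distance squared)
        radius -- if given, use only points within this distance (local IDW)
        k      -- if given, use only the k nearest neighbors (local IDW)
        """
        dists = [(math.hypot(x - x0, y - y0), z) for x, y, z in known]
        if radius is not None:
            dists = [(d, z) for d, z in dists if d <= radius]
        if k is not None:
            dists = sorted(dists)[:k]
        weights = [1.0 / d ** exp for d, z in dists]
        return sum(w * z for w, (d, z) in zip(weights, dists)) / sum(weights)

    # Hypothetical control points standing in for the five in the figure:
    pts = [(2, 2, 65), (4, 6, 89), (9, 9, 110), (5, 4, 72), (8, 2, 95)]
    print(idw_estimate(pts, 5, 5, exp=1))              # global, exp = 1
    print(idw_estimate(pts, 5, 5, exp=2))              # global, exp = 2
    print(idw_estimate(pts, 5, 5, exp=1, radius=5.2))  # local, radius 5.2
    print(idw_estimate(pts, 5, 5, exp=1, k=3))         # local, 3 nearest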
B) Given the image (raster/grid) below with only 6 known values, use IDW to fill in all the
blanks. You’d be well advised to think like a programmer and do this in Excel. Use Inverse
Distance Squared and go with a Global approach. Draw the interpolated image on the right
(e.g. fill in the blanks in the grid).
The Z estimates for the points in the grid above are pasted in on the right. The BOLDED
known points have been estimated via a method known as cross-validation (CV).
Essentially, you simply remove the known point and use the other known points to estimate
the point you removed. The point at (x=1, y=4, z=1) had a CV estimate of 5.15 – over a
400% error. Below is the Excel spreadsheet I used to calculate all these values. This is a
tiny, simple dataset. I think it is clear that we all believe that computers and computer
programmers are GOOD.
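If you would rather think like a programmer in Python than in Excel, here is a minimal sketch
of the global inverse-distance-squared fill. Only the (x=1, y=4, z=1) cell comes from the answer
above; the other five known cells are placeholders, since the real ones are in the figure:

    # Global IDW (exponent 2) fill of a small grid from a few known cells.
    known = [(1, 4, 1), (3, 1, 4), (2, 5, 3), (5, 2, 8), (4, 4, 6), (5, 5, 2)]

    ncols, nrows = 5, 5
    grid = {}
    for col in range(1, ncols + 1):
        for row in range(1, nrows + 1):
            hit = next((z for x, y, z in known if (x, y) == (col, row)), None)
            if hit is not None:
                grid[(col, row)] = hit    # keep the known value as-is
                continue
            # 1/d^2 weights; note d^2 = dx^2 + dy^2, so no sqrt is needed.
            w = [1.0 / ((col - x) ** 2 + (row - y) ** 2) for x, y, z in known]
            grid[(col, row)] = (sum(wi * z for wi, (x, y, z) in zip(w, known))
                                / sum(w))

    for row in range(nrows, 0, -1):       # print the top row first
        print("  ".join(f"{grid[(c, row)]:5.2f}" for c in range(1, ncols + 1)))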
C) How good do you think your IDW estimates of the empty cells in question ‘B’ are? How
might you go about characterizing your ‘skill’ of estimation? How do you think the ‘skill’ (aka
accuracy) of the estimates will vary as a function of distance from known values?
The example point at x=1, y=4, z=1 had an estimated Z of 5.15 when you ‘pretended’ that
you did not know its actual value was 1. That was a pretty bad estimate. A summary of the
rest of the errors for the known values is in this table:
In this case the average percentage error was over 100% of the actual value. This is not
very good. Error will probably be a function of the following: 1) the natural variability of
the phenomenon being measured, 2) the distance of known points from the location to be
estimated, and 3) the number of known points and their distribution in space.
#5) The Variogram: Characterizing Spatial Autocorrelation
Imagine 1,000 rainfall gauges scattered throughout the conterminous United States. On April
15th, 2004 you assemble all the rainfall measurements at these 1,000 stations. You have a table
that looks like this:
A) Describe in your own words how you would produce a variogram from the data above.
Building a variogram is called by some a ‘simple’ process. It took me a while to ‘grok’ this
‘simple’ procedure, so I will try to explain it so it seems ‘simple’. Here we go: First of all, you
have a huge table of points in space with Z values. Now you have to create a table that has
all of the distances between all possible ‘pairs’ of points (e.g. Bodip to Bumwallop, Bodip to
Goofusville, Bodip to Teaneck, etc., AND Oxnard to Bumwallop, Oxnard to Goofusville,
Oxnard to Teaneck, etc.). In other words, if you have a table of ONLY ten locations you
have a ‘distance table’ with 9+8+7+6+5+4+3+2+1 = 45 distances for a semi-variogram (90 for
a ‘full’ variogram, because Bodip to Bumwallop is not the same as Bumwallop to Bodip –
whatever). Ten locations making 45 distances might not seem so bad, but the number of pairs
grows quadratically – n(n-1)/2 pairs for n points – so this ‘distance table’ can get pretty big
pretty fast. So imagine this table and imagine ‘sorting’ the table on the ‘Distance’ column (see
figure below – there may be logical inconsistencies in the tables below that you might be able
to identify with multi-dimensional scaling techniques, but go with it as an example for this
purpose; it should work for that):
Sort the above table on ‘Dist A to B’ to get something like the table above – of course there
will be more records in between all these distance values. Now you do something similar to
‘binning’ for a histogram. You have sorted by distance between points, and you ‘bin’ on that
column (e.g. all the ‘pairs of points’ that are from 141 to 150 units apart). Let’s say that bin
(141-150) has 30 pairs of points in it. You can calculate a ‘mean’ and a ‘variance’ of the
difference between their Z values (also, if you wanted to create a correlogram, you could
calculate the correlation (R) between the Z values for each of these paired points). In any case:
lather, rinse, repeat for paired points that have distances of 131 to 140, 121 to 130, 111 to
120, ... 0 to 10 (or whatever ‘bin size’ you choose). You should now be able to build a table
that looks like this (variance or semi-variance or correlation as a function of distance):
Note: variance typically increases with distance, whereas correlation decreases with
increasing distance. With the information in the table above you can plot variance or
correlation as a function of distance (which in this case will be the ‘bin center’). Once this is
done, you get to ‘fit’ that curve with a line, a Gaussian bell-shaped curve, or another curve
form. This is how you characterize the self-similarity of numbers distributed in space. The
fitted curve becomes a look-up table (e.g. you use a calculated distance to ‘look up’ a
variance or correlation) used in the spatial interpolation technique known as kriging.
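Here is a minimal Python sketch of that pair-bin-summarize procedure. The stations are
placeholders for the 1,000 rain gauges in the question, and I use the standard semivariance
convention (half the mean squared Z difference per bin), which is a slightly more formal
version of the ‘variance of the difference’ described above:

    import itertools
    import math
    from collections import defaultdict

    # (x, y, rainfall) placeholders standing in for the gauge table.
    stations = [(0, 0, 1.2), (30, 40, 0.9), (100, 5, 0.1),
                (60, 80, 0.7), (200, 150, 2.3), (10, 90, 1.1)]

    bin_size = 50
    bins = defaultdict(list)
    # Every unordered pair once: the 45-of-90 'semi' bookkeeping above.
    for (x1, y1, z1), (x2, y2, z2) in itertools.combinations(stations, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        bins[int(d // bin_size)].append((z1 - z2) ** 2)

    for b in sorted(bins):
        gamma = 0.5 * sum(bins[b]) / len(bins[b])   # semivariance per bin
        center = b * bin_size + bin_size / 2
        print(f"distance ~{center:5.0f}: semivariance = {gamma:.3f} "
              f"({len(bins[b])} pairs)")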
B) Explain what a variogram and/or correlogram is and how it is used in the spatial
interpolation process known as kriging.
A variogram (or correlogram) is a device that characterizes spatial auto-correlation. It is in
essence a quantification of Tobler’s law. Kriging assumes that certain spatially variable
phenomena (ore grade for example – it was developed by mining and geologic engineers)
have three components of variability: 1) Spatial Autocorrelation, 2) Large Overall Trends,
and 3) Random variation. The variogram characterizes the first (spatial autocorrelation) to
improve our ability to predict unknown quantities in space.
C) Draw a generic variogram for a spatially autocorrelated phenomenon and label the
‘Range’, ‘Nugget’, and ‘Sill’. Provide a conceptual explanation of these terms using a
specific example of a spatially autocorrelated phenomenon such as temperature.
The nugget is the ‘natural variation’ of the phenomenon that can occur at zero distance.
Imagine a room at constant temperature whose temperature you measured over and over
again. If you got a mean of 72 degrees with a variance of 0.02 degrees, your nugget would
be 0.02 (or the reliability of your thermometer might be suspect).
Figure taken from a paper posted to the discussion board at this URL:
http://www.iasri.res.in/ebook/EBADAT/6-Other%20Useful%20Techniques/11-Spatial%20STATISTICAL%20TECHNIQUES.pdf
The sill represents the variability of the phenomenon in the aggregate. If you measured the
temperature of the earth at 1,000 random points around the globe and got a mean of 72
degrees with a standard deviation of 15, then your sill would be 15 degrees. The range
represents the distance within which knowing a value can inform your estimate at an
unknown location. For example: if you are trying to guess the temperature of Bodip,
Kansas, and your variogram of temperature has a range of 300 km, then you would need a
known temperature within 300 km of Bodip in order to have a spatially informed estimate
of the temperature. If you don’t have any known points within the range of your
variogram, you might as well simply guess the mean of the phenomenon you are trying to
estimate.
D) Suppose you generated an artificial ‘dataset’ using ‘X’ coordinates drawn from a
Uniform (0, 100) random variable, ‘Y’ coordinates from a Uniform(0,100) random
variable, and ‘Z’ values for these points from a Normal(100,15) random variable. What
would a variogram for this dataset look like? Draw one.
I’m too lazy to sketch this. Basically, you would have a flat variogram in which the nugget
and the sill had the same value. It would be a uniformly flat line of variance as a function of
distance at a height of 15 squared (225). If it were a correlogram, it would be R = 0 no
matter what distance you chose.
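If you are less lazy, a quick simulation makes the point: with no spatial structure in Z, every
distance bin’s semivariance hovers near the overall variance of 225. A minimal sketch:

    import itertools
    import math
    import random
    from collections import defaultdict

    # Simulate the dataset from #5-D: Uniform(0,100) coordinates,
    # Normal(100,15) values with no spatial autocorrelation at all.
    random.seed(42)
    pts = [(random.uniform(0, 100), random.uniform(0, 100),
            random.gauss(100, 15)) for _ in range(500)]

    bins = defaultdict(list)
    for (x1, y1, z1), (x2, y2, z2) in itertools.combinations(pts, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        bins[int(d // 10)].append((z1 - z2) ** 2)

    # Every bin should print a semivariance near 15**2 = 225: a flat line.
    for b in sorted(bins):
        gamma = 0.5 * sum(bins[b]) / len(bins[b])
        print(f"distance ~{b * 10 + 5:3d}: semivariance = {gamma:6.1f}")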
#6) Cross-Validation, Bootstrapping, and Jack-Knifing Oh My!
In question #4-C you were asked how you might go about characterizing the accuracy or ‘skill’
of a particular spatial interpolation. A technique known by several names, such as
cross-validation, bootstrapping, and jack-knifing, has been formalized to characterize the
accuracy or ‘skill’ of temporal and spatial interpolations.
A) Explain how cross-validation is performed.
Cross-validation is done by ‘removing’ each known point (one at a time) and estimating its
value from the remaining known points. If you have ‘n’ known points, you will also have ‘n’
cross-validated estimates of those points. This is a pretty good way of assessing the skill or
accuracy of your interpolation.
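A minimal Python sketch of the procedure, using global inverse distance squared as the
interpolator (any interpolator with the same signature would work). The known points are the
same placeholders used under #4-B:

    def ids(known, x0, y0):
        """Global inverse-distance-squared estimate of z at (x0, y0)."""
        w = [1.0 / ((x - x0) ** 2 + (y - y0) ** 2) for x, y, z in known]
        return sum(wi * z for wi, (x, y, z) in zip(w, known)) / sum(w)

    def cross_validate(known, interpolate):
        """Return (actual, estimated) pairs, one per known point."""
        pairs = []
        for i, (x, y, z) in enumerate(known):
            others = known[:i] + known[i + 1:]         # remove the point...
            pairs.append((z, interpolate(others, x, y)))  # ...then re-estimate it
        return pairs

    known = [(1, 4, 1), (3, 1, 4), (2, 5, 3), (5, 2, 8), (4, 4, 6), (5, 5, 2)]
    for z, z_hat in cross_validate(known, ids):
        print(f"actual {z}, cross-validated estimate {z_hat:.2f}")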
B) Explain how cross-validation can be used to characterize the accuracy of a spatial
interpolation.
For every known point you estimate its Z value from the remaining points. You can then
calculate all the standard statistics for assessing error, such as Mean Error, Mean Percentage
Error, Mean Absolute Deviation, Coefficient of Variation, etc.
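Continuing the sketch above, those summary statistics fall straight out of the
(actual, estimated) pairs. This assumes no actual value is zero, since the percentage error
would divide by zero:

    # Summary error statistics from cross_validate's (actual, estimated) pairs.
    def error_summary(pairs):
        errors = [z_hat - z for z, z_hat in pairs]
        n = len(errors)
        return {
            "mean error": sum(errors) / n,
            "mean absolute error": sum(abs(e) for e in errors) / n,
            "mean percentage error": 100.0 * sum(abs(z_hat - z) / z
                                                 for z, z_hat in pairs) / n,
        }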
C) Can the results of Cross-Validation be mapped? Is it possible to map levels or degrees of
confidence of interpolation?
Yes, you can map the errors at all known locations and then interpolate those errors to get
a map of estimated error. Kriging techniques are particularly good for this purpose.
D) How could you use cross-validation to compare the accuracy of two different spatial
interpolation techniques (e.g. contrasting the skill of an IDW interpolation to the skill of a
kriging interpolation of the same dataset).
Simply do a cross-validation on your data using both IDS (inverse distance squared) and
kriging. It turns out that IDS often beats kriging on this benchmark, because the assumption
of stationarity that is important to kriging is often not true. Stationarity means that the
variogram characterizing spatial autocorrelation is constant across space. This is rarely true.
And geographic data isn’t independent either.
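A minimal sketch of such a head-to-head comparison, reusing the cross_validate helper
sketched under #6-A. The kriging estimator is left abstract here; a real one could come from a
library such as PyKrige (an assumption about tooling on my part, not something the problem
set uses):

    # Cross-validated mean absolute error of two interpolation methods.
    # method_a and method_b are any functions with the (known, x0, y0)
    # signature, e.g. the ids sketch above versus a kriging estimator.
    def mean_abs_error(pairs):
        return sum(abs(z_hat - z) for z, z_hat in pairs) / len(pairs)

    def compare(known, method_a, method_b):
        """Return the cross-validated MAE of each method; lower wins."""
        return (mean_abs_error(cross_validate(known, method_a)),
                mean_abs_error(cross_validate(known, method_b)))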
E) Conduct a cross-validation on the dataset provided in question #4-B.
This table pretty much sums it up.