Lecture 8

• Statistical options tend to be limited in most GIS.
• This is likely to be redressed in the future.
• We will look at spatial statistics in general terms, and
conclude with a review of the software available.
Basic Concepts
Spatial statistics differ from ‘ordinary’ statistics by the
inclusion of locational properties.
This makes spatial statistics more complex.
The book by Bailey and Gatrell (1995) provides an
accessible introduction. They identify four categories:
– Point pattern data;
– Spatially continuous data;
– Areal data; and
– Interaction data.
Obvious correspondence with conceptual models.
Scale Levels
Attribute data can be classified by measurement scale:
– Nominal: e.g. 1=females, 2=males.
– Ordinal: e.g. 1=good, 2=medium, 3=poor.
– Interval (+ ratio): e.g. degrees Centigrade,
Bailey and Gatrell classify techniques by purpose:
– Visualisation
– Exploration
– Modelling (this is involved in all statistical inference
and hypothesis testing)
Random Variables
Statistical models deal with phenomena that are
stochastic (i.e. are subject to uncertainty).
A random variable Y has values that are subject to
uncertainty (but may not necessarily be random).
The distribution of possible values is referred to as the
probability distribution.
Represented by a function fY(y)
Random variables may be discrete or continuous.
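Probability that Y lies between a and b is given by:
\[ P(a \le Y \le b) = \sum_{y=a}^{b} f_Y(y) \quad \text{if } Y \text{ is discrete} \]
\[ P(a \le Y \le b) = \int_{a}^{b} f_Y(y)\,dy \quad \text{if } Y \text{ is continuous (probability density)} \]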
Cumulative probability (or distribution function) $F_Y$ is
given by:
\[ F_Y(y) = \sum_{u=-\infty}^{y} f_Y(u) \quad \text{if } Y \text{ is discrete} \]
\[ F_Y(y) = \int_{-\infty}^{y} f_Y(u)\,du \quad \text{if } Y \text{ is continuous} \]
Expected Values
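• The expected value of Y is its mean E(Y):
\[ E(Y) = \sum_{y=-\infty}^{\infty} y\, f_Y(y) \quad \text{if } Y \text{ is discrete} \]
\[ E(Y) = \int_{-\infty}^{\infty} y\, f_Y(y)\,dy \quad \text{if } Y \text{ is continuous} \]
• The expected value of a function of Y, say g(Y), is:
\[ E(g(Y)) = \sum_{y=-\infty}^{\infty} g(y)\, f_Y(y) \quad \text{if } Y \text{ is discrete} \]
\[ E(g(Y)) = \int_{-\infty}^{\infty} g(y)\, f_Y(y)\,dy \quad \text{if } Y \text{ is continuous} \]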
• Variance is: $\mathrm{VAR}(Y) = E([Y - E(Y)]^2)$
• The square root of this is the standard deviation ($\sigma_Y$).
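As a minimal sketch of these definitions, the Python code below works through a discrete example. The fair six-sided die (f_Y(y) = 1/6) is assumed purely for illustration, and the helper names prob_between, cdf and expectation are hypothetical.

```python
from fractions import Fraction

# Probability function of a fair six-sided die: f_Y(y) = 1/6 for y = 1..6.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

def prob_between(pmf, a, b):
    """P(a <= Y <= b) for a discrete random variable."""
    return sum(p for y, p in pmf.items() if a <= y <= b)

def cdf(pmf, y):
    """Distribution function F_Y(y) = P(Y <= y)."""
    return sum(p for u, p in pmf.items() if u <= y)

def expectation(pmf, g=lambda y: y):
    """E(g(Y)) = sum over y of g(y) * f_Y(y)."""
    return sum(g(y) * p for y, p in pmf.items())

mean = expectation(pmf)                                 # E(Y)
variance = expectation(pmf, lambda y: (y - mean) ** 2)  # VAR(Y) = E([Y - E(Y)]^2)
std_dev = float(variance) ** 0.5                        # standard deviation
```

For the die this gives E(Y) = 7/2 and VAR(Y) = 35/12. Exact fractions are used so that the results are not blurred by rounding.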
Joint Probability
• Can generalise to situations where there is more than one
random variable.
• Joint probability distribution (or density): fXY(x,y)
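• Covariance:
\[ \mathrm{COV}(X,Y) = E((X - E(X))(Y - E(Y))) \]
• Correlation:
\[ \rho_{X,Y} = \frac{\mathrm{COV}(X,Y)}{\sigma_X \sigma_Y} \]
• Independence: Neither variable affects the other. Joint
probability is the product of the individual probabilities:
\[ f_{XY}(x,y) = f_X(x)\, f_Y(y) \]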
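Covariance, correlation and the independence check (joint probability equal to the product of the marginals) can be sketched for a small discrete case. The joint probabilities below are hypothetical values chosen for illustration, and the helper names marginal and expect are assumptions.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint probability function f_XY(x, y) for two binary variables.
joint = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def marginal(joint, axis):
    """Marginal distribution of X (axis=0) or Y (axis=1)."""
    out = {}
    for xy, p in joint.items():
        out[xy[axis]] = out.get(xy[axis], 0) + p
    return out

def expect(joint, g):
    """E(g(X, Y)) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

f_x, f_y = marginal(joint, 0), marginal(joint, 1)
ex = expect(joint, lambda x, y: x)
ey = expect(joint, lambda x, y: y)
cov = expect(joint, lambda x, y: (x - ex) * (y - ey))
var_x = expect(joint, lambda x, y: (x - ex) ** 2)
var_y = expect(joint, lambda x, y: (y - ey) ** 2)
corr = float(cov) / sqrt(float(var_x) * float(var_y))

# Independent only if every joint probability equals the product of marginals.
independent = all(joint[(x, y)] == f_x[x] * f_y[y] for (x, y) in joint)
```

Here the covariance is 3/20 and the correlation is 0.6. The variables are not independent, because f_XY(0,0) = 2/5 whereas the product of the marginals is 1/4.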
Statistical Models
• A statistical model specifies the probability distribution for
the phenomenon being modelled.
• If modelling ozone levels in a region R we would have a
probability distribution for each location s (where s is a
2×1 vector holding an (x, y) coordinate pair). Individual
points can be referred to as s1, s2 etc.
• The complete set of random variables may be referred to as
a spatial stochastic process.
• The probability distribution for near points will probably
be more similar than for distant points, so our random
variables will probably not be independent.
Specifying Models
• To specify a model we need to specify its probability
distribution. For the ozone model we would need to
specify the joint distribution of every possible subset of
random variables.
• For a fair die: fY(y) = 1/6
• For more complex models (e.g. ozone) we can use
observed data: (y1, y2, …)
• These data are a realisation, i.e. one outcome from the
joint probability distribution of {Y1, Y2, …}
• One set of data does not get us very far. Even with more
data observations we must make reasonable assumptions,
based either on theory or prior observations.
Specifying Models (2)
• Assumptions may be expressed in general terms (e.g. a
Normal distribution, a regression model) with unspecified
parameters.
• The model can be fitted using observed data to estimate
the parameters.
• After evaluating the model we may decide to change its
general form.
A Regression Model
• To illustrate, to model our ozone data we might make the
following assumptions:
– The random variables $\{Y(s), s \in R\}$ are independent;
– They have the same distribution, but different means;
– Their means are a simple linear function of location,
say $E(Y(s)) = \beta_0 + \beta_1 s_1 + \beta_2 s_2$;
– Each Y(s) has a normal distribution about this mean
with the same variance $\sigma^2$.
• These assumptions would enable us to estimate the
parameters from the available data.
Maximum Likelihood
• Most frequently used method is maximum likelihood.
• We can write down the general form of the joint
probability distribution, e.g. $f(y_1, y_2, \ldots, y_n; \theta)$, where
$\theta$ is a vector of parameters: $(\beta_0, \beta_1, \beta_2, \sigma^2)$ in our
regression model.
• Given that we have actual values for $y_1, \ldots, y_n$, this joint
probability distribution is the probability of getting these
actual values. This is referred to as the likelihood and
would usually be denoted $L(y_1, y_2, \ldots, y_n; \theta)$.
• Our objective is to identify the parameter values $\theta$ that
maximise L. In practice we usually maximise the logarithm
of L (log likelihood), denoted $l(y_1, y_2, \ldots, y_n; \theta)$.
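To make the log likelihood concrete, here is a minimal sketch for the regression model, assuming independent Normal variables with mean b0 + b1*s1 + b2*s2 and common variance sigma2. The function name log_likelihood and the four observations are hypothetical, chosen only to show that parameter values closer to the data give a larger log likelihood.

```python
import math

def log_likelihood(theta, sites, y):
    """Log likelihood for independent Normal Y(s) with mean
    b0 + b1*s1 + b2*s2 and common variance sigma2."""
    b0, b1, b2, sigma2 = theta
    n = len(y)
    # Residual sum of squares about the fitted mean surface.
    rss = sum((yi - (b0 + b1 * s1 + b2 * s2)) ** 2
              for (s1, s2), yi in zip(sites, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

# Hypothetical observations at four locations (illustrative values only).
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [10.1, 12.0, 10.9, 13.0]

ll_good = log_likelihood((10.0, 2.0, 1.0, 0.01), sites, y)  # close to the data
ll_poor = log_likelihood((0.0, 0.0, 0.0, 0.01), sites, y)   # far from the data
```

Maximum likelihood estimation searches over all parameter values for the largest such log likelihood, rather than comparing two candidates as done here.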
Parameter Estimation
• This is the basic approach, but the actual estimation may
be complicated.
• Parameter estimation of our multiple linear regression
involving assumptions of independence, normal
distributions and equal variance reduces to using the
method of ordinary least squares.
• If the independence and equal variance assumptions are
relaxed, we can still use generalised least squares.
• Standard errors provide a measure of the reliability of
each parameter estimate.
• Likelihood ratios can be used to compare alternative
models.
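The reduction to ordinary least squares can be sketched for the linear trend in location by solving the normal equations X'X b = X'y directly. This is an illustrative pure-Python version; the function names solve and ols_trend_surface and the data are hypothetical, and a numerical library would normally be used in practice.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols_trend_surface(sites, y):
    """Least-squares estimates of (b0, b1, b2) and of the variance."""
    rows = [(1.0, s1, s2) for s1, s2 in sites]  # design matrix X
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r))
             for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / len(y)  # ML estimate divides by n
    return beta, sigma2

# Hypothetical data lying exactly on the plane 10 + 2*s1 + 1*s2.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [10.0, 12.0, 11.0, 13.0, 15.0]
beta, sigma2 = ols_trend_surface(sites, y)
```

Because the illustrative data lie exactly on a plane, the fit recovers the coefficients (10, 2, 1) with a residual variance of essentially zero.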
Hypothesis Testing
• Hypothesis testing entails comparing the fit of two models,
one of which incorporates assumptions which reflect the
hypothesis, the other incorporating a less specific set of
assumptions.
• All modelling inevitably involves some assumptions about
the phenomenon under study; hence hypothesis testing will
always involve comparison of the fit of a hypothesised
model with that of an alternative which also incorporates
assumptions, albeit of a more general nature.
Spatial Data Modelling
• Spatial data often exhibit spatial correlation (or
autocorrelation). Assumptions of independence may
therefore be unrealistic.
• Can make a distinction between:
– First order effects: variation in the mean due to a global
(large-scale) trend.
– Second order effects: caused by spatial correlation.
• Can illustrate using analogy of iron filings and magnets.
• Real-world patterns are often an outcome of a mix of first
and second order effects.
Spatial Data Modelling (2)
• To allow for second order effects, spatial models may need
to assume a covariance structure.
• The second order effects may be modelled as a stationary
spatial process – i.e.
– Its statistical properties (mean, variance) are
independent of absolute location;
– Covariance depends only on relative location.
• A process is said to be isotropic if it is stationary, and
covariance depends only on distance and not direction.
• If the mean, variance or covariance ‘drifts’ over the study
area, then the process exhibits non-stationarity or
heterogeneity.
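A covariance structure for a stationary, isotropic process can be sketched as a function of distance alone. The exponential form used below is one common choice and is an assumption here, as are the parameter names sill and range_.

```python
import math

def exp_covariance(s_i, s_j, sill=1.0, range_=2.0):
    """Isotropic covariance C(h) = sill * exp(-h / range_), where h is
    the distance between locations s_i and s_j."""
    h = math.dist(s_i, s_j)
    return sill * math.exp(-h / range_)

# Stationarity: shifting both points by the same offset changes nothing.
c_origin = exp_covariance((0.0, 0.0), (3.0, 4.0))
c_shifted = exp_covariance((10.0, 10.0), (13.0, 14.0))

# Isotropy: a different direction at the same distance (5 units) gives
# the same covariance.
c_rotated = exp_covariance((0.0, 0.0), (5.0, 0.0))
```

All three values are equal because each pair of points is 5 units apart; only the separation distance enters the function.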
Spatial Data Modelling (3)
• Heterogeneity in the mean, combined with stationarity in
second order effects, is a useful spatial modelling
assumption.
• The modelling of a spatial process often tends to proceed
by first identifying any heterogeneous 'trend' in mean value
and then modelling the 'residuals', or deviations from this
'trend', as a stationary process.
Geographically Weighted Regression
• Covariates are often incorporated in a multiple regression
model taking the general form:
yi  b 0   b k xik   i
• The model assumes the coefficients are homogeneous or
• Fotheringham et al. proposed an alternative model:
y  b u , v    b u , v x  
• To allow the model to be fitted, it is assumed the
parameters are non-stationary but are a function of
• Parameters can be mapped.
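The idea of location-specific coefficients can be sketched as a weighted least squares fit repeated at each location, with nearer observations given more weight. This is a simplified illustration with one covariate, and it is not the interface of any particular package. The Gaussian kernel, the bandwidth parameter and the function name gwr_fit_at are assumptions.

```python
import math

def gwr_fit_at(u, v, coords, x, y, bandwidth):
    """Locally weighted least squares fit of y = b0 + b1*x at (u, v),
    using Gaussian distance-decay weights."""
    w = [math.exp(-0.5 * (math.dist((u, v), c) / bandwidth) ** 2)
         for c in coords]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1

# Hypothetical transect: the effect of x on y strengthens from west to east.
coords = [(float(u), 0.0) for u in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [(1 + 0.2 * u) * xi for (u, _), xi in zip(coords, x)]

west = gwr_fit_at(0.0, 0.0, coords, x, y, bandwidth=2.0)
east = gwr_fit_at(9.0, 0.0, coords, x, y, bandwidth=2.0)
```

The local slope estimated at the eastern end is larger than at the western end. Evaluating the fit at every location gives a set of coefficients that can be mapped.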
Point Pattern Techniques
Bailey and Gatrell discuss various techniques, organised
by data type.
Point pattern techniques include:
Quadrat analysis
Kernel estimation
Nearest neighbour analysis
Normally used to test the null hypothesis of complete
spatial randomness (i.e. a homogeneous Poisson
process), but can also examine heterogeneous Poisson
processes.
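Quadrat analysis can be sketched by counting points in a grid of cells and comparing the variance of the counts with their mean. Under complete spatial randomness the counts are approximately Poisson, so the ratio should be close to 1; values well above 1 suggest clustering and values well below 1 suggest regularity. The function names and the two tiny point sets are hypothetical.

```python
def quadrat_counts(points, nx, ny, width, height):
    """Count points in an nx-by-ny grid covering [0,width] x [0,height]."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the edge
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return [c for row in counts for c in row]

def index_of_dispersion(counts):
    """Variance-to-mean ratio of the quadrat counts."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

# Hypothetical patterns on the unit square, analysed with 2 x 2 quadrats.
regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clustered = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)]

iod_regular = index_of_dispersion(quadrat_counts(regular, 2, 2, 1.0, 1.0))
iod_clustered = index_of_dispersion(quadrat_counts(clustered, 2, 2, 1.0, 1.0))
```

The regular pattern gives a ratio of 0 (one point in every quadrat) and the clustered pattern gives 4 (all points in one quadrat). A real analysis would use many more points and a formal test.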
Spatially Continuous Data
Techniques used to explore field data.
Sometimes referred to as geostatistics.
Spatial moving averages
Trend surface analysis
Delaunay triangulation / Thiessen polygons / TINs
Kernel estimation (for the values at sample points)
Variograms / covariograms / kriging
Principal components analysis / factor analysis
Procrustes analysis
Cluster analysis
Canonical correlation
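As an illustration of one technique from this list, an empirical semivariogram can be sketched as half the average squared difference between values at pairs of sample points, grouped by separation distance. The function name, the binning scheme and the transect data are assumptions made for this sketch.

```python
import math
from itertools import combinations

def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Semivariance per distance bin [k*bin_width, (k+1)*bin_width).
    Bins containing no pairs are reported as None."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (ci, zi), (cj, zj) in combinations(zip(coords, values), 2):
        k = int(math.dist(ci, cj) / bin_width)
        if k < n_bins:
            sums[k] += (zi - zj) ** 2
            counts[k] += 1
    return [s / (2 * n) if n else None for s, n in zip(sums, counts)]

# Hypothetical transect with a steady increase in the measured value.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(coords, values, bin_width=1.0, n_bins=4)
```

For this transect the semivariance rises with distance (0.5, 2.0 and 4.5 at separations of 1, 2 and 3 units), which is what we would expect when nearby values are more alike than distant ones.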
Area Data
Techniques for analysing areal data (i.e. polygon
attributes) include:
Spatial moving averages
Kernel estimation
Spatial autocorrelation (Moran’s I, Geary’s c)
Spatial correlation and regression
Generalised linear models provide a family of techniques
for dealing with special types of data: e.g. counts
(Poisson regression), proportions (logistic regression).
Bayesian techniques often used to model rates based on
small numbers.
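Moran's I, mentioned above as a measure of spatial autocorrelation for areal data, can be sketched as follows. The binary adjacency weights and the two sets of values are hypothetical, and the function name morans_i is an assumption.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of area i from the mean and S0 is the
    sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Four areas in a row; each area is adjacent to its immediate neighbours.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], weights)        # similar neighbours
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], weights)  # dissimilar neighbours
```

The smoothly increasing values give a positive I (1/3), while the alternating values give I = -1, indicating negative spatial autocorrelation.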
Spatial Interaction Data
• Techniques for modelling spatial interactions are mostly
based on some variant of the gravity model.
• This postulates that the amount of interaction between two
places is a function of their sizes (measured using an
appropriate metric) and is inversely related to the distance
between them.
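A basic gravity model can be sketched as below. The specific functional form, with a scaling constant k and a distance-decay exponent beta, is one common variant and is an assumption here, as is the function name gravity_flow.

```python
def gravity_flow(size_i, size_j, distance, k=1.0, beta=2.0):
    """Predicted interaction T_ij = k * P_i * P_j / d_ij**beta."""
    return k * size_i * size_j / distance ** beta

# Hypothetical places with sizes 100 and 50.
flow_near = gravity_flow(100, 50, 10.0)  # 10 units apart
flow_far = gravity_flow(100, 50, 20.0)   # 20 units apart
```

With beta = 2, doubling the distance cuts the predicted interaction to a quarter (from 50 to 12.5 in this example). In practice k and beta would be estimated from observed flows.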
Software
• ArcGIS. Geostatistical Analyst is a step forward.
• Idrisi. The GIS Analysis | Statistics menu has a lot of options.
• S-Plus. The S+SpatialStats addon provides a lot of options.
• R. R is an open-source implementation of the S language,
on which S-Plus is based. There are a number of projects
currently developing tools for spatial statistics (e.g. sp,
spatstat, DCluster, spgwr).
• BUGS. Software for Bayesian statistics. There is a free
version for Windows (WinBUGS). Includes a spatial subset called GeoBUGS.