Title
Methodological considerations in fine-scale spatial analysis:
point pattern investigation of discarded syringes used in
public injection of illicit drugs
Mapping and Analysis for Public Safety
September 2005
Savannah, Georgia
Luc de Montigny ([email protected])
University of Washington
Urban Design and Planning
Outline
Overview of the case study – the big picture
Issues associated with fine-scale analysis of point data
• Constrained opportunity surface
• Non-observations as data
Investigation of two novel approaches
• Kernel Density Estimate ratios
• Random Labeling (SPPA)
Discussion and conclusion
Luc de Montigny – MAPS – 2005
Context: the Case Study
Analyze the distribution of syringes found in the most
active hard-drug use neighborhood of Montréal, Canada.
Research Questions
• Where do “needle-drops” cluster?
• Why are some areas more affected
than others?
• How effective are current
interventions (drop-boxes, NEP)?
Ultimate Goals
• Understand public injection
behavior
• Inform CPTED (Crime Prevention Through Environmental Design) initiatives
Syringes: n=4,172
Macro-Scale Analysis
Crime – like disease – is often
analyzed for large areas (states,
counties, cities).
Large extents usually mean low
resolution (big units of analysis) and
aggregation of data.
Discrete events are pooled; point
values become area counts
(points -> surface).
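A minimal sketch of this pooling step, assuming hypothetical coordinates in metres and square cells. The function name `grid_counts` and the cell sizes are illustrative only and do not come from the study:

```python
from collections import Counter

def grid_counts(points, cell_size):
    """Pool point events into square cells; returns {(col, row): count}."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts

# Hypothetical event coordinates (metres), not data from the case study.
events = [(12.0, 7.5), (14.2, 3.1), (88.0, 91.0), (15.9, 9.9)]

coarse = grid_counts(events, cell_size=100.0)  # one large unit of analysis
fine = grid_counts(events, cell_size=20.0)     # smaller units keep more detail
```

With the coarse grid all four events fall into a single cell, so their arrangement within that cell is lost. The finer grid still separates the isolated event from the other three.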
Traditional geo/spatial statistical
analyses can be used. Underlying
assumptions effectively hold.
Micro-Scale Analysis
There are compelling reasons to push analysis to finer
spatial resolutions.
• Substantive (analysis scale = intervention scale)
• Methodological (MAUP)
Fine-scale analysis introduces new challenges to old tools.
• Why?
• What to do about it?
Methodological Implications of Micro-Scale Analysis
Crime events are not points sampled from a continuous
surface; they represent observations of discrete events.
This is a different type of pattern resulting from different
types of processes.
This distinction has implications, two of which are
discussed:
1. Non-observations constitute useful data
2. The area of opportunity may not be continuous
1) Non-Observations as Data
Assuming an exhaustive sampling strategy (e.g.,
documentation of all police reports), units of analysis that
do not host an event represent a non-event.
There is a difference between “zero” and “no data.”
• Problem: a useful source
of information is ignored
• Proposed Solution:
borrow case/control
approaches developed in
epidemiology
Using non-observations: Random Labeling
Comparison of the spatial distribution of events (cases) to
the spatial distribution of non-events (controls).
• Cases: points where syringes were found
• Controls: random points where syringes were not found
Used to assess whether clustering in the events is greater
than what is expected due to environmental heterogeneity.
• Here we use Ripley’s K function to summarize the
spatial point patterns.
• D(d) = K_{cases}(d) - K_{controls}(d)
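The difference of K functions can be sketched as follows. This is a naive estimate without edge correction, and the normalisation by n squared is an assumption. The point coordinates and the study-area size are hypothetical:

```python
import math

def ripley_k(points, d, area):
    """Naive Ripley's K at distance d (no edge correction)."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct events no more than d apart.
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    return area * pairs / n ** 2

def k_difference(cases, controls, d, area):
    """D(d) = K_cases(d) - K_controls(d)."""
    return ripley_k(cases, d, area) - ripley_k(controls, d, area)

# Hypothetical points: tightly grouped cases, widely spaced controls.
cases = [(1, 1), (1, 2), (2, 1), (2, 2)]
controls = [(0, 0), (0, 10), (10, 0), (10, 10)]
d_value = k_difference(cases, controls, d=1.5, area=100.0)
```

A positive `d_value` means the cases are more clustered at that distance than the controls. Whether the difference is significant is a separate question.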
Random Labeling – Significance
To assess the significance of the difference between
K_{cases}(d) and K_{controls}(d), generate simulation envelopes:
• pool the points (cases + controls);
• randomly assign “case” status to n_{cases} of the pooled points;
• calculate the summary function;
• repeat many times.
• The maximum and minimum values for each distance
bin are taken from all iterations of the simulation.
Under the null hypothesis:
• K_{cases}(d) = K_{controls}(d) = K_{random assignment}(d)
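The simulation steps above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: K is estimated naively without edge correction, the 99 simulations and the seed are arbitrary choices, and all coordinates are hypothetical.

```python
import math
import random

def ripley_k(points, d, area):
    """Naive K estimate at distance d (no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= d
    )
    return area * pairs / n ** 2

def k_difference(cases, controls, d, area):
    return ripley_k(cases, d, area) - ripley_k(controls, d, area)

def random_labeling_envelope(cases, controls, distances, area, n_sim=99, seed=0):
    """Min/max of D(d) per distance bin under random relabeling."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)      # pool the points
    n_cases = len(cases)
    lower = [math.inf] * len(distances)
    upper = [-math.inf] * len(distances)
    for _ in range(n_sim):
        rng.shuffle(pooled)                    # random "case" assignment
        sim_cases, sim_controls = pooled[:n_cases], pooled[n_cases:]
        for k, d in enumerate(distances):
            diff = k_difference(sim_cases, sim_controls, d, area)
            lower[k] = min(lower[k], diff)
            upper[k] = max(upper[k], diff)
    return lower, upper

# Hypothetical data: a tight group of cases, controls on a regular lattice.
cases = [(40 + dx, 40 + dy) for dx in (-2, 0, 2) for dy in (-2, 0, 2)]
controls = [(x, y) for x in (10, 30, 50, 70, 90) for y in (10, 30, 50, 70, 90)]
distances = [5.0, 10.0, 20.0]
area = 100.0 * 100.0

observed = [k_difference(cases, controls, d, area) for d in distances]
lower, upper = random_labeling_envelope(cases, controls, distances, area)
```

Comparing `observed[k]` with `lower[k]` and `upper[k]` shows, for each distance bin, whether the observed difference falls outside the range produced by random relabeling.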
Random Labeling – Results
A non-flat curve* indicates a
difference between the spatial
distribution of cases and the
distribution of controls:
clustering over and above
that of environmental
heterogeneity.
Peaks outside the simulation
envelope should be
considered significant.
K1: observations (cases)
K2: non-observations (controls)
* \hat{D}(d) = K_{cases}(d) - K_{controls}(d)
2) Constrained Opportunity Surface
In many situations, events
can occur in some spaces,
but not others.
• Problem: increased
likelihood of type II error
• Proposed Solution:
constrain the opportunity
surface to the area where
events can be observed,
i.e. explicitly define a
spatial sample frame
Delimiting the Sample Frame
Where can syringes be found?
• Alleys and sidewalks
• Parks
• Parking lots and vacant lots
Sample frame ≈ 0.3 × study area
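One way to draw random points only from the sample frame is sketched below. It assumes the frame can be approximated by non-overlapping axis-aligned rectangles. The rectangles, the 100 m by 100 m study area, and the function name `sample_in_frame` are all hypothetical. Removing candidate points that coincide with a found syringe is not shown.

```python
import random

def sample_in_frame(rectangles, n, seed=0):
    """Draw n uniform random points from a union of non-overlapping
    rectangles given as (xmin, ymin, xmax, ymax)."""
    rng = random.Random(seed)
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in rectangles]
    points = []
    # Pick a rectangle with probability proportional to its area,
    # then a uniform location inside it.
    for x0, y0, x1, y1 in rng.choices(rectangles, weights=areas, k=n):
        points.append((rng.uniform(x0, x1), rng.uniform(y0, y1)))
    return points

# Hypothetical frame: two alleys, a park and a parking lot (metres).
frame = [(0, 0, 100, 10), (40, 10, 50, 100), (60, 20, 90, 50), (0, 60, 20, 70)]
study_area = 100 * 100

frame_share = sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in frame) / study_area
candidate_controls = sample_in_frame(frame, n=200)
```

In this toy layout the frame covers 0.3 of the study area, chosen to mirror the proportion reported above.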
Cluster Analysis using Kernel Density Estimates (KDE)
KDE is a form of surface modeling – values are estimated
for locations between data points.
KDE can be extended to estimate the intensity of one type
of point data relative to another.
• In epidemiology, KDE are calculated for both events
(cases) and for populations at risk (controls), to control
for uneven distributions of population.
• This approach can be adapted for use here: density of
events (distribution of syringes) can be normalized by
density of opportunity (distribution of sample frame).
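A minimal sketch of a kernel density estimate at a single location follows. The kernel function used in the study is not stated in these slides, so the quartic (biweight) kernel here is an assumption, as are the point coordinates:

```python
import math

def quartic_kde(location, points, bandwidth):
    """Quartic-kernel intensity estimate at `location` (events per unit area).

    Only points closer than `bandwidth` contribute to the estimate.
    """
    x, y = location
    h2 = bandwidth ** 2
    total = 0.0
    for px, py in points:
        u = ((x - px) ** 2 + (y - py) ** 2) / h2  # squared distance / h^2
        if u < 1.0:
            total += (1.0 - u) ** 2
    return 3.0 * total / (math.pi * h2)

# Hypothetical syringe locations (metres).
syringes = [(0.0, 0.0), (30.0, 40.0), (500.0, 500.0)]
density_at_origin = quartic_kde((0.0, 0.0), syringes, bandwidth=100.0)
```

The same function can be applied to event points and to points that represent opportunity, giving the two surfaces whose relative intensity is of interest.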
KDE – Syringe Points
The “smoothed” surface
represents the intensity of
discarded syringes within the
search radius, or bandwidth
(100m) of any given location in
the study area (i.e., for every
grid cell).
Percent volume contour (PVC) lines
represent the boundary of the area that
contains 90% of the volume of the
density surface; on average, 90% of the
points used to generate the KDE fall
within the lines.
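The density level at which such a contour would be drawn can be sketched as follows, assuming the surface is stored as a list of values for equal-sized grid cells. The function name and the sample values are hypothetical:

```python
def volume_contour_level(densities, fraction=0.9):
    """Density value whose contour encloses `fraction` of the total volume.

    Cells are accumulated from the highest density downwards until the
    requested share of the summed density is reached.
    """
    total = sum(densities)
    running = 0.0
    for value in sorted(densities, reverse=True):
        running += value
        if running >= fraction * total:
            return value
    return 0.0

# Hypothetical density values for five equal-sized grid cells.
level_90 = volume_contour_level([5.0, 3.0, 1.5, 0.3, 0.2], fraction=0.9)
```

Cells with a density at or above `level_90` lie inside the 90% contour.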
KDE – Sample Frame
Here the sample frame is converted to
a grid (10m), and the centroid of each
cell is used for the purposes of the
kernel density estimation.
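A sketch of this conversion, assuming the sample frame is approximated by axis-aligned rectangles. The rectangles and the function name `frame_centroids` are hypothetical; only the 10 m cell size comes from the text above:

```python
import math

def frame_centroids(rectangles, cell=10.0):
    """Centroids of grid cells whose centre lies inside any rectangle.

    Rectangles are (xmin, ymin, xmax, ymax); the grid covers their
    bounding box.
    """
    xmin = min(r[0] for r in rectangles)
    ymin = min(r[1] for r in rectangles)
    xmax = max(r[2] for r in rectangles)
    ymax = max(r[3] for r in rectangles)
    ncols = math.ceil((xmax - xmin) / cell)
    nrows = math.ceil((ymax - ymin) / cell)
    centroids = []
    for col in range(ncols):
        for row in range(nrows):
            cx = xmin + (col + 0.5) * cell
            cy = ymin + (row + 0.5) * cell
            if any(x0 <= cx < x1 and y0 <= cy < y1
                   for x0, y0, x1, y1 in rectangles):
                centroids.append((cx, cy))
    return centroids

# Hypothetical frame: one east-west alley and one north-south alley (metres).
frame = [(0, 0, 100, 10), (40, 10, 50, 100)]
opportunity_points = frame_centroids(frame, cell=10.0)
```

The resulting centroids can then be passed to a kernel density estimator in the same way as the syringe points.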
KDE – Syringe/Sample Ratio
The ratio surface represents, for each
grid cell, the syringe point KDE value
divided by the square of the sample
frame KDE value.
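The cell-by-cell calculation can be sketched as below, following the definition above, including the squared denominator. The dictionary layout, the function name, and the density values are hypothetical. Cells with no opportunity are returned as `None`, which is one possible way of handling a zero denominator:

```python
def ratio_surface(syringe_kde, frame_kde):
    """Syringe KDE divided by the squared sample-frame KDE, per grid cell.

    Both arguments map a grid cell id to its KDE value.
    """
    ratios = {}
    for cell, events in syringe_kde.items():
        opportunity = frame_kde.get(cell, 0.0)
        if opportunity > 0.0:
            ratios[cell] = events / opportunity ** 2
        else:
            ratios[cell] = None  # ratio undefined where there is no opportunity
    return ratios

# Hypothetical KDE values for three grid cells.
syringe_kde = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.0}
frame_kde = {(0, 0): 2.0, (0, 1): 0.5, (1, 0): 0.0}
ratios = ratio_surface(syringe_kde, frame_kde)
```

In this toy example the cell with the lower raw syringe density has the higher ratio, because it also has much less opportunity.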
KDE – Comparison
A comparison of how
syringe points cluster in the
study area (simple density
estimate), to how those
same points cluster within
the sample space (the ratio
between the two density
estimates).
These results suggest that
the distribution (clustering)
of syringes is due to factors
other than the distribution
of opportunity.
Caveats and Limitations
Random Labeling
• Huge departure from envelope is due to mis-specifying
the null hypothesis (only “proving” the obvious) – should
use a different null hypothesis.
• K function assumes stationarity, which is probably violated
in this case – should use an inhomogeneous K function.
Kernel Density Estimate ratios
• The intensity of opportunity (sample space density
estimates) is measured in an arbitrary way. The choices of
grid resolution and bandwidth strongly influence the
density estimate, yet neither is grounded in theory.
• Density surfaces should be “clipped” to areas within the
sample frame for the purposes of visualization and
analysis.
Summary
• Most events studied in criminology are the result of point
processes (point patterns).
• Tools designed for the analysis of surfaces may not be
appropriate for criminology.
• Popular analytic techniques have underlying assumptions
that are violated at the micro-scale.
• Ignoring the above can result in erroneous results (type II
error, model mis-specification)
Contact information:
[email protected]
Acknowledgements
This research would not be possible without the hard work and
collaboration of Spectre de rue, Montréal.
Selected Reading & Software References
• Gatrell AC, Bailey TC, Diggle PJ, Rowlingson BS (1996) Spatial point
pattern analysis and its application in geographic epidemiology.
Transactions of the Institute of British Geographers 21: 256-274
• Walter C, McBratney AB, Viscarra Rossel RA, Markus JA (2005) Spatial
point-process statistics: concepts and application to the analysis of
lead contamination in urban soil. Environmetrics 16: 339-355
• Beyer HL (2005) Hawth's Analysis Tools for ArcGIS.
Available at http://www.spatialecology.com/htools
• Rowlingson BS, Diggle PJ, Bivand R (2005) The splancs package for R.
Available at http://www.maths.lancs.ac.uk/~rowlings/Splancs
• Baddeley A, Turner R (2005) The spatstat package for R.
Available at http://www.maths.uwa.edu.au/~adrian/spatstat
• Lewin-Koh NJ, Bivand R (2005) The maptools package for R.
Available at http://cran.r-project.org
Appendix – Ripley’s K Summary Function
The K-function describes the degree to which there is spatial dependence in
the arrangement of events
K(d) = λ^{-1} E[number of events within distance d of a randomly selected event]
Where λ is the intensity, and E[] the expectation
Formally:
Where,
• R is the region (extent)
• I is a binary indicator function
• w is the proportion of the search radius that falls within R
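For reference, a commonly used edge-corrected estimator that is consistent with the symbols listed above is sketched here. The notation n (number of events), |R| (area of R), d_{ij} (distance between events i and j) and the subscripts on w are assumptions introduced for this sketch. The exact form used in the analysis may differ.

```latex
\hat{K}(d) = \frac{|R|}{n^{2}} \sum_{i=1}^{n} \sum_{j \neq i} \frac{I(d_{ij} \le d)}{w_{ij}}
```

Dividing each indicator by w_{ij} up-weights pairs near the boundary of R, where part of the search area falls outside the region and neighbouring events cannot be observed.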