Methodological considerations in fine-scale spatial analysis: point pattern investigation of discarded syringes used in public injection of illicit drugs

Mapping and Analysis for Public Safety
September 2005, Savannah, Georgia
Luc de Montigny ([email protected])
University of Washington, Urban Design and Planning

Outline
Overview of the case study: the big picture
Issues associated with fine-scale analysis of point data
• Constrained opportunity surface
• Non-observations as data
Investigation of two novel approaches
• Kernel Density Estimate ratios
• Random Labeling (SPPA)
Discussion and conclusion

Luc de Montigny – MAPS – 2005

Context: the Case Study
Analyze the distribution of syringes found in the most active hard-drug use neighborhood of Montréal, Canada.
Research Questions
• Where do “needle-drops” cluster?
• Why are some areas more affected than others?
• How effective are current interventions (drop-boxes, NEP)?
Ultimate Goals
• Understand public injection behavior
• Inform CPTED initiatives
Syringes: n=4,172

Macro-Scale Analysis
Crime, like disease, is often analyzed for large areas (states, counties, cities).
Large extents usually mean low resolution (big units of analysis) and aggregation of data.
Discrete events are pooled; point values become area counts (points -> surface).
Traditional geo/spatial statistical analyses can be used. Underlying assumptions effectively hold.

Micro-Scale Analysis
There are compelling reasons to push analysis to finer spatial resolutions.
• Substantive (analysis scale = intervention scale)
• Methodological (MAUP)
Fine-scale analysis introduces new challenges to old tools.
• Why?
• What to do about it?

Methodological Implications of Micro-Scale Analysis
Crime events are not points sampled from a continuous surface; they represent observations of discrete events.
This is a different type of pattern resulting from different types of processes. This distinction has implications, two of which are discussed:
1. Non-observations constitute useful data
2. The area of opportunity may not be continuous

1) Non-Observations as Data
Assuming an exhaustive sampling strategy (e.g., documentation of all police reports), units of analysis that do not host an event represent a non-event. There is a difference between “zero” and “no data.”
• Problem: a useful source of information is ignored
• Proposed Solution: borrow case/control approaches developed in epidemiology

Using non-observations: Random Labeling
Comparison of the spatial distribution of events (cases) to the spatial distribution of non-events (controls).
• Cases: points where syringes were found
• Controls: random points where syringes were not found
Used to assess whether clustering in the events is greater than what is expected due to environmental heterogeneity.
• Here we use Ripley’s K function to summarize the spatial point patterns.
• $D(d) = K_{\text{cases}}(d) - K_{\text{controls}}(d)$

Random Labeling – Significance
To assess the significance of the difference between $K_{\text{cases}}(d)$ and $K_{\text{controls}}(d)$, generate simulation envelopes:
• pool the points (cases + controls);
• randomly assign “case” status to $n_{\text{case}}$ points;
• calculate the summary function;
• repeat X times.
The maximum and minimum values for each distance bin are taken from all iterations of the simulation.
Under the null hypothesis:
• $K_{\text{cases}}(d) = K_{\text{controls}}(d) = K_{\text{random assignment}}(d)$

Random Labeling – Results
A non-flat curve* indicates that the spatial distribution of cases differs from that of controls: clustering over and above that attributable to environmental heterogeneity.
Peaks outside the simulation envelope should be considered significant.
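
The random-labeling steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the analysis code behind the slides: the function names `k_function` and `random_labeling_envelope` are hypothetical, the K estimate applies no edge correction, and the study-area size `area` is a value the caller must supply.

```python
import numpy as np

def k_function(points, distances, area):
    """Naive Ripley's K estimate with NO edge correction (a simplifying assumption)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise distances between all events.
    diff = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    pair_d = d[~np.eye(n, dtype=bool)]  # ordered pairs with i != j
    counts = np.array([(pair_d <= r).sum() for r in distances])
    # K(d) = (1 / intensity) * mean number of other events within d of an event.
    return area * counts / (n * (n - 1))

def random_labeling_envelope(cases, controls, distances, area, n_sim=99, seed=0):
    """Observed D(d) = K_cases(d) - K_controls(d) and its min/max simulation envelope."""
    rng = np.random.default_rng(seed)
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    observed = k_function(cases, distances, area) - k_function(controls, distances, area)

    pooled = np.vstack([cases, controls])   # pool the points
    n_case = len(cases)
    sims = np.empty((n_sim, len(distances)))
    for s in range(n_sim):
        perm = rng.permutation(len(pooled))  # randomly reassign "case" status
        sim_cases = pooled[perm[:n_case]]
        sim_controls = pooled[perm[n_case:]]
        sims[s] = (k_function(sim_cases, distances, area)
                   - k_function(sim_controls, distances, area))
    # Envelope: minimum and maximum over all simulations, per distance bin.
    return observed, sims.min(axis=0), sims.max(axis=0)
```

Distances at which `observed` falls above the upper envelope or below the lower one correspond to the peaks the slides describe as significant. The full pairwise distance matrix is held in memory, so this sketch suits small samples only.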
Figure legend (Random Labeling – Results): K1 = observations (cases); K2 = non-observations (controls)
* $\hat{D}(d) = K_{\text{cases}}(d) - K_{\text{controls}}(d)$

2) Constrained Opportunity Surface
In many situations, events can occur in some spaces, but not others.
• Problem: increased likelihood of type II error
• Proposed Solution: constrain the opportunity surface to the area where events can be observed, i.e. explicitly define a spatial sample frame

Delimiting the Sample Frame
Where can syringes be found?
• Alleys and sidewalks
• Parks
• Parking lots and vacant lots
Sample frame ≈ 0.3 × study area

Cluster Analysis using Kernel Density Estimates (KDE)
KDE is a form of surface modeling: values are estimated for locations between data points.
KDE can be extended to estimate the intensity of one type of point data relative to another.
• In epidemiology, KDEs are calculated for both events (cases) and populations at risk (controls), to control for uneven distributions of population.
• This approach can be adapted for use here: density of events (distribution of syringes) can be normalized by density of opportunity (distribution of sample frame).

KDE – Syringe Points
The “smoothed” surface represents the intensity of discarded syringes within the search radius, or bandwidth (100 m), of any given location in the study area (i.e., for every grid cell).
Percent volume contour (PVC) lines represent the boundary of the area that contains 90% of the volume of a probability density distribution; on average, 90% of the points that were used to generate the KDE are contained within the lines.

KDE – Sample Frame
Here the sample frame is converted to a grid (10 m), and the centroid of each cell is used for the purposes of the kernel density estimation.
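
As an illustration of normalizing event density by opportunity density, the sketch below computes a kernel density surface on a regular grid and divides one surface by another. Several things here are assumptions rather than details from the slides: the quartic (biweight) kernel, the function names `quartic_kde` and `density_ratio`, and the `exponent` parameter, which is left open so that whichever power of the opportunity density the analysis calls for can be supplied.

```python
import numpy as np

def quartic_kde(points, grid_x, grid_y, bandwidth):
    """Quartic-kernel density on a grid; result has shape (len(grid_y), len(grid_x)).

    The kernel choice is an assumption; each point contributes unit volume.
    """
    pts = np.asarray(points, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros(gx.shape, dtype=float)
    norm = 3.0 / (np.pi * bandwidth ** 2)  # makes the kernel integrate to 1
    for px, py in pts:
        u2 = ((gx - px) ** 2 + (gy - py) ** 2) / bandwidth ** 2
        inside = u2 < 1.0                  # kernel is zero beyond the search radius
        density[inside] += norm * (1.0 - u2[inside]) ** 2
    return density

def density_ratio(event_density, opportunity_density, exponent=1.0):
    """Event density divided by opportunity density ** exponent.

    Cells with zero opportunity are set to NaN instead of dividing by zero.
    """
    ratio = np.full(event_density.shape, np.nan)
    mask = opportunity_density > 0
    ratio[mask] = event_density[mask] / opportunity_density[mask] ** exponent
    return ratio

# Hypothetical usage: a 10 m grid and a 100 m bandwidth, as in the slides.
grid = np.arange(0.0, 501.0, 10.0)
syringe_surface = quartic_kde([(250.0, 250.0), (260.0, 240.0)], grid, grid, 100.0)
frame_surface = quartic_kde([(x, 250.0) for x in grid], grid, grid, 100.0)
ratio_surface = density_ratio(syringe_surface, frame_surface)
```

Returning NaN where the opportunity density is zero keeps cells outside the reach of the sample frame from producing spurious infinite ratios.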

KDE – Syringe/Sample Ratio
The ratio surface represents, for each grid cell, the syringe point KDE value divided by the square of the sample frame KDE value.

KDE – Comparison
A comparison of how syringe points cluster in the study area (simple density estimate) with how those same points cluster within the sample space (the ratio between the two density estimates).
These results suggest that the distribution (clustering) of syringes is due to factors other than the distribution of opportunity.

Caveats and Limitations
Random Labeling
• The large departure from the envelope is due to mis-specifying the null hypothesis (only “proving” the obvious); a different null hypothesis should be used.
• The K function assumes stationarity, which is probably violated in this case; an inhomogeneous K function should be used.
Kernel Density Estimate ratios
• The intensity of opportunity (sample space density estimates) is measured in an arbitrary way. The choices of grid resolution and bandwidth strongly influence the density estimate, yet are not grounded in theory.
• Density surfaces should be “clipped” to areas within the sample frame for the purposes of visualization and analysis.

Summary
• Most events studied in criminology are the result of point processes (point patterns).
• Tools designed for the analysis of surfaces may not be appropriate for criminology.
• Popular analytic techniques have underlying assumptions that are violated at the micro-scale.
• Ignoring the above can produce erroneous results (type II error, model mis-specification).
Contact information: [email protected]

Acknowledgements
This research would not be possible without the hard work and collaboration of Spectre de rue, Montréal.

Selected Reading & Software References
• Gatrell AC, Bailey TC, Diggle PJ, Rowlingson BS (1996) Spatial point pattern analysis and its application in geographic epidemiology. Transactions of the Institute of British Geographers 21: 256-274.
• Walter C, McBratney AB, Viscarra Rossel RA, Markus JA (2005) Spatial point-process statistics: concepts and application to the analysis of lead contamination in urban soil. Environmetrics 16: 339-355.
• Beyer HL (2005) Hawth's Analysis Tools for ArcGIS. Available at http://www.spatialecology.com/htools
• Rowlingson BS, Diggle PJ, Bivand R (2005) The splancs package for R. Available at http://www.maths.lancs.ac.uk/~rowlings/Splancs
• Baddeley A, Turner R (2005) The spatstat package for R. Available at http://www.maths.uwa.edu.au/~adrian/spatstat
• Lewin-Koh NJ, Bivand R (2005) The maptools package for R. Available at http://cran.r-project.org

Appendix – Ripley’s K Summary Function
The K-function describes the degree to which there is spatial dependence in the arrangement of events:
$K(d) = \lambda^{-1}\, E[\text{number of events within distance } d \text{ of a randomly selected event}]$
where $\lambda$ is the intensity and $E[\cdot]$ the expectation.
Formally: [equation not available]
where
• R is the region (extent)
• I is a binary indicator function
• w is the proportion of the search radius that falls within R
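
The formal estimator itself is absent from the text above. For orientation only, a commonly used edge-corrected estimator that is consistent with the listed symbols (R, I, w) is given below. It is offered as an assumption and may differ from the expression on the original slide. The symbols $n$ (number of events), $d_{ij}$ (distance between events $i$ and $j$), $|R|$ (area of the region), and the subscripted weight $w_{ij}$ are introduced here.

```latex
\hat{K}(d) \;=\; \hat{\lambda}^{-1}\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j \neq i}\frac{I(d_{ij} \le d)}{w_{ij}}
\;=\; \frac{|R|}{n^{2}}\sum_{i=1}^{n}\sum_{j \neq i}\frac{I(d_{ij} \le d)}{w_{ij}},
\qquad \hat{\lambda} = \frac{n}{|R|}
```

Here $I(d_{ij} \le d)$ equals 1 when event $j$ lies within distance $d$ of event $i$ and 0 otherwise, and $w_{ij}$ is the edge-correction weight: the proportion of the circle centred on event $i$ and passing through event $j$ that falls within $R$. Dividing by $w_{ij}$ compensates for neighbours that cannot be observed beyond the boundary of the region.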