Predicting Locations Using Map Similarity (PLUMS):
A Framework for Spatial Data Mining
Sanjay Chawla (Vignette Corporation)
Shashi Shekhar, Weili Wu (CS, Univ. of Minnesota)
Uygar Ozesmi (Erciyes University, Turkey)
http://www.cs.umn.edu/research/shashi-group
Outline
• Motivation
• Application Domain
• Distinguishing characteristics of spatial data
mining
• Problem Definition
• Spatial Statistics Approach
• Our approach: PLUMS
• Experiments, Results, Conclusion and Future
Work
Motivation
• Historical Examples of Spatial Data
Exploration
– Asiatic Cholera, 1855
– Theory of Gondwanaland
– Effect of fluoride on Dental Hygiene
• A potential application in news
– Tracking the West Nile Virus
Application Domain
• Wetland Management: Predicting locations
of bird (red-winged blackbird) nests in
wetlands
• Why did we choose this application?
– Strong spatial component
– Domain Expertise
– Classical Data Mining techniques (logistic
regression, neural nets) had already been applied
Application Domain (continued)
[Figure panels: Nest Locations, Vegetation Durability, Distance to Open Water, Water Depth]
Unique characteristics of spatial
data mining
Spatial Autocorrelation Property
Unique characteristics of spatial data mining (continued)
Average Distance to Nearest Prediction (ADNP):
$$\mathrm{ADNP}(A, P) = \frac{1}{K} \sum_{k=1}^{K} d\big(A_k,\ A_k.\mathrm{nearest}(P)\big)$$
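The ADNP formula can be illustrated with a minimal Python sketch. It assumes that A holds the K actual locations, that P holds the predicted locations, and that d is Euclidean distance; the slide does not fix the metric, and the function name `adnp` and the sample coordinates are hypothetical.

```python
import math


def adnp(actual, predicted):
    """Average Distance to Nearest Prediction.

    actual, predicted: lists of (x, y) coordinate tuples.
    Assumption: d is Euclidean distance (not stated on the slide).
    """
    if not actual or not predicted:
        raise ValueError("both location sets must be non-empty")
    total = 0.0
    for a in actual:
        # Distance from this actual location to its nearest predicted location.
        total += min(math.dist(a, p) for p in predicted)
    return total / len(actual)


# Hypothetical toy data: two actual sites, two predicted sites.
actual_sites = [(0, 0), (3, 4)]
predicted_sites = [(0, 1), (3, 0)]
print(adnp(actual_sites, predicted_sites))  # (1 + 4) / 2 = 2.5
```

A smaller ADNP means the predicted locations lie closer, on average, to the actual ones, so a prediction that misses a site by one cell is penalized less than one that misses it by many cells.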
Location Prediction: Problem Formulation
• Given: A spatial framework S.
– Explanatory functions, $f_{X_k} : S \to R^{k}$
– Dependent function, $f_Y : S \to R^{Y} = \{0,1\}$
– A family F of learning model function mappings
• Find an element $\hat{f}_Y \in F : R^{k} \times \dots \times R^{k} \to R^{Y}$
• Objective: maximize (map_similarity = classification_accuracy +
spatial_accuracy)
• Constraints: spatial autocorrelation exists
Spatial Statistics Approach
1. $y = X\beta + \epsilon$
2. $y = \rho W y + X\beta + \epsilon$
2” Logistic Regression: $\mathrm{Prob}(y = 1) = \dfrac{e^{X\beta}}{1 + e^{X\beta}}$
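To make the difference between the models concrete, the sketch below evaluates the classical term Xβ, the spatial term ρWy, and the logistic transform e^z / (1 + e^z) on a toy example. It assumes W is a row-normalized neighborhood matrix, so that Wy is the average of y over each site's neighbors; the slide does not say how W is built. The four-site layout, coefficients, and function names are hypothetical.

```python
import math


def spatial_lag(y, neighbors):
    """Compute Wy for a row-normalized W: the mean of y over each site's neighbors.

    neighbors[i] lists the indices of the sites adjacent to site i.
    """
    return [sum(y[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
            for nbrs in neighbors]


def linear_predictor(X, beta):
    """Compute X * beta, one row of X per site."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def logistic(z):
    """e^z / (1 + e^z), written in the equivalent form 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data: four sites on a line, each adjacent to its immediate neighbors.
neighbors = [[1], [0, 2], [1, 3], [2]]
y = [1, 1, 0, 0]                      # observed nest presence
X = [[1.0, 0.9], [1.0, 0.7], [1.0, 0.2], [1.0, 0.1]]  # intercept + one feature
beta = [-1.0, 2.0]                    # assumed coefficients
rho = 0.8                             # assumed spatial dependence strength

xb = linear_predictor(X, beta)
wy = spatial_lag(y, neighbors)
classical = [logistic(z) for z in xb]                          # ignores neighbors
spatial = [logistic(rho * w + z) for w, z in zip(wy, xb)]      # adds rho * Wy
print(wy)  # [1.0, 0.5, 0.5, 0.0]
print(classical)
print(spatial)
```

With ρ = 0 the two probability lists coincide, which is one way to see that the classical model is the special case with no spatial autocorrelation.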
Spatial Stat: Solution Techniques
• Least Squares Estimation: Biased and
Inconsistent
• Maximum Likelihood: Involves
computation of a large determinant (from W)
• Bayesian: Markov Chain Monte Carlo (e.g.,
Gibbs Sampling)
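The cost of the maximum likelihood route can be sketched as follows. For the spatial autoregressive model, the log-likelihood contains the term ln|I − ρW|, where I − ρW is an n × n matrix for n sites. The code below is an illustrative dense computation using Gaussian elimination, which takes O(n³) time; it is not the solver used in the presentation, and the function names and the two-site W are hypothetical.

```python
import math


def log_abs_det(matrix):
    """ln|det(matrix)| via Gaussian elimination with partial pivoting, O(n^3)."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    log_det = 0.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        log_det += math.log(abs(a[col][col]))
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return log_det


def sar_log_jacobian(W, rho):
    """ln|I - rho * W|, the determinant term in the SAR log-likelihood."""
    n = len(W)
    m = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)]
         for i in range(n)]
    return log_abs_det(m)


# Hypothetical two-site example: each site is the other's only neighbor.
W = [[0.0, 1.0], [1.0, 0.0]]
print(sar_log_jacobian(W, 0.5))  # ln(0.75)
```

Because an optimizer re-evaluates this term for every candidate ρ, the cubic cost is paid repeatedly, which is why the determinant dominates the running time when the number of sites is large.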
Our Approach
Experiment Setup
Result(1)
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
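The two rates, TPR = TP / (TP + FN) and FPR = FP / (FP + TN), can be computed from binary labels as in the sketch below. The function name `roc_point` and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the slides specify.

```python
def roc_point(actual, predicted):
    """Return (TPR, FPR) for two equal-length sequences of 0/1 labels."""
    if len(actual) != len(predicted):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # nest present, predicted present
        elif a == 1 and p == 0:
            fn += 1          # nest present, predicted absent
        elif a == 0 and p == 1:
            fp += 1          # no nest, predicted present
        else:
            tn += 1          # no nest, predicted absent
    # Assumed convention: report 0.0 when a class is absent from the data.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels for eight cells: TP=2, FN=1, FP=1, TN=4.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
tpr, fpr = roc_point(actual, predicted)
print(tpr, fpr)
```

Sweeping the classification threshold of a model and plotting the resulting (FPR, TPR) pairs gives an ROC curve, which is a common way to compare classifiers on these two rates.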
Result(2)
Conclusion and Future work
• PLUMS >> Classical Data Mining
techniques
• PLUMS ≈ State-of-the-art Spatial
Statistics approaches
• Better performance (two orders of
magnitude)
• Try other configurations of the PLUMS
framework and formalize!