Download CSE591 Data Mining

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
6. Spatial Mining
Spatial Data and Structures
Images
Spatial Mining Algorithms
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
1
Definitions
• Spatial data is about instances located in a
physical space
• Spatial data has location or geo-referenced
features
• Some of these features are:
– Address, latitude/longitude (explicit)
– Location-based partitions in databases (implicit)
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
2
Applications and Problems
• Geographic information systems (GIS) store
information related to geographic locations on Earth
– Weather, community infrastructure needs, disaster
management, and hazardous waste
• Homeland security issues such as prediction of
unexpected events and planning of evacuation
• Remote sensing and image classification
• Biomedical applications include medical imaging
and illness diagnosis
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
3
Use of Spatial Data
• Map overlay – merging disparate data
– Different views of the same area: (Level 1) streets, power
lines, phone lines, sewer lines, (Level 2) actual elevations,
building locations, and rivers
• Spatial selection – find all houses near WSU
• Spatial join – nearest for points, intersection for areas
• Other basic spatial operations
– Region/range query for objects intersecting a region
– Nearest neighbor query for objects closest to a given place
– Distance scan asking for objects within a certain radius
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
4
Spatial Data Structures
• Minimum bounding rectangles (MBR)
• Different tree structures
– Quad tree
– R-Tree
– kd-Tree
• Image databases
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
5
MBR
• Representing a spatial object by the smallest
rectangle [(x1,y1), (x2,y2)] or rectangles
(x2,y2)
(x1,y1)
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
6
Tree Structures
• Quad Tree: every four quadrants in one layer
forms a parent quadrant in an upper layer
– An example
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
7
R-Tree
• Indexing MBRs in a tree
– An R-tree of order m has at most m entries in one node
– An example (order of 3)
R6
R8
R8
R1
R6
R7
R7
R2
R3
7/03
R4
R5
R1
R2
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
R3
R4
R5
8
kd-Tree
• Indexing multi-dimensional data, one dimension
for a level in a tree
– An example
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
9
Common Tasks dealing with Spatial Data
• Data focusing
– Spatial queries
– Identifying interesting parts in spatial data
– Progress refinement can be applied in a tree structure
• Feature extraction
– Extracting important/relevant features for an
application
• Classification or others
– Using training data to create classifiers
– Many mining algorithms can be used
• Classification, clustering, associations
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
10
Spatial Mining Tasks
• Spatial classification
• Spatial clustering
• Spatial association rules
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
11
Spatial Classification
• Use spatial information at different (coarse/fine)
levels (different indexing trees) for data focusing
• Determine relevant spatial or non-spatial features
• Perform normal supervised learning algorithms
– e.g., Decision trees,
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
12
Spatial Clustering
•
•
•
•
7/03
Use tree structures to index spatial data
DBSCAN: R-tree
CLIQUE: Grid or Quad tree
Clustering with spatial constraints (obstacles 
need to adjust notion of distance)
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
13
Spatial Association Rules
• Spatial objects are of major interest, not transactions
• AB
– A, B can be either spatial or non-spatial (3 combinations)
– What is the fourth combination?
• Association rules can be found w.r.t. the 3 types
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
14
Summary
• Spatial data can contain both spatial and nonspatial features.
• When spatial information becomes dominant
interest, spatial data mining should be applied.
• Spatial data structures can facilitate spatial mining.
• Standard data mining algorithms can be modified
for spatial data mining, with a substantial part of
preprocessing to take into account of spatial
information.
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
15
Bibliography
• M. H. Dunham. Data Mining – Introductory and
Advanced Topics. Prentice Hall. 2003.
• R.O. Duda, P.E. Hart, D.G. Stork. Pattern
Classification, 2nd edition. Wiley-Interscience.
• J. Han and M. Kamber. Data Mining – Concepts
and Techniques. 2001. Morgan Kaufmann.
7/03
Spatial Data Mining
G Dong (WSU) & H. Liu (ASU)
16
Related documents