Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
6. Spatial Mining Spatial Data and Structures Images Spatial Mining Algorithms 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 1 Definitions • Spatial data is about instances located in a physical space • Spatial data has location or geo-referenced features • Some of these features are: – Address, latitude/longitude (explicit) – Location-based partitions in databases (implicit) 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 2 Applications and Problems • Geographic information systems (GIS) store information related to geographic locations on Earth – Weather, community infrastructure needs, disaster management, and hazardous waste • Homeland security issues such as prediction of unexpected events and planning of evacuation • Remote sensing and image classification • Biomedical applications include medical imaging and illness diagnosis 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 3 Use of Spatial Data • Map overlay – merging disparate data – Different views of the same area: (Level 1) streets, power lines, phone lines, sewer lines, (Level 2) actual elevations, building locations, and rivers • Spatial selection – find all houses near WSU • Spatial join – nearest for points, intersection for areas • Other basic spatial operations – Region/range query for objects intersecting a region – Nearest neighbor query for objects closest to a given place – Distance scan asking for objects within a certain radius 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 4 Spatial Data Structures • Minimum bounding rectangles (MBR) • Different tree structures – Quad tree – R-Tree – kd-Tree • Image databases 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 5 MBR • Representing a spatial object by the smallest rectangle [(x1,y1), (x2,y2)] or rectangles (x2,y2) (x1,y1) 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 6 Tree Structures • Quad Tree: every four quadrants in one layer forms a parent quadrant in an upper layer – An example 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 7 R-Tree • Indexing MBRs in a tree – An R-tree of order m has at most m entries in one node – An example (order of 3) R6 R8 R8 R1 R6 R7 R7 R2 R3 7/03 R4 R5 R1 R2 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) R3 R4 R5 8 kd-Tree • Indexing multi-dimensional data, one dimension for a level in a tree – An example 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 9 Common Tasks dealing with Spatial Data • Data focusing – Spatial queries – Identifying interesting parts in spatial data – Progress refinement can be applied in a tree structure • Feature extraction – Extracting important/relevant features for an application • Classification or others – Using training data to create classifiers – Many mining algorithms can be used • Classification, clustering, associations 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 10 Spatial Mining Tasks • Spatial classification • Spatial clustering • Spatial association rules 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 11 Spatial Classification • Use spatial information at different (coarse/fine) levels (different indexing trees) for data focusing • Determine relevant spatial or non-spatial features • Perform normal supervised learning algorithms – e.g., Decision trees, 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 12 Spatial Clustering • • • • 7/03 Use tree structures to index spatial data DBSCAN: R-tree CLIQUE: Grid or Quad tree Clustering with spatial constraints (obstacles need to adjust notion of distance) Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 13 Spatial Association Rules • Spatial objects are of major interest, not transactions • AB – A, B can be either spatial or non-spatial (3 combinations) – What is the fourth combination? • Association rules can be found w.r.t. the 3 types 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 14 Summary • Spatial data can contain both spatial and nonspatial features. • When spatial information becomes dominant interest, spatial data mining should be applied. • Spatial data structures can facilitate spatial mining. • Standard data mining algorithms can be modified for spatial data mining, with a substantial part of preprocessing to take into account of spatial information. 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 15 Bibliography • M. H. Dunham. Data Mining – Introductory and Advanced Topics. Prentice Hall. 2003. • R.O. Duda, P.E. Hart, D.G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience. • J. Han and M. Kamber. Data Mining – Concepts and Techniques. 2001. Morgan Kaufmann. 7/03 Spatial Data Mining G Dong (WSU) & H. Liu (ASU) 16