Spatial Query Processing for High Resolutions
Hans-Peter Kriegel, Martin Pfeifle, Marco Pötke, Thomas Seidl
Database Group, Institute for Computer Science, University of Munich, Germany
8th International DASFAA-Conference, 26 - 28 March, 2003, Kyoto, Japan
Martin Pfeifle, Database Group, University of Munich
Kyoto, 03/26/03

Outline of the Talk
1.) Introduction
    - Spatial Databases
    - Voxelized Objects
    - High Resolutions
2.) RI-Tree
3.) HRI-Approach
4.) Experimental Evaluation
5.) Conclusions

Spatial Database Management Systems
Queries and data:
    - box queries
    - collision queries
    - complex spatial objects
System Requirements:
    - Effectiveness
    - Efficiency
    - Scalability
    - Concurrency Control
    - Recovery
Spatial Database Management Systems (based on extensible ORDBMS)

Voxelized Spatial Objects
1.) linearization of the data space
    - grid-approximation
    - space filling curve
2.) interval sequence
    - bottom-up or top-down
    - size-bound or error-bound
[Figure: triangulated objects, voxel sequence, interval sequence]

Query Processing for High Resolutions
[Figure: CAD system on top of an ORDBMS with a high-resolution spatial DB]
    - Filter Step (e.g. RI-Tree, HRI-Approach)
    - Candidate Set for the Application-Specific Refinement Step
    - Refinement Step
    - Result

Outline of the Talk
1.) Introduction
2.) RI-Tree
    - Relational Interval Tree [VLDB 00]
    - Extensible Indexing
3.) HRI-Approach
4.) Experimental Evaluation
5.) Conclusions
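As a rough illustration of the Voxelized Spatial Objects slide from the introduction, the two steps (linearization of the data space, then building an interval sequence) might look as follows. The slides do not say which space filling curve is used, so the Z-order (Morton) curve here is an assumption, and the names `z_value` and `to_intervals` are hypothetical.

```python
def z_value(x, y, z, bits):
    """Interleave the bits of a voxel (x, y, z) into one Z-order key.

    Assumption: a Z-order curve stands in for the unspecified
    space filling curve of the talk.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def to_intervals(values):
    """Merge curve positions into maximal runs (lower, upper)."""
    intervals = []
    for v in sorted(set(values)):
        if intervals and v == intervals[-1][1] + 1:
            # v directly continues the last run, so extend it
            intervals[-1] = (intervals[-1][0], v)
        else:
            intervals.append((v, v))
    return intervals


# Five voxels on a 2 x 2 x 2 grid (1 bit per dimension).
voxels = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1)]
keys = [z_value(x, y, z, bits=1) for x, y, z in voxels]
interval_sequence = to_intervals(keys)
```

Five voxels collapse into two intervals here, which is the property the later slides rely on: an object is stored as a sequence of intervals rather than as individual voxels.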
Relational Interval Tree (RI-Tree)
Foundation: Interval Tree [Edelsbrunner 1980]
[Figure: intervals A, B, C, D over the data space 1 ... 15 and the corresponding interval tree with root 8]
1. Idea: Virtualization of the Primary Structure
    - root = 2^(h-1), data space 1 ... 2^h - 1
2. Idea: Managing the Secondary Structure with 2 B+-trees

    lowerIndex                upperIndex
    node  lower  id           node  upper  id
    4     1      b            4     7      b
    8     3      a            8     12     c
    8     5      c            8     15     a
    13    13     d            13    13     d

RI-Tree: Intersection Query
1. Procedural Step
2. Declarative Step

1. Procedural Step
    - arithmetic traversal of the primary structure
    - collecting the visited nodes in transient tables
    - number of I/O accesses: 0
    Example: query Q with lower = 22 and upper = 25; root = 16, fork = 24;
    leftNodes = {16, 20}, rightNodes = {28, 26}, covered node range 22-25

    select id from upperIndex i, :leftNodes left
    where i.node = left.node and i.upper >= :Q.lower
    union all
    select id from lowerIndex i, :rightNodes right
    where i.node = right.node and i.lower <= :Q.upper
    union all
    select id from upperIndex i
    where i.node between :Q.lower and :Q.upper
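The procedural step of the intersection query can be sketched as a purely arithmetic walk over the virtual backbone. This is a minimal sketch based on the slide's example (root = 16, Q = [22, 25]), not the authors' implementation; the function names `path_to` and `collect_nodes` are hypothetical.

```python
def path_to(target, root):
    """Yield the backbone nodes visited from root down to target (exclusive).

    The backbone is virtual: children of a node are node - step and
    node + step, with step halving on every level, so no I/O is needed.
    """
    if not 1 <= target <= 2 * root - 1:
        raise ValueError("target lies outside the data space of the backbone")
    node, step = root, root // 2
    while node != target:
        yield node
        node = node - step if target < node else node + step
        step //= 2


def collect_nodes(lower, upper, root):
    """Return (left_nodes, right_nodes) for the query interval [lower, upper]."""
    # Nodes left of the query may hold intervals reaching into it from the left.
    left_nodes = [n for n in path_to(lower, root) if n < lower]
    # Nodes right of the query may hold intervals reaching into it from the right.
    right_nodes = [n for n in path_to(upper, root) if n > upper]
    return left_nodes, right_nodes


left_nodes, right_nodes = collect_nodes(22, 25, root=16)
```

For the slide's query this gives the node lists 16, 20 and 28, 26. Nodes inside [22, 25] are not collected because the third branch of the SQL statement handles them with a single range scan.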
2. Declarative Step
    - posting one single SQL statement
    - number of I/O accesses: O(h · log_b n + r/b)

RI-Tree: Integration into an ORDBMS
Declarative Embedding: object-relational DML and DDL
Extensible Indexing Framework: object-relational interface for index maintenance and querying functions
    - User-defined Index Structure [VLDB 00] [SSTD 01]
    - Relational Implementation: mapping to built-in indexes (B+-trees); SQL-based query processing
Extensible Optimization Framework: object-relational interface for selectivity estimation and cost prediction functions
    - User-defined Cost Model [SSDBM 02]
    - Relational Implementation: mapping to built-in statistics facilities; SQL-based evaluation of cost model
Physical Implementation: Block-Manager, Caches, Locking, Logging, …

Outline of the Talk
1.) Introduction
2.) RI-Tree
3.) HRI-Approach
    - Grey Intervals
    - Storage of Grey Intervals in an ORDBMS
    - Intersection Queries based on Grey Intervals
4.) Experimental Evaluation
5.) Conclusions

Grey Intervals
A voxelized “real-world” object
Black object interval sequence (obtained from encoding voxels via a space filling curve)
[Figure: interval sequence over an axis from 0 to 48]
Grey object interval sequence (obtained from grouping black intervals together)
O_grey = (id, (2, 5), (7, 7), (12, 12), (15, 15), (18, 18), (22, 29), (36, 36), (39, 39), (42, 42), (46, 47))

Storage of a Grey Object Interval Sequence
table schema: GreyIntervals (id, intervalsequence)
    - aggregated information: H(I_grey), D(I_grey), G(I_grey)
    - BLOB: bit-oriented approach or offset-oriented approach
[Figure: decision between the two approaches, comparing L with 2 (n - 1) log2 L]
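The grouping of black intervals into grey intervals, together with the aggregated information kept per grey interval, might be sketched as below. Several points are assumptions rather than statements from the slides: that black intervals are merged whenever the gap between them is at most MAXGAP, that H, D and G stand for hull, density and maximum gap, and that the pairs listed on the Grey Intervals slide can serve as the black input sequence. The names `GreyInterval` and `group` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GreyInterval:
    black: list  # sorted, disjoint black intervals (lower, upper)

    @property
    def hull(self):
        # Assumed meaning of H: smallest interval covering all black intervals.
        return (self.black[0][0], self.black[-1][1])

    @property
    def density(self):
        # Assumed meaning of D: share of black cells within the hull.
        lower, upper = self.hull
        black_cells = sum(u - l + 1 for l, u in self.black)
        return black_cells / (upper - lower + 1)

    @property
    def max_gap(self):
        # Assumed meaning of G: longest run of white cells inside the hull.
        gaps = (b[0] - a[1] - 1 for a, b in zip(self.black, self.black[1:]))
        return max(gaps, default=0)


def group(black, maxgap):
    """Assumed rule: join neighbours separated by at most maxgap white cells."""
    groups = []
    for interval in sorted(black):
        if groups and interval[0] - groups[-1].black[-1][1] - 1 <= maxgap:
            groups[-1].black.append(interval)
        else:
            groups.append(GreyInterval([interval]))
    return groups


black = [(2, 5), (7, 7), (12, 12), (15, 15), (18, 18),
         (22, 29), (36, 36), (39, 39), (42, 42), (46, 47)]
grey = group(black, maxgap=3)
```

Under these assumptions a larger MAXGAP yields fewer and longer grey intervals, and with MAXGAP = 0 every black interval stays on its own.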
Storage cost of the two BLOB encodings:
    - bit-oriented approach (one bit per cell, e.g. ... 1 1 1 0 1 0 1 1 ...): O(L)
    - offset-oriented approach (offsets w1, w2, w3, w4, ...): O(n · log L)

Multi-Step Query Processing for Intersection Queries
[Figure: grey intervals A1 ... E1 of the database objects A ... E are tested against the grey intervals Q1, Q2 of the query Q]
1. filter step: Interval Index, producing candidate pairs of database and query intervals, e.g. (A1, Q1), (A3, Q2), (B3, Q1)
2. filter step: FAST GREY TEST
3. filter step: BLOB TEST
result set

2. filter step: FAST-GREY-TEST
intersection test based on the aggregated information of the grey intervals
intersection:
    - black interval + black interval
    - black interval + grey interval: the black interval covers the starting or end point of the grey interval
    - long black interval + grey interval: the maximum gap of the grey interval is smaller than the length of the black interval
    - grey interval + grey interval: both share the same starting or end point
    - grey interval + grey interval: the number of white cells is smaller than the intersecting area L
no intersection:
    - grey interval + other interval: the grey interval has only two black cells and the other interval is completely included in this interval
    - grey interval + grey interval: both grey intervals have only two black cells and the intervals have no common starting or end point

3. filter step: BLOB-TEST
intersection test based on the examination of the black interval sequence
[Figure: intervals I1 and I2 with an interlacing area of length L starting at point A; offset-oriented encoding with offsets w1 ... w8]
runtime analysis, bit-oriented approach
    - finding the starting point A of the interlacing area: O(1)
    - testing the L voxels: O(L)
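The BLOB-TEST for both encodings could be sketched roughly as follows. This is an illustration under assumptions, not the authors' code: the offset-oriented BLOB is modelled as a sorted list of black intervals, the bit-oriented BLOB as a Python integer used as a bit string, and all function names are hypothetical.

```python
def first_reaching(seq, point):
    """Binary search: index of the first interval whose upper bound >= point."""
    lo, hi = 0, len(seq)
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[mid][1] < point:
            lo = mid + 1
        else:
            hi = mid
    return lo


def blob_test_offsets(a, b):
    """Offset-oriented test on two sorted lists of black intervals."""
    start = max(a[0][0], b[0][0])   # interlacing area of the two hulls
    end = min(a[-1][1], b[-1][1])
    if start > end:
        return False
    i, j = first_reaching(a, start), first_reaching(b, start)
    # Merge-style scan over the intervals inside the interlacing area only.
    while i < len(a) and j < len(b) and a[i][0] <= end and b[j][0] <= end:
        if a[i][0] <= b[j][1] and b[j][0] <= a[i][1]:
            return True
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return False


def to_bits(black):
    """Bit-oriented encoding: (hull start, integer with one bit per cell)."""
    start, bits = black[0][0], 0
    for lower, upper in black:
        for cell in range(lower, upper + 1):
            bits |= 1 << (cell - start)
    return start, bits


def blob_test_bits(a, b):
    """Bit-oriented test: align both bit strings, then AND them."""
    (start_a, bits_a), (start_b, bits_b) = a, b
    base = min(start_a, start_b)
    return ((bits_a << (start_a - base)) & (bits_b << (start_b - base))) != 0


i1 = [(2, 5), (7, 7)]
i2 = [(6, 6), (8, 9)]   # hulls overlap, but no black cell is shared
i3 = [(5, 5), (9, 9)]   # shares cell 5 with i1
```

The pair (i1, i2) is a case where only the BLOB-TEST can decide: the hulls overlap, but no black cell is shared.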
[Figure: black interval sequences of I1 and I2 in the interlacing area; n1 = 8, nL1 = 1, nL2 = 3]
runtime analysis
    bit-oriented approach
        - finding the starting point A of the interlacing area: O(1)
        - testing the L voxels: O(L)
    offset-oriented approach
        - finding the starting point of the interlacing area: O(log n1)
        - testing the nL1 and nL2 intervals, respectively: O(nL1 + nL2)

SQL-Statement

    SELECT candidates.id
    FROM (
        SELECT db.id AS id,
               table (AggInfos(db.intervalsequence, q.intervalsequence)) AS ctable
        FROM GreyIntervals db, :GreyQueryIntervals q
        WHERE intersects (hull(db.intervalsequence), hull(q.intervalsequence))
        GROUP BY db.id
    ) candidates
    WHERE EXISTS (
        SELECT 1
        FROM GreyIntervals db, :GreyQueryIntervals q, candidates.ctable ctable
        WHERE db.rowid = ctable.dbrowid
          AND q.rowid = ctable.qrowid
          AND blobintersection (db.intervalsequence, q.intervalsequence)
    )

Experimental Evaluation
CAR: approx. 200 parts, approx. 7 million intervals, resolution: 33 bit (0 .. 8,589,934,591)
PLANE: approx. 10,000 parts, approx. 9 million intervals, resolution: 42 bit (0 .. 4,398,046,511,103)
    - Examination of the HRI approach based on different MAXGAP parameters: 10, 100, 1,000, 10,000, 100,000, 1,000,000
    - Comparison between the HRI approach and the spatial variant of the RI-tree [SSTD 01]

Experiments - Interval Distribution
[Figure: CAR, number of intervals (logarithmic scale, 1 to 10,000,000) over interval length, one curve per MAXGAP value M = 0 (RI-Tree), 10^1, 10^2, 10^3, 10^4, 10^5, 10^6]
    - the number of intervals decreases with increasing MAXGAP parameter
    - the average interval length increases with increasing MAXGAP parameter

Experiments - Secondary Storage
[Figure: PLANE, number of blocks [x1000] (0 to 160) for indexes and table at MAXGAP = 0 (RI-Tree), 1,000 and 1,000,000]
With the HRI method we can reduce the storage requirement by an order of magnitude.

Experiments - Runtime for collision queries
[Figure: time [sec] over MAXGAP (10 to 1,000,000), split into 1., 2. and 3. filter step, for CAR (0 to 50 sec) and PLANE (0.0 to 0.8 sec), with the RI-tree as comparison]
huge part (PLANE): RI-tree: 316.5 s, HRI: 2.2 s (MAXGAP = 10,000)

Conclusions
What is the HRI approach?
    - The HRI approach is a multi-step index structure suitable for spatial query processing for high resolutions.
What are the advantages of the HRI approach?
    - good secondary storage utilization
    - small main memory footprint
    - improved query response time behaviour

Any questions?