Spatial and Spatio-Temporal Data Models for GIS
Bart Kuijpers
Limburgs Universitair Centrum
http://alpha.luc.ac.be/~lucp1265/
[email protected]

Overview
• Spatial data models in GIS
  – layers
  – raster model, vector model
  – specific models
• Spatial database systems
• Spatio-temporal data models
  – challenges and problems

What is the S in GIS?
• 1980s: Geographic Information Systems
  – technology for the acquisition and management of spatial information
  – software for professional users, e.g. cartographers
  – Example: ESRI Arc/View software
• 1990s: Geographic Information Science
  – comprehending the underlying conceptual issues of representing data and processes in space-time
  – the science (or theory and concepts) behind the technology
  – Example: design spatial data types and operations for querying
• 1990s: Geographic Information Studies
  – understanding the social, legal and ethical issues associated with the application of GISy and GISc
• 2000s: Geographic Information Services
  – Web sites and service centers for casual users, e.g. travelers
  – Service (e.g., GPS, MapQuest) for route planning

GIS: What is it?
No easy answer anymore!
• Geographic Information
  – information about places on the earth’s surface
  – knowledge about “what is where when” (Don’t forget time!)
• Geographic Information Technologies
  – technologies for dealing with this information
    • Global Positioning Systems (GPS)
    • Remote Sensing (RS)
    • Geographic Information Systems (GIS)

Examples of GIS data
• Urban Planning, Management
  – Land acquisition
  – Economic development
  – Housing renovation programs
  – Emergency response
  – Crime analysis
• Environmental Sciences
  – Monitoring environmental risk
  – Modeling stormwater runoff
  – Management of watersheds, floodplains, wetlands, forests
  – Environmental Impact Analysis
  – Hazardous or toxic facility siting
• Political Science
  – Analysis of election results
  – Predictive modeling
• Civil Engineering/Utility
  – Locating underground facilities
  – Coordination of infrastructure maintenance
• Business
  – Demographic Analysis
  – Market Penetration/Share Analysis
• Education Administration
  – Enrollment Projections
  – School Bus Routing
• Real Estate
  – Neighborhood land prices
  – Traffic Impact Analysis
• Health Care
  – Epidemiology
  – Service Inventory

Older definitions of GISy
• The common ground between information processing and the many fields using spatial analysis techniques. (Tomlinson, 1972)
• A powerful set of tools for collecting, storing, retrieving, transforming, and displaying spatial data from the real world. (Burrough, 1986)
• A computerised database management system for the capture, storage, retrieval, analysis and display of spatial (locationally defined) data. (NCGIA, 1987)
• A decision support system involving the integration of spatially referenced data in a problem solving environment. (Cowen, 1988)

Definitions of GIS
“A system of hardware, software, and procedures designed to support the
• capture,
• management,
• manipulation,
• analysis,
• modeling and
• display
of spatially-referenced data (located on the earth’s surface) for solving complex planning and management problems.”

GIS Data Models
Intuitive: a map with a database behind it.
Purpose: geographic features at real-world locations are digitally represented and stored in a database, so that they can be presented abstractly in map form and also manipulated to address some problem.

GIS Data Model based on data layers or themes
• Data is organized by layers, coverages or themes, with each theme representing a common feature.
• Layers are integrated using explicit location on the earth’s surface; geographic location is thus the organizing principle.
• Examples of layers or themes: Digital Elevation Models, Watersheds, Streams, Waterbodies

Example of layers or themes
Here we have three layers or themes:
- roads,
- hydrology (water),
- topography (land elevation)
They can be related because precise geographic coordinates are recorded for each theme.

How are layers described?
• Layers are composed of two data types:
  – spatial data, which describes location (where); stored in a shapefile in ArcView
  – attribute data, specifying what, how much, when; stored in a database table
• GIS systems traditionally maintain spatial and attribute data separately, then “join” them for display or analysis (for example, in ArcView)

How are layers described?
The spatial component of a layer may be represented in two ways:
• in raster (image) format as pixels
• in vector format as points, lines and areas (PLA-model)

1. Raster Model
• area is covered by a grid with (usually) equal-sized cells
• cells are often called pixels (picture elements); raster data is often called image data
• attributes are recorded by assigning each cell a single value based on the majority feature (attribute) in the cell, such as land use type

Figure: land-use map (labels: wheat, fruit, oats, clover, corn) and the corresponding 10 × 10 grid of cell values:
1 1 1 1 1 4 4 5 5 5
1 1 1 1 1 4 4 5 5 5
1 1 1 1 1 4 4 5 5 5
1 1 1 1 1 4 4 5 5 5
1 1 1 1 1 4 4 5 5 5
2 2 2 2 2 2 2 3 3 3
2 2 2 2 2 2 2 3 3 3
2 2 2 2 2 2 2 3 3 3
2 2 4 4 2 2 2 3 3 3
2 2 4 4 2 2 2 3 3 3

Raster Model: data structures
Run-length compression (for single layer). Each pair in the right-hand column is (value, last column of the run).

Full matrix (162 bytes)    Run length by row (44 bytes)
111111122222222223         1,7,2,17,3,18
111111122222222233         1,7,2,16,3,18
111111122222222333         1,7,2,15,3,18
111111222222223333         1,6,2,14,3,18
111113333333333333         1,5,3,18
111113333333333333         1,5,3,18
111113333333333333         1,5,3,18
111333333333333333         1,3,3,18
111333333333333333         1,3,3,18

Raster Model: data structures
Quad-Tree Representation (for single layer)

Quadrant numbering at the first subdivision:
1 3
0 2

Cell numbering at the second subdivision:
5  7  13 15
4  6  12 14
1  3  9  11
0  2  8  10

Leaves of the example quad-tree, written [first cell, block side length]:
[0,2]   white
[12,2]  blue
[4,1]   blue
[5,1]   white
[6,1]   green
[7,1]   red
[8,1]   red
[9,1]   white
[10,1]  white
[11,1]  green
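The byte counts on the run-length slide can be reproduced with a short sketch. It assumes, as the numbers on the slide suggest, that each row is stored as (value, last column of the run) pairs and that every stored number costs one byte; the function names are illustrative and do not come from the slides.

```python
from itertools import groupby

# The 9 x 18 single-layer raster from the slide, one string per row.
FULL_MATRIX = [
    "111111122222222223",
    "111111122222222233",
    "111111122222222333",
    "111111222222223333",
    "111113333333333333",
    "111113333333333333",
    "111113333333333333",
    "111333333333333333",
    "111333333333333333",
]

def encode_row(row):
    """Encode one row as (value, last column of the run) pairs, columns counted from 1."""
    pairs = []
    column = 0
    for value, run in groupby(row):
        column += len(list(run))
        pairs.append((int(value), column))
    return pairs

def decode_row(pairs):
    """Rebuild the row string from (value, last column) pairs."""
    cells = []
    previous_end = 0
    for value, end in pairs:
        cells.append(str(value) * (end - previous_end))
        previous_end = end
    return "".join(cells)

encoded = [encode_row(row) for row in FULL_MATRIX]
full_size = sum(len(row) for row in FULL_MATRIX)      # one byte per cell
rle_size = sum(2 * len(pairs) for pairs in encoded)   # two numbers per run

print(full_size, rle_size)  # prints: 162 44
```

The saving depends entirely on how long the runs are: a raster whose value changes at every cell would need two numbers per cell and become larger, not smaller.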
Raster Model
Raster data are good at representing continuous phenomena, e.g.,
• Wind speed
• Elevation, slope, aspect
• Chemical concentration
• Likelihood of existence of a certain species
• Electromagnetic reflectance (photographic or satellite imagery)

Raster Model
• much data comes in this form
  – images from remote sensing (LANDSAT, SPOT)
  – scanned maps
  – digital orthophoto
• best for continuous features:
  – elevation
  – temperature
  – soil type
  – land use
• digital elevation model (DEM)

Raster Model: Pros and Cons
[+] Continuous (surface) data represented easily
[+] Simple data structure, fast indexing
[–] Shape of discrete polygonal features generalized by cells
[–] Intersection of two lines

Raster Model: tessellations
• Square grid: equal length sides
  – 4-connected neighborhood (rook’s case): all neighboring cells are equidistant
  – 8-connected neighborhood (queen’s case): all neighboring cells not equidistant
• rectangular
• triangular (3-sided) and hexagonal (6-sided)
  – all adjacent cells and points are equidistant
• triangulated irregular network (TIN):
  – vector model used to represent continuous surfaces (elevation)
  – more later under vector

2. Vector Model
The fundamental concept of vector GIS is that all geographic features in the real world can be represented either as:
• points or dots (nodes): trees, poles, fire plugs, airports, cities
• lines (arcs): streams, streets, sewers
• areas (polygons): land parcels, cities, counties, forest, rock type
Example: because representation depends on shape, ArcView refers to files containing vector data as shapefiles.

Vector Model
Points: represent discrete point features
• each point location has a record in the table
• airports are point features
• each point is stored as a coordinate pair

Vector Model
Lines: represent linear features
• each road segment has a record in the table
• roads are linear features

Vector Model
Lines: fundamental spatial data model
• Lines start and end at nodes
• line #1 goes from node #2 to node #1
• Vertices determine shape of line
• Nodes and vertices are stored as coordinate pairs

Vector Model
Polygons: represent bounded areas
• each bounded polygon has a record in the table
• landforms and water are polygonal features

Vector Model
Polygons: fundamental spatial data model
• Polygon #2 is bounded by lines 1 & 2
• Line 2 has polygon 1 on left and polygon 2 on right
• complex data model, especially for larger data sets
• “arc-node topology,” only used for ArcInfo data sets

Vector Model
Shapefile polygon spatial data model
• less complex data model
• polygons do not share bounding lines

Vector Model
ArcInfo coverage spatial data model
• Commonly found format (due to ArcInfo market dominance)
• Coordinate data not editable in ArcView (but tabular data are editable in ArcView)
• polygons share bounding lines

Vector Model: illustration of polygon
Figure: polygons A, B, C, D and E drawn on a grid with x and y from 0 to 5.

Data File
Polygon   Vertices (x,y)
A         (3,4) (4,4) (4,2) (3,2) (3,4)
B         (4,4) (5,4) (5,2) (4,2) (4,4)
C         (3,2) (4,2) (4,0) (3,0) (3,2)
D         (4,2) (5,2) (5,0) (4,0) (4,2)
E         (1,5) (5,5) (5,4) (3,4) (3,0) (1,0) (1,5)

Vector Model: illustration of point & polygon
Figure: the same polygons A to E, with their corner points numbered 1 to 12.

Points File
Point   x   y
1       3   4
2       4   4
3       4   2
4       3   2
5       5   4
6       5   2
7       5   0
8       4   0
9       3   0
10      1   0
11      1   5
12      5   5

Polygons File
Polygon   Points
A         1, 2, 3, 4, 1
B         2, 5, 6, 3, 2
C         4, 3, 8, 9, 4
D         3, 6, 7, 8, 3
E         11, 12, 7, 10, 11

Node/Arc/Polygon and Attribute Data
Relational Representation: DBMS required!
Figure: parcel A34 (Smith Estate) bounded by arcs I to IV between nodes 1 to 4, with neighboring parcel A35 and the streets Birch and Cherry.

Spatial Data

Node Table
Node ID   Easting   Northing
1         126.5     578.1
2         218.6     581.9
3         224.2     470.4
4         129.1     471.9

Arc Table
Arc ID   From N   To N   L Poly   R Poly
I        4        1               A34
II       1        2               A34
III      2        3      A35      A34
IV       3        4               A34

Polygon Table
Polygon ID   Arc List
A34          I, II, III, IV
A35          III, VI, VII, XI

Attribute Data

Node Feature Attribute Table
Node ID   Control   Crosswalk   ADA?
1         light     yes         yes
2         stop      no          no
3         yield     no          no
4         none      yes         no

Arc Feature Attribute Table
Arc ID   Length   Condition   Lanes   Name
I        106      good        4
II       92       poor        4       Birch
III      111      fair        2
IV       95       fair        2       Cherry

Polygon Feature Attribute Table
Polygon ID   Owner      Address
A34          J. Smith   500 Birch
A35          R. White   200 Main

Raster versus Vector Model
“raster is faster but vector is corrector” (Joseph Berry)
From: Burrough, Peter A. and Rachael A. McDonnell. (1998). Principles of Geographic Information Systems. p. 27.
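The node, arc and polygon tables above only become a usable geometry once they are joined, which is why the slide says a DBMS is required. As a minimal sketch of that join, the Python below copies the three spatial tables into dictionaries and derives the boundary ring, area and neighbors of parcel A34. The function names are illustrative, and `None` stands for a side where the arc table lists no polygon; neither comes from the slides.

```python
from math import hypot

# Node table: node id -> (easting, northing), copied from the slide.
NODES = {1: (126.5, 578.1), 2: (218.6, 581.9), 3: (224.2, 470.4), 4: (129.1, 471.9)}

# Arc table: arc id -> (from node, to node, left polygon, right polygon).
ARCS = {
    "I": (4, 1, None, "A34"),
    "II": (1, 2, None, "A34"),
    "III": (2, 3, "A35", "A34"),
    "IV": (3, 4, None, "A34"),
}

# Polygon table: only A34 is used, since its four arcs are all listed above.
POLYGONS = {"A34": ["I", "II", "III", "IV"]}

def boundary_nodes(polygon_id):
    """Chain the arcs of a polygon into a closed ring of node ids."""
    ring = [ARCS[POLYGONS[polygon_id][0]][0]]
    for arc_id in POLYGONS[polygon_id]:
        start, end, _left, _right = ARCS[arc_id]
        if ring[-1] != start:
            raise ValueError(f"arc {arc_id} does not continue the ring")
        ring.append(end)
    if ring[0] != ring[-1]:
        raise ValueError("arc list does not close")
    return ring

def area(polygon_id):
    """Polygon area by the shoelace formula over the ring coordinates."""
    points = [NODES[n] for n in boundary_nodes(polygon_id)]
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(points, points[1:]))
    return abs(twice) / 2

def neighbours(polygon_id):
    """Polygons on the other side of each bounding arc (adjacency from topology alone)."""
    found = set()
    for arc_id in POLYGONS[polygon_id]:
        _start, _end, left, right = ARCS[arc_id]
        other = left if right == polygon_id else right
        if other is not None:
            found.add(other)
    return found

def arc_length(arc_id):
    """Straight-line length of an arc between its two nodes."""
    start, end, _left, _right = ARCS[arc_id]
    (x1, y1), (x2, y2) = NODES[start], NODES[end]
    return hypot(x2 - x1, y2 - y1)

print(boundary_nodes("A34"), neighbours("A34"))
```

Note that adjacency (A34 borders A35) is answered from the arc table without touching any coordinate, which is the point of storing left and right polygons. The computed arc lengths also come out close to the Length column of the arc attribute table.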
Raster versus Vector Model
• Raster data model
  – location is referenced by a grid cell in a rectangular array (matrix)
  – attribute is represented as a single value for that cell
  – much data comes in this form
    • images from remote sensing (LANDSAT, SPOT)
    • scanned maps
    • elevation data from USGS
  – best for continuous features:
    • elevation
    • temperature
    • soil type
    • land use
• Vector data model
  – location referenced by x,y coordinates, which can be linked to form lines and polygons
  – attributes referenced through unique ID number to tables
  – much data comes in this form
    • DIME and TIGER files from US Census
    • DLG from USGS for streams, roads, etc.
    • census data (tabular)
  – best for features with discrete boundaries
    • property lines
    • political boundaries
    • transportation

Variety of Vector Models
• Spaghetti model
• Topological model (most common)
• Triangulated irregular network (TIN)
• DIME files and TIGER files
• Network model
• Digital Line Graph (DLG)
• Shapefile (ArcView/ArcGIS; ESRI)
• Others: HPGL, PostScript/ASCII, CAD/.dxf

Vector Model: Spaghetti
• information in n-dim. by m-dim. hyperspaces (m<n)
• POINT; P_LINE; POLYGON; COMPLEX_OBJECT
• Very efficient algorithms to detect properties
• Recursive holes
• Numerous query languages
Source: Lakhan, V. Chris. (1996). Introductory Geographical Information Systems. p. 54.

Vector Model: Topological
Source: Bernhardsen, Tor. (1999). 2nd Ed. Geographic Information Systems: An Introduction. p. 62. fig. 4.12.
Topological Data Model
The topological data model is used, e.g., by the Census Bureau of the US. It has four relations:
R1: every line has two endpoints
R2: every line has two areas
R3: every area is surrounded by lines
R4: every point is surrounded by areas and lines
These four relations alone are not lossless.
Figure: example map with points p, q, r, s, t, v and lines A to J.

Vector Model: Topological
• connections & relationships between objects are independent of their coordinates
• overcomes major weakness of spaghetti model, allowing for GIS analysis (Overlaying, Network, Contiguity, Connectivity)
• topological invariants

Topological Data Model
A database in the topological data model consists of a finite number of
• labeled points
• labeled lines
• labeled areas
in R2.

Topological Data Model
Give for each labeled point the circular list of all lines and areas that appear clockwise around it:
List(p)=( H D I)
List(q)=( J F H)
…
List(s)=( C B E)
This representation is lossless. [Area labels in the lists and the symbol for the unbounded area are not recoverable from the extraction.]

Vector Model: Dime files and TIGER files
Image Source: Demers, Michael. N. (2000). 2nd Ed. Fundamentals of Geographic Information Systems. p. 113. fig 4.16.
Image Source: Clarke, Keith C. (2001). 3rd Ed. Getting Started with Geographic Information Systems. p 92.

Vector Model: DLGs
Digital Line Graphs
Image Source: Clarke, Keith C. (2001). 3rd Ed. Getting Started with Geographic Information Systems. p. 90

Vector Model: Network
Source: Heywood, Ian and Sarah Cornelius and Steve Carver. An Introduction to Geographical Information Systems. p. 60.

Vector Model: Topological data, adding semantics
What kind of binary topological relations exist between spatial objects?
• Based on a region’s interior and boundary (and exterior)
• Determines whether interior and boundary have (non-)empty intersection
• Egenhofer and Randell Cohn C…
Figure: intersections between the interiors and boundaries of two regions A and B.

Vector Model: Topological
The eight relations: disjoint, meet, contains, inside, equal, covers, coveredBy, overlap

Topological relationships
Relationship      Applies to
Disjoint          Point/Point, Line/Line, Polygon/Polygon
Touches           Point/Line, Line/Polygon, Point/Polygon, Polygon/Polygon, Line/Line
Crosses           Point/Line, Point/Polygon, Line/Line, Line/Polygon
Overlap           Point/Point, Line/Line, Polygon/Polygon
Within/contains   Point/Point, Line/Line, Point/Line, Line/Polygon, Point/Polygon, Polygon/Polygon
Equals            Point/Point, Line/Line, Polygon/Polygon

Topological relationships
• line-line and area-area relationships: adjacency, island, touch, branching off, cross, intersect
• area-line relationships: line in an area, line ends at an area, line ends in an area, line is border of an area, line intersects area, line touches area
• point-line relationships: point on line, point beside a line
• point-area relationships: point in area, point on border of an area

TIN: Triangulated Irregular Network Surface

Points
Node #   X     Y      Z
1        0     999    1456
2        525   1437   1437
3        631   886    1423
etc.

Polygons
Polygon   Node #s   Topology
A         1,2,4     B,D
B         2,3,4     A,E,C
C         3,4,5     B,F,G
D         1,4,6     A,H
etc.

Elevation points (nodes) chosen based on relief complexity, and then their 3-D location (x,y,z) determined.
Elevation points are connected to form a set of triangular polygons; these are then represented in a vector structure.
Figure: triangles A to H over nodes 1 to 6.

Attribute Info. Database
Polygon   Var 1   Var 2
A         1473    15
B         1490    100
C         1533    150
D         1486    270
etc.
Attribute data associated via relational DBMS (e.g. slope, aspect, soils, etc.)

Vector Model: Shapefile (ArcGIS; ESRI)
Figure: table with examples of the shape types of geographic features in a data set for a shapefile.
Source: Demers, Michael. N. (2000). 2nd Ed. Fundamentals of Geographic Information Systems. p. 114. fig 4.17.

Vector Model: Others …
HPGL, CAD/.dxf, PostScript/ASCII
Source: Clarke, Keith C. (2001). 3rd Ed. Getting Started with Geographic Information Systems. p. 89. fig. 3.12.

Difference between GIS and Spatial Databases
• GIS is software to visualize and analyze spatial data using spatial analysis functions such as
  – Search: thematic search, search by region, classification
  – Location analysis: buffer, corridor, overlay
  – Terrain analysis: slope/aspect, drainage network
  – Flow analysis: connectivity, shortest path
  – Distribution: change detection, proximity, nearest neighbor
  – Spatial analysis/Statistics: pattern, centrality, autocorrelation, indices of similarity, topology: hole description
  – Measurements: distance, perimeter, shape, adjacency, direction
• GIS uses SDBMS
  – to store, search, query, share large spatial data sets

Difference between GIS and Spatial Databases
• SDBMS focuses on
  – Efficient storage, querying, sharing of large spatial datasets
  – Provides simpler set based query operations
  – Example operations: search by region, overlay, nearest neighbor, distance, adjacency, perimeter etc.
  – Uses spatial indices and query optimization to speed up queries over large spatial datasets.
• SDBMS may be used by applications other than GIS
  – Astronomy, Genomics, Multimedia information systems, ...
• Will one use a GIS or an SDBMS to answer the following:
  – How many neighboring countries does the USA have?
  – Which country has the highest number of neighbors?

What is a Spatial DBMS?
• An SDBMS is a software module that
  – can work with an underlying DBMS
  – supports spatial data models, spatial abstract data types (ADTs) and a query language from which these ADTs are callable
  – supports spatial indexing, efficient algorithms for processing spatial operations, and domain specific rules for query optimization
• Example: Oracle Spatial data cartridge
  – can work with Oracle 8i DBMS
  – Has spatial data types (e.g. polygon), operations (e.g. overlap) callable from SQL3 query language
  – Has spatial indices, e.g. R-trees

Spatial database systems for GIS

Limitations of Relational Data Model
• Values are atomic, complex objects need to be unnested (first normal form)
• Only atomic types, no subtyping/inheritance, no encapsulation of operations with data, no OID
• No support for unstructured/heterogeneous data
• No support for infinite relations (spatial and spatio-temporal data)

SSNr    Name    Salary
12345   Bart    50K
12346   Sofie   35K
12347   Bill    500M

Spatial Data in the Relational Model
• Boundary representation
• No notion of spatial object/type
• Mismatch with query language, e.g.
SQL (no arithmetic, …)

Figure: triangle with corners (0,1), (0,0), (1,0), stored as:
Name        x   y
triangle1   0   1
triangle1   0   0
triangle1   1   0

Extension of the Relational Model: ADT-approach
• Relational model is augmented with ad hoc spatial data types: point, polyline, polygon
• SQL is also extended
• Example: ADT Polygon:
  – Constructors
  – Methods: containment, overlap, …
  – Subtyping/Inheritance: Rectangle isa Polygon
• Query language: SQL
• [Güting et al., 1990s]

Beyond Relational: Constraint Databases
[Kanellakis, Kuper and Revesz, PODS 1990]
• Constraint tuple: finite combination of atomic constraints
• Constraint relation: finite set of generalized tuples
• Semantics: infinite point sets

Id   Color    geometry
T1   green    $0 \le x \land 0 \le y \land 1 \ge x+y$
T2   red      $x \le 0 \land 0 \le y \land 1 \ge y-x$
C1   yellow   $(x-1)^2 + (y-1)^2 = 1/9$
…

Queries in constraint databases
• In practice only linear constraints [DEDALE, DISCO, COSMOS]
• Natural query languages: FO+Poly, FO+While, FO+TC, Datalog with polynomial constraints
• Example query, which selects the points of S that have points outside S arbitrarily close by:
\[ Q(S) = \{(x,y) \in \mathbb{R}^2 \mid S(x,y) \land (\forall \varepsilon > 0)(\exists x')(\exists y')\,((x-x')^2 + (y-y')^2 < \varepsilon^2 \land \lnot S(x',y'))\} \]
• Q maps the input $0 \le x \land 0 \le y \land 1 \ge x+y$ to the output $0 \le x \le 1 \land 0 \le y \le 1 \land (x = 0 \lor y = 0 \lor x+y = 1)$

Query evaluation in constraint databases
• Apply Q(S) as defined above on A given by $0 \le x \land 0 \le y \land 1 \ge x+y$
• Plug in:
\[ 0 \le x \land 0 \le y \land 1 \ge x+y \land (\forall \varepsilon > 0)(\exists x')(\exists y')\,((x-x')^2 + (y-y')^2 < \varepsilon^2 \land \lnot(0 \le x' \land 0 \le y' \land 1 \ge x'+y')) \]
• Eliminating the quantifiers gives $0 \le x \le 1 \land 0 \le y \le 1 \land (x = 0 \lor y = 0 \lor x+y = 1)$
• Complexity is huge
  – Tarski (1930)
  – Collins’ CAD (1975): doubly exponential in #quantifiers
  – 1990s: single exponential in #quantifier-alternations

From Spatial Data to Spatio-Temporal Data
• Temporal databases
  – 1980s, one of the first special purpose database applications
• Spatial databases
  – well-studied area in GIS,
  – in database theory during 1990s
• …
• Spatio-temporal data:
  – studied in databases since mid 1990s,
  – SSD’99 changes into SSTD’01
  – has its specific problems

Spatio-Temporal Data Examples
• Transportation: truck or ship movement, airplane flights
• Meteorological: isobaric curves, temperature
• Climate: season, vegetation changes
• Natural disasters: forest fires, oil spills
• Ecology: species migration, vegetation changes, habitat and land cover changes
• Society and economy: urban growth, land use changes, epidemics
• Ownership or administrative changes

Spatio-Temporal Data
What is changing where and how?
What?
• 0D points
• 1D lines
• 2D regions
• 3D volumes
Where?
• on 1D line
• in 2D plane
• in 3D space
How?
• continuous evolution
• continuous movement
• discrete evolution
• birth, death, split, merge
In combination with classical alpha-numerical data.

Abstract Spatio-Temporal Objects
• Time is isomorphic to the reals $\mathbb{R}$
• n-dimensional abstract spatio-temporal object $O \subseteq \mathbb{R}^n \times \mathbb{R}$
• It should satisfy
  – slice regularity, i.e., $O_{t_0} = \{(x_1,\dots,x_n) \mid (x_1,\dots,x_n,t_0) \in O\}$ should be a familiar object (point, square, triangle, polygon)
  – $f(t) = O_t$ should be piece-wise continuous
  – closure properties

Closure Properties
• A class of spatio-temporal objects $C$ is closed under an operation op (of arity k) if for all $O_1,\dots,O_k \in C$, also $op(O_1,\dots,O_k) \in C$
• Relevant operations:
  – intersection, union, difference
  – temporal/spatial selection
  – temporal/spatial projection
• Closure properties are easy to obtain for spatial objects, more difficult for natural classes of spatio-temporal objects

The ADT-approach [Güting et al., TODS 2000]
• Spatial objects: points, polylines, polygons
• Operations:
  – Set theoretic: intersection, union, difference
  – Aggregation: min, max, avg, center, area, volume
  – Metrical: distance
  – Topological: containment, adjacency, …

The ADT-approach: Temporal Lifting
• Temporal lifting of a spatial type $\alpha$ with domain $D_\alpha$: the lifted type $\tau(\alpha)$ consists of partial functions $\mathbb{R} \to D_\alpha$
  E.g., $\tau(\text{points})$, $\tau(\text{polylines})$, $\tau(\text{polygons})$
• Temporal lifting of a spatial operation $op: \alpha_1 \times \dots \times \alpha_k \to \alpha_0$ gives
  $op: \tau(\alpha_1) \times \dots \times \tau(\alpha_k) \to \tau(\alpha_0)$
• E.g. $\cap: \text{polygons} \times \text{polygons} \to \text{polygons}$ is lifted to
  $\cap: \tau(\text{polygons}) \times \tau(\text{polygons}) \to \tau(\text{polygons})$, $(p_1, p_2) \mapsto \{(t, p_1(t) \cap p_2(t)) \mid t \in \mathbb{R}\}$

Example of SQL3 Query
• “Find all pairs of airplanes that came closer than 500 meters during their flight”
    SELECT A.id, B.id
    FROM Flights A, Flights B
    WHERE A.id <> B.id
      AND minvalue(distance(A.route, B.route)) < 500
• Example of spatio-temporal join

Concrete Representation [Forlizzi et al., SIGMOD 2000]
• Spatial objects: region (= finite set of polygons with holes)
• Spatio-temporal (evolving) objects:
  – finitely many spatial slices (= time intervals)
  – coordinates are linear functions of time on each slice
  – segments may degenerate but cannot rotate within a slice
  – spatio-temporal object is a polyhedron.

The ADT-approach: Conclusion
• Change: both continuous and discrete
• Closure:
  – for union: trivial,
  – for intersection, difference, selection, projection: depends on properties of polyhedra.
• Object-orientation:
  – object types, operation signatures
  – object easy to implement in object-relational DB
  – no natural restriction on movement guarantees polyhedral structure
• Queries:
  – good integration with SQL3, e.g., aggregation
  – expressiveness: difficult to establish
  – notoriously hard to optimize

Spatio-Temporal Data in the Constraint Model
One extra time variable: t
Name   Color   geometry
MT1    white   $0 \le x-t \land 0 \le y-t \land t+1 \ge x+y \land 0 \le t \le 4$
Figure (animation frames): snapshots of MT1 at t=0, t=1, t=2, t=3 and t=4.

Spatio-Temporal Data in the Constraint Model
• Closure guaranteed
• Natural query languages: FO+Poly, FO+While, Datalog with Constraints
• Theoretically appealing and suited to study e.g., expressiveness of query languages
• but few implementations (mostly with linear constraints [DEDALE, DISCO, COSMOS])
• Representation problem: a plane circles at constant speed
• More restricted models:
  – parametric spatio-temporal objects
  – implementations (MLPQ, PReSTO [Revesz, SIGMOD 2000])

Spatio-Temporal Data: Representation problem
An airplane circles at constant speed:
$z = \text{constant}$, $x = x_0 + r\cos t$, $y = y_0 + r\sin t$

Spatio-Temporal Query Languages
• First-order logic over the reals: relational calculus extended with polynomial (in)equalities.
  “Do plane1 and plane2 collide at 5 o’clock above position (0,0)?”
  $(\exists x)(\exists y)(\exists z)(\exists t)(Plane1(x,y,z,t) \land Plane2(x,y,z,t) \land x = 0 \land y = 0 \land t = 5)$
• Extension of FO with a while loop: finite sequence of statements & while loops.
  – Statement: $R := \{(x,y,t) \mid \varphi(x,y,t)\}$;
  – While-loop: while $\varphi$ do P;

Queries in the Constraint Model
• Closure: naturally available in the model
• problems with aggregation, distance, …
• no object-oriented features
• Query languages:
  – FO + extensions,
  – simple semantics,
  – expressiveness well-studied,
  – implementation requires constraint engine

Parametric Spatio-Temporal Objects
• An atomic geometric object consists of:
  – spatial reference object S (in $\mathbb{R}^2$)
  – reference time I (interval in $\mathbb{R}$)
  – transformation function parameterized by time, i.e., $f: \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^2: (x,y;t) \mapsto (x',y') = f(x,y;t)$
• A molecular geometric object consists of a finite number of atomic objects
Example: S=[0,1]x[0,1], I=[1,5], f(x,y;t)=(tx,ty)
Figure: snapshots labeled t=0 and t=5.

Parametric Spatio-Temporal Objects
Example database:
       “classical”                         spatio-temporal
ID   from   to   dept   arr     reference object   route function
1    A      D    2h00   2h30    (picture)          f1(x,y,t)
2    B      C    1h30   2h30    (picture)          f2(x,y,t)
The columns dept and arr give the time interval.

Parametric Spatio-Temporal Objects
• Semantics of an atomic geometric object (S,I,f) is
  $\{(x,y;t) \mid t \in I \land (\exists x')(\exists y')((x',y') \in S \land (x,y) = f(x',y';t))\}$
• Semantics of a molecular geometric object is the union of the semantics of its atomic objects
• The 3D figure (axes x, y, t) is always a semi-algebraic set

Reference Objects
Poly, Tr, TraX, Rect (arbitrary, or with sides parallel to the coordinate axes)

Transformation Functions
• Affinities: $(x,y;t) \mapsto \begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
  observing a passing object under an angle, circling airplane
• Scalings: $(x,y;t) \mapsto \begin{pmatrix} a(t) & 0 \\ 0 & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
  observing an object that you are walking away from or towards, oil spill, forest fire
• Translations: $(x,y;t) \mapsto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
  wind, transport
• Identity: spatial data
with a(t), b(t), c(t), d(t), e(t), f(t) linear, polynomial or rational

Closure Properties under Boolean Operations
Object $O_1$: (the immediate proximity of) a ship
Object $O_2$: another ship (or a natural hazard)
Object $O_1 \cap O_2$: danger zone
Query: “Is there any danger of collision?”
Figure (animation frames): the two objects of the example database moving along their routes between A, B, C and D.

Closure Properties under Boolean Operations
Which classes are closed under $\cap$, $\cup$, $\setminus$?
A class (S,F) is closed under $\cap$ iff the intersection of (the semantics of) any two molecular geometric objects from the class (S,F) can be described as a geometric object from (S,F).
Property: A class (S,F) is closed under $\cap$ iff it is closed under $\cap$ for atomic objects.

Closure Properties under Boolean Operations
Which classes are closed under $\cup$?
Every combination is closed (+):

         Affinity          Scaling           Translation       Id
         Rat  Pol  Lin     Rat  Pol  Lin     Rat  Pol  Lin
Poly     +    +    +       +    +    +       +    +    +       +
Tr       +    +    +       +    +    +       +    +    +       +
TraX     +    +    +       +    +    +       +    +    +       +
Rect     +    +    +       +    +    +       +    +    +       +

Closure Properties under Boolean Operations
Which classes are closed under $\cap$, $\setminus$?
Table: closure under $\cap$ and $\setminus$ for the reference objects Poly, Tr, TraX, Rect against the transformations Affinity, Scaling, Translation (each Rat, Pol, Lin) and Id. The table contains both + and - entries; the cell layout is not recoverable from the extraction.
Property: A class is closed under $\cap$ iff it is closed under $\setminus$.

Closure Properties under Boolean Operations
Example:
A: rectangle, with transformation $(x,y) \mapsto (x-t, y-t)$
B: rectangle, with transformation $(x,y) \mapsto (x, y-t)$
Assume that both objects have the same time interval.
Figure (animation frames): different shapes of $A \cap B$ over time.
The intersection of two translating objects is a SCALING object that cannot be described as a union of translating objects (translation preserves length and area).
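The non-closure claim above can be checked numerically. The sketch below uses the two transformations from the example, A: (x,y) → (x−t, y−t) and B: (x,y) → (x, y−t). The reference rectangles `A_REF` and `B_REF` and the sample times are assumptions chosen for illustration, since the slides give only the transformations.

```python
# Axis-parallel rectangles are (xmin, ymin, xmax, ymax).
# Both reference rectangles are assumed; the slides do not specify them.
A_REF = (1.0, 0.0, 3.0, 1.0)
B_REF = (0.0, 0.0, 2.0, 1.0)

def translate(rect, dx, dy):
    """Shift a rectangle by (dx, dy)."""
    xmin, ymin, xmax, ymax = rect
    return (xmin + dx, ymin + dy, xmax + dx, ymax + dy)

def a_at(t):
    """Snapshot of A at time t: (x, y) -> (x - t, y - t)."""
    return translate(A_REF, -t, -t)

def b_at(t):
    """Snapshot of B at time t: (x, y) -> (x, y - t)."""
    return translate(B_REF, 0.0, -t)

def intersect(r, s):
    """Intersection of two rectangles, or None if it has no area."""
    xmin, ymin = max(r[0], s[0]), max(r[1], s[1])
    xmax, ymax = min(r[2], s[2]), min(r[3], s[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)

def area(rect):
    """Area of a rectangle."""
    return (rect[2] - rect[0]) * (rect[3] - rect[1])

times = [0.0, 0.5, 1.0, 2.0]
areas = [area(intersect(a_at(t), b_at(t))) for t in times]
print(areas)  # prints: [1.0, 1.5, 2.0, 1.0]
```

Each of A and B keeps its area for all t, yet the area of the intersection changes. A single translating rectangle, which would keep its area, therefore cannot describe it, in line with the slide's argument.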
Closure Properties under Boolean Operations
This result is very intuitive, see example:
Figure (animation frames): two scaling rectangles and their intersection over time.
The movement of the intersection can be described by a scaling.
The intersection of two linear scaling rectangles is a (union of) linear scaling rectangle(s) …

Closure Properties under Boolean Operations
Proof sketch: It can be shown that if
\[ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} a_1(t) & 0 \\ 0 & d_1(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e_1(t) \\ f_1(t) \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} a_2(t) & 0 \\ 0 & d_2(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e_2(t) \\ f_2(t) \end{pmatrix} \]
are the transformations of the original objects, the intersection’s transformation is described as:
\[ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \dfrac{a_1(t)\,x_{ul} - a_2(t)\,x_{lr} + e_1(t) - e_2(t)}{x_{ul} - x_{lr}} & 0 \\ 0 & \dfrac{d_1(t)\,y_{ul} - d_2(t)\,y_{lr} + f_1(t) - f_2(t)}{y_{ul} - y_{lr}} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} \dfrac{(a_2(t) - a_1(t))\,x_{ul}\,x_{lr} - e_1(t)\,x_{lr} + e_2(t)\,x_{ul}}{x_{ul} - x_{lr}} \\ \dfrac{(d_2(t) - d_1(t))\,y_{ul}\,y_{lr} - f_1(t)\,y_{lr} + f_2(t)\,y_{ul}}{y_{ul} - y_{lr}} \end{pmatrix} \]
All entries are linear in t whenever $a_i$, $d_i$, $e_i$, $f_i$ are linear.

Closure Properties under Boolean Operations
Table: closure under $\cap$ and $\setminus$ (repeated), with one class highlighted:
The only class of transformations that can be used for Boolean operations.

MLPQ/PReSTO System [Revesz et al., SIGMOD 2000]
• MLPQ: linear constraints
• GIS: operations special to GIS objects
• PReSTO (STDB): spatio-temporal database system
• RECURSIVE
Table: closure under $\cap$ and $\setminus$ (repeated), annotated: Rectangle based system

PReSTO
Example: Cloud over US
PReSTO (STDB): spatio-temporal database system

PReSTO
Data definition in PReSTO:
    Nebraska() :- i=1, x1=550, y1=340, x2=570, y2=355,
                  t>=0, t<=300, p=-1, s=0.
    clouds(h) :- i=1, x1 - t = 80, y1 - 0.5t = 100,
                 x2 - 1.1t = 220, y2 - 0.6t = 200,
                 t>=0, t<=300, p=-1, s=0, h=0.
Available operations/queries in PReSTO:
• Union, intersection, difference, complement;
• Projection, selection

Parametric Spatio-Temporal Objects: Conclusion
• Closure: for union trivial; for intersection and difference in some cases
• Solution: restrict to polygons or work with Boolean combination of atomic objects (AND-OR-NOT trees, see CSG)
• Advantage:
  - closure in definition
  - easier to construct more complicated objects (polygons with holes…)
Figure: objects a, b, c, d and Boolean combinations of them, such as a\b.

Representing Spatio-Temporal Phenomena using ADTs or Constraints
Problem with real spatio-temporal data:
(A) data comes with discrete observations
  • Within a snapshot (TINs)
  • In different snapshots
(B) data lacks clearly identifiable and delineated objects
(C) modeled movement/evolution is often irregular
(D) data does not have regular (polyhedral) 3D structure
Solution to (A), (C) and (D):
• Convert each snapshot to a set of polygons
• Interpolate/approximate between the snapshots

Other Challenges
Data models:
• type system
• representing uncertainty
• integrating different representations
• resolving inconsistencies
Query languages and interfaces:
• multi-dimensional aggregation (ST OLAP)
• visualization
• animation: explicit representations (ADTs) are more suitable than implicit ones (constraints)

Moving Objects Databases (MOD) [Wolfson et al., 1997, Su et al., 2001]
Data model:
• point moving in n-dimensional space (e.g., n=2)
• finite chain of infinitely differentiable functions $p: \mathbb{R} \to \mathbb{R}^2$
• MOD = finite set of moving objects
Operations:
• vel(p), acc(p),
• moving direction $= vel(p)/\|vel(p)\|$
• speed $= \|vel(p)\|$
• distance $= dist(p,q) = \min_t \|p(t) - q(t)\|$
• same direction

Moving Objects Databases (MOD) [Wolfson et al., 1997, Su et al., 2001]
Query languages:
• SQL3 [Forlizzi et al., SIGMOD 2000]
• relational calculus with built-in functions [Su et al., SSTD 2001]
• temporal logic [Wolfson et al., ICDE 1997]
Expressiveness:
• vel(p), acc(p), moving direction, speed, distance are enough to express all temporal and topological queries
• PTIME languages
• precision/uncertainty/probability

Final Look
Bottom-up:
• adding spatio-temporal constructs to existing spatial/GIS systems to cope with application demands
• closure problems, interoperability
Top-down:
• develop a general model (e.g., a constraint model) with clean semantics, properties and query languages
• will it be used?

More information at: http://alpha.luc.ac.be/~lucp1265/ [email protected]