Geographical information systems
Paper on:
Geographical Database
By
Myriam Benkirane
Supervised by : Dr. D. Kettani
Outline:
Introduction
I- What is a Geographical Database?
II- Data types.
III- Spatial Models.
IV- Spatial database types.
V- Spatial relationships.
VI- Spatial Database Elements.
VII- Querying.
VIII- Data structure and algorithms.
IX- Spatial indexing.
X- System design and architecture.
XI- Tools and standards.
XII- Geo-database in Morocco
Summary
Introduction:
A geographic information system (GIS) is a computer-based information
system that enables the capture, modeling, manipulation, retrieval, analysis and presentation
of geographically referenced data. Hence, data is one of the fundamental components of GIS
technology. One of the major problems that GIS technology has faced is data
acquisition. Some data still exists only on paper (maps), and some is stored in files but is
unstructured and poorly presented. At first, relational databases were used;
however, they required a large amount of storage space and processing time. Spatial databases
solved this problem and allowed data to be integrated into a single, uniform and efficient data store.
What is a spatial database?
A standard database management system was, and still is, used to store, update and
retrieve standard data. With the emergence of GIS (geographic information system)
technology, the storage and analysis of spatial data became a necessity.
Relational database systems were used in an attempt to implement and manage spatial databases;
however, storing spatial data in a standard database required an excessive amount of space
and made the retrieval and analysis of spatial data slow. A spatial database addresses
these needs and provides the features required by GIS and other spatially related
applications.
In the 1990s, pictorial (also called image) databases were used to support spatial
data; they store spatial data in the form of raster images (images, pictures, etc.).
Still, the growing need to deal with data as objects and not as images gave rise to
spatial databases. A spatial database is a database system that includes spatial data types
(for example point, line, region) and their relationships in its data model and query
language. It also provides efficient algorithms to implement spatial
indexing and spatial join methods. Spatial databases are the fundamental technology for
GIS and other applications.
In this paper, I present the different concepts of spatial databases.
I start by defining the data types, elements, relationships and models; then I
consider how querying, indexing and spatial join are implemented in a spatial database; finally,
I describe data structures and algorithms that can be used as tools or building blocks
within different system architectures.
Data Types:
Data can be classified according to two data models:
Figure 1: Vector model
Figure 2: Raster model
Vector model: It represents graphical data as points, lines or curves, or areas, with
attributes. Points in a vector system are defined by Cartesian coordinates. Lines or arcs
are represented as series of ordered points, whereas
areas or polygons are stored as ordered lists of points forming a closed boundary. Vector data
requires less computer storage space, and maintaining topological relationships is easier in
this system (see figure 1).
Raster model: A raster-based system displays, locates and stores graphical data by using
a matrix or grid of cells. These data are two-dimensional; a GIS stores information such as
forest cover, soil type, land use or other data in different layers using the raster model.
Raster data requires less processing than vector data, but it consumes more computer
storage space (see figure 2).
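The storage trade-off between the two models can be illustrated with a minimal Python sketch that stores the same rectangular parcel both ways. The names `parcel_vector` and `rasterize_box`, the 8 x 6 grid and the cell size of one unit are assumptions made for illustration only; they do not come from any GIS product.

```python
# Vector model: a polygon is an ordered list of (x, y) vertices.
parcel_vector = [(2.0, 1.0), (5.0, 1.0), (5.0, 4.0), (2.0, 4.0)]

def rasterize_box(vertices, width, height):
    """Mark every grid cell whose centre lies inside the axis-aligned box
    spanned by the vertices (sufficient for a rectangular parcel only)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    grid = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5  # centre of the unit cell
            if min(xs) <= cx <= max(xs) and min(ys) <= cy <= max(ys):
                grid[row][col] = 1
    return grid

# Raster model: the same parcel as a matrix of cells.
parcel_raster = rasterize_box(parcel_vector, width=8, height=6)

# Numbers that must be stored in each model.
vector_values = 2 * len(parcel_vector)                     # two coordinates per vertex
raster_values = sum(len(row) for row in parcel_raster)     # one value per cell
```

In this toy case the vector form needs 8 numbers while the raster form needs one value for each of the 48 cells, which matches the storage remark above; the raster form, on the other hand, answers "what is at this cell?" with a single array lookup.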
Spatial Models:
Modeling in GIS involves joining the spatial database to a computer-driven model
of some process or procedure. A model is a representation of the real world using a series
of mathematical formulas. Modeling allows two-dimensional or three-dimensional
spatial data to be manipulated and processed easily. Two views can be taken: either
objects in space, such as cities, rivers and buildings, or space itself, such as the land
partition of a country.
Figure 3: Point, line and region
Two modeling concepts are derived from these views and are supported by the
spatial DBMS (database management system): the first concerns single objects and
the second concerns spatially related collections of objects.
Single objects are represented by a point, a line or a region. A point
represents an object for which only its location in space is relevant, for instance a city. A
line represents the concept of moving through space or a connection in space (for example
roads and rivers). A region represents a two-dimensional object in space (a country, a lake).
Figure 3 shows the three types of single-object modeling.
Partitions and networks belong to the spatially related collections of objects. A
partition can be considered as a set of disjoint region objects for which the adjacency
relationship is often of concern; partitions are used to represent thematic maps. A network is a
graph embedded in the plane, consisting of a set of point (vertex) and line (edge) objects;
it is used to represent highways, power supply lines, rivers, etc.
Euclidean geometry is used to represent the various abstractions described above.
For instance, a point is given a pair of real-number coordinates. This can lead to
errors because computers represent numbers with finite precision rather than as true real
numbers. A query for the coordinates of a point of intersection may therefore return a point
that does not lie exactly on both intersecting lines. Approximations are
often used to handle this problem, and they depend on how indexing is implemented in the
spatial database.
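The precision problem can be reproduced in a few lines of Python. The sketch below is only an illustration: the helper names `intersect` and `on_line` and the choice of the two lines (49x = 1 and y = 0) are assumptions made here. The same intersection is computed once with floating-point numbers and once with exact rational numbers from the standard `fractions` module.

```python
from fractions import Fraction

def intersect(a1, b1, c1, a2, b2, c2):
    """Intersection of the lines a1*x + b1*y = c1 and a2*x + b2*y = c2,
    computed with Cramer's rule (the lines must not be parallel)."""
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

def on_line(a, b, c, point):
    """Exact test: does the point satisfy a*x + b*y = c?"""
    x, y = point
    return a * x + b * y == c

# Floating-point arithmetic: x is the rounded value of 1/49.
p_float = intersect(49, 0, 1, 0, 1, 0)
float_ok = on_line(49, 0, 1, p_float)

# Exact rational arithmetic: x is exactly 1/49.
F = Fraction
p_exact = intersect(F(49), F(0), F(1), F(0), F(1), F(0))
exact_ok = on_line(F(49), F(0), F(1), p_exact)
```

With floating-point numbers the computed point fails the membership test for the line it was derived from (`float_ok` is `False`), while the rational version passes (`exact_ok` is `True`). Exact arithmetic is slower, which is one reason practical systems fall back on tolerances or on approximations.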
Spatial Database types
Spatial data types, or a spatial algebra, define the abstractions point, line and region
together with the relationships between them. The ROSE algebra is an example of a spatial
algebra and includes the three data types points, lines and regions. Two sets of types are
defined: EXT = {lines, regions} and GEO = {points, lines, regions}.
There are four classes of operations performed on spatial data types:
1. Spatial predicates expressing topological relationships:
for each geo in GEO, for each ext1, ext2 in EXT, for each area in regions (area-disjoint):
geo × regions → bool              inside
ext1 × ext2 → bool                intersects, meets
area × area → bool                adjacent, encloses
The “inside” operator checks whether a point, line or region is inside a region and returns a
Boolean value. The “intersects” and “meets” operators test whether two values of
the same or different types within the set EXT intersect or meet. Finally, the “adjacent”
and “encloses” operators are applicable to regions belonging to a partition.
2. Operations returning atomic spatial data type values:
for each geo in GEO:
lines × lines → points            intersection
regions × regions → regions       intersection
geo × geo → geo                   plus, minus
regions → lines                   contour
3. Spatial operators returning numbers:
for each geo1, geo2 in GEO:
geo1 × geo2 → real                dist
regions → real                    perimeter, area
4. Spatial operations on sets of objects:
for each obj in OBJ, for each geo, geo1, geo2 in GEO:
set(obj) × (obj → geo) → geo                     sum
set(obj) × (obj → geo1) × geo2 → set(obj)        closest
“sum” is a spatial aggregate function that returns a single geometry for a set of objects, and
“closest” returns the objects of a set whose spatial attribute values are at minimum
distance from a given geometry.
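As an illustration of how a few of these signatures might look in code, here is a minimal Python sketch of `dist`, `perimeter`, `area`, `inside` and `closest`. It is not the ROSE algebra implementation: it assumes that points are `(x, y)` tuples, that a region is a simple polygon given as an ordered vertex list, and that `inside` and `closest` are restricted to point geometries.

```python
import math

def dist(p, q):
    """points x points -> real: Euclidean distance."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def perimeter(region):
    """regions -> real: sum of the edge lengths of the polygon."""
    n = len(region)
    return sum(dist(region[i], region[(i + 1) % n]) for i in range(n))

def area(region):
    """regions -> real: shoelace formula, valid for either vertex order."""
    n = len(region)
    twice = sum(region[i][0] * region[(i + 1) % n][1]
                - region[(i + 1) % n][0] * region[i][1] for i in range(n))
    return abs(twice) / 2

def inside(point, region):
    """points x regions -> bool: ray-casting test.
    Points lying exactly on the boundary are not treated specially."""
    x, y = point
    hit = False
    n = len(region)
    for i in range(n):
        (x1, y1), (x2, y2) = region[i], region[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit

def closest(objects, geo_of, target):
    """set(obj) x (obj -> points) x points -> set(obj):
    the objects whose geometry is at minimum distance from target."""
    best = min(dist(geo_of(o), target) for o in objects)
    return [o for o in objects if dist(geo_of(o), target) == best]
```

For example, with `square = [(0, 0), (4, 0), (4, 3), (0, 3)]`, `area(square)` is the area of a 4 by 3 rectangle and `inside((1, 1), square)` holds, while `closest` takes the attribute accessor as a function argument, mirroring the `(obj → geo1)` parameter in the signature.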
Spatial Relationships
A major objective of a GIS is to establish spatial relationships between mapped
geographic features. Possible relationships between objects are defined by the
intersections between the point sets representing the geometric objects. Three classes can be
defined. They are mutually exclusive and cover all possible cases.
Topological relationships: such as adjacent, inside, disjoint.
Direction relationships: for example above, below, north_of, southwest_of, etc.
Metric relationships: such as “distance < 100”.
A spatial database system is a relational DBMS extended by spatial data types and
spatial relationships. It can represent spatial objects such as cities, roads and rivers in
addition to the usual data types (integer, string, etc.).
Example of relations:
relation states (sname: STRING; area: REGION; spop: INTEGER)
relation cities (cname: STRING; center: POINT; ext: REGION; cpop: INTEGER)
relation rivers (rname: STRING; route: LINE)
How does it work?
Figure 4
Spatial data is stored using the coordinate system of a particular projection. That
projection is referenced by a Spatial Reference Identification Number (SRID); this
number corresponds to a row of another table in the database that lists all of the spatial
reference systems used (see figure 4). This allows the database to know which projection each
table is in and to re-project those tables for calculations. A GIS links spatial data
(a point on a map) with non-spatial data (tabular data). Every geographic
feature has at least one unique means of identification, such as a name; in other words,
locational information is linked to specific information in a database.
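The SRID lookup can be sketched with two small in-memory tables in Python. Everything here is illustrative: the table and function names (`spatial_ref_sys`, `cities`, `reference_system`, `same_reference`) are assumptions, the city coordinates are approximate, and only the two SRID numbers (4326 for WGS 84 and 3857 for WGS 84 / Pseudo-Mercator) are real EPSG codes.

```python
# Hypothetical spatial reference system table: SRID -> description.
spatial_ref_sys = {
    4326: "WGS 84 (geographic, degrees)",
    3857: "WGS 84 / Pseudo-Mercator (projected, metres)",
}

# Hypothetical feature table: every geometry carries the SRID of its coordinates.
# Coordinates are approximate (longitude, latitude) values for illustration.
cities = [
    {"cname": "Rabat", "center": (-6.84, 34.02), "srid": 4326},
    {"cname": "Ifrane", "center": (-5.11, 33.53), "srid": 4326},
]

def reference_system(feature):
    """Look up the reference system in which a feature's coordinates are expressed."""
    return spatial_ref_sys[feature["srid"]]

def same_reference(a, b):
    """Coordinates can be compared directly only when both SRIDs match;
    otherwise one geometry has to be re-projected first."""
    return a["srid"] == b["srid"]
```

The feature name ties the location to the tabular attributes, while the SRID tells the database how to interpret the coordinates.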
Spatial database elements:
Entity: a phenomenon of interest in reality that is not further subdivided into phenomena
of the same kind, for instance a city.
Object: a digital representation of all or part of an entity (a city may be represented by a
point or a region).
Entity types: similar phenomena to be stored in a database are identified as entity types
(road, river, etc.).
Attribute: a characteristic of an entity selected for representation.
Layers: spatial objects can be grouped into layers, also called overlays, coverages or
themes.
Metadata: a summary document providing content, quality, type, creation
and spatial information about a data set. It can be stored in any format, such as a text file,
Extensible Markup Language (XML) or a database record.
Spatial Reference System table: the table in the spatial database where all Spatial Reference
Identification Numbers are stored.
Querying:
Querying a spatial database is similar to querying a standard database. It
involves connecting the operations of the spatial algebra to the facilities of a DBMS
query language. Spatial selection and spatial join are the fundamental operations. A spatial
database should also provide graphical presentation of spatial data and query results, and
graphical input of the spatial data type values used in queries.
Spatial selection: an operation that returns the objects satisfying a spatial predicate with the
query object.
Examples:
“All cities in Morocco”
SELECT cname FROM cities c WHERE c.center inside Morocco.area
“All big cities no more than 100 km from Hagen”
SELECT cname FROM cities c WHERE dist(c.center, Hagen.center) < 100 and
c.cpop > 500000
Spatial join: a join which compares any two objects through a predicate on their spatial
attribute values.
Example:
“For each river, find all cities within less than 50 km.”
cities rivers join[dist(center, route) < 50]
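The two operations can be mimicked in Python with plain loops. The sketch below is a naive illustration without any index: the sample rows are made up, coordinates are treated as planar kilometres, and the helper names (`dist_point_segment`, `dist_point_line`, `spatial_select`, `spatial_join`) are assumptions introduced here.

```python
import math

# Made-up sample rows: (cname, center, cpop) and (rname, route as a polyline).
cities = [
    ("Alpha", (0.0, 0.0), 900_000),
    ("Beta", (30.0, 40.0), 600_000),
    ("Gamma", (200.0, 0.0), 50_000),
]
rivers = [("R1", [(0.0, 10.0), (100.0, 10.0)])]

def dist_point_segment(p, a, b):
    """Distance from point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length2 = dx * dx + dy * dy
    if length2 == 0:
        t = 0.0
    else:
        # Projection parameter of p on the segment, clamped to [0, 1].
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length2))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def dist_point_line(p, route):
    """Distance from a point to a polyline: minimum over its segments."""
    return min(dist_point_segment(p, a, b) for a, b in zip(route, route[1:]))

def spatial_select(center, radius, min_pop):
    """Spatial selection: big cities closer than radius to the query point."""
    return [name for name, c, pop in cities
            if math.hypot(c[0] - center[0], c[1] - center[1]) < radius
            and pop > min_pop]

def spatial_join(max_dist):
    """Nested-loop spatial join: (river, city) pairs closer than max_dist."""
    return [(rname, cname)
            for rname, route in rivers
            for cname, c, _ in cities
            if dist_point_line(c, route) < max_dist]
```

The nested loop compares every river with every city, which is exactly the cost that spatial indexing and the filter-and-refine strategy are meant to reduce.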
There are also operations for the manipulation of partitions (thematic maps):
Overlay: computes the elementary regions resulting from overlaying two partitions.
Fusion: objects are grouped by the values of some attribute.
Voronoi: computes the Voronoi diagram of a set of points. For each point p of the set, the
region consists of the points of the plane closer to p than to any other point of the set.
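The Voronoi definition translates directly into a nearest-site test. The following Python sketch does not construct the diagram's region boundaries; it only decides which region a query point falls in, and labels the cells of a small grid that way. The function names are assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math

def voronoi_owner(q, sites):
    """Return the site whose Voronoi region contains q, i.e. the nearest site.
    On a tie, the first of the tied sites is returned."""
    return min(sites, key=lambda s: math.dist(s, q))

def label_grid(sites, width, height):
    """Label each unit cell of a width x height grid with the index of the
    site nearest to the cell centre (a raster view of the Voronoi diagram)."""
    return [[sites.index(voronoi_owner((col + 0.5, row + 0.5), sites))
             for col in range(width)]
            for row in range(height)]
```

For instance, with the sites `[(0, 0), (4, 0)]`, the cells of a 4 x 1 grid left of x = 2 are labelled 0 and the others 1, since the boundary between the two regions is the perpendicular bisector of the sites.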
Graphical presentation of the spatial data type values returned by a query is of great
importance in a spatial database. In addition, a spatial DBMS (SDBMS) should be able to display
graphically the combination of several query results.
The following requirements for spatial querying are generally agreed upon:
- Spatial data types.
- Graphical display of query results.
- Graphical combination (overlay) of several query results.
- Display of context (e.g., a background such as a raster (satellite) image or
boundaries).
- A legend that clarifies the assignment of graphical representations to object classes.
Data Structure and algorithms:
An important issue in spatial database systems is the integration of the spatial
algebra with DBMS querying. For example, the representation of a spatial
data type should be compatible with both the DBMS view and the spatial algebra view.
The representation from the DBMS view:
- is the same as that of attribute values of other types with respect to generic
operations.
- can have varying and possibly very large size.
- resides on disk and is stored in one page or a set of pages.
- can efficiently be loaded into main memory.
The representation from the spatial algebra view:
- is a value of some programming language data type, e.g. region.
- is some arbitrary data structure which is possibly quite complex.
- supports efficient computational geometry algorithms for spatial algebra
operations.
In addition to the points discussed above, the spatial DBMS should support:
Approximations: some approximation of each object (e.g. the MBR, minimum bounding
rectangle) is stored to speed up operations.
Stored unary function values: values such as perimeter or area can be stored once the object
is constructed to avoid repeating expensive computations.
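Both ideas, a stored approximation and stored unary function values, can be shown on a small Python class. This is a sketch under assumptions: the class name `Region`, the tuple layout `(xmin, ymin, xmax, ymax)` and the helper `mbrs_intersect` are chosen here for illustration, and `functools.cached_property` requires Python 3.8 or later. Unlike the description above, the area is stored on first access rather than at construction time.

```python
from functools import cached_property

class Region:
    """Polygon that stores its MBR and caches an expensive unary value."""

    def __init__(self, vertices):
        self.vertices = list(vertices)
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        # Approximation computed once: (xmin, ymin, xmax, ymax).
        self.mbr = (min(xs), min(ys), max(xs), max(ys))

    @cached_property
    def area(self):
        """Shoelace formula; computed on first access, then stored."""
        v = self.vertices
        n = len(v)
        twice = sum(v[i][0] * v[(i + 1) % n][1] - v[(i + 1) % n][0] * v[i][1]
                    for i in range(n))
        return abs(twice) / 2

def mbrs_intersect(a, b):
    """Cheap test on two MBRs; if it fails, the exact objects cannot intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
```

Comparing two MBRs takes four comparisons regardless of how many vertices the regions have, which is why the approximation speeds up operations.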
Spatial Indexing:
The operations are implemented using computational
geometry algorithms together with query processing access methods, that is, spatial indexing,
which supports spatial selection and spatial join.
Spatial indexing supports all kinds of spatial queries, especially spatial selection and
spatial join. It organizes objects and space so that only part of the space
needs to be considered to answer a query. There are two approaches to spatial indexing:
either dedicated external spatial data structures are added to the spatial DBMS, or the spatial
objects are mapped into one-dimensional space so that standard index structures can be used.
Approximations are the fundamental concept used by spatial indexing. There are two kinds of
approximation:
- Continuous approximations: the object is represented by its minimum bounding rectangle,
which is derived from the coordinates of the object it encloses.
- Grid approximations: space is divided into cells, and the object is represented by
the set of cells that it intersects.
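A grid approximation, and the mapping of its cells into one-dimensional space, might look as follows in Python. The sketch makes two simplifying assumptions: the cells are derived from the object's MBR rather than from its exact geometry (so the result can include cells the object itself does not touch), and the one-dimensional mapping is a Z-order (Morton) key obtained by bit interleaving, which is one common choice and not necessarily the one used by any particular system. Coordinates are assumed to be non-negative.

```python
def grid_cells(mbr, cell_size):
    """Cells (col, row) of a uniform grid that overlap the rectangle
    mbr = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    c0, c1 = int(xmin // cell_size), int(xmax // cell_size)
    r0, r1 = int(ymin // cell_size), int(ymax // cell_size)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

def z_order(col, row, bits=16):
    """Interleave the bits of col and row into a single integer key,
    so that cells can be stored in an ordinary one-dimensional index."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)       # even bit positions: col
        key |= ((row >> i) & 1) << (2 * i + 1)   # odd bit positions: row
    return key

# One-dimensional keys for an object whose MBR is (0.5, 0.5, 2.5, 1.5).
keys = sorted(z_order(c, r) for c, r in grid_cells((0.5, 0.5, 2.5, 1.5), 1.0))
```

Cells that are close in the plane tend to receive close keys, which is what lets a standard one-dimensional index serve spatial queries reasonably well.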
A search using a spatial index takes place in two steps:
Filtering step: find all the MBRs that satisfy the query.
Refinement step: for each qualifying MBR, check the original object against the query.
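The two steps can be sketched in Python for the query "which regions contain this point?". The code is illustrative only: the function names are assumptions, regions are simple polygons given as vertex lists, and the filter step scans all MBRs linearly where a real system would consult an index.

```python
def mbr(vertices):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a polygon."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))

def point_in_mbr(p, box):
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]

def point_in_polygon(p, vertices):
    """Exact ray-casting test (boundary points are not treated specially)."""
    x, y = p
    hit = False
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                hit = not hit
    return hit

def regions_containing(point, regions):
    """Return (candidates, answers) for a dict mapping names to polygons."""
    # Filter step: cheap test on the approximations.
    candidates = [name for name, v in regions.items()
                  if point_in_mbr(point, mbr(v))]
    # Refinement step: exact test on the surviving objects only.
    answers = [name for name in candidates
               if point_in_polygon(point, regions[name])]
    return candidates, answers
```

With a triangle `[(0, 0), (4, 0), (0, 4)]`, the point `(3, 3)` passes the filter step because it lies in the triangle's MBR, but it is rejected by the refinement step; this is the kind of false hit that makes the second step necessary.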
We can conclude that spatial index structures store either a set of points
(for point values) or a set of rectangles (for line and region values).
These are some queries supported by points:
Range query: Find all points within a query rectangle.
Nearest neighbour: Find the point closest to a query point.
Distance scan: Enumerate points in increasing distance from a query point.
These are the queries supported by rectangles:
Intersection query: Find all rectangles intersecting a query rectangle.
Containment query: Find all rectangles completely within a query rectangle.
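To make the five query types concrete, here is a naive Python version of each. These are linear scans over plain lists, intended only to pin down what each query returns; a real spatial index answers the same queries without visiting every entry. The function names are assumptions, rectangles are `(xmin, ymin, xmax, ymax)` tuples, and `math.dist` requires Python 3.8 or later.

```python
import heapq
import math

def range_query(points, rect):
    """All points within the query rectangle."""
    xmin, ymin, xmax, ymax = rect
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]

def nearest_neighbour(points, q):
    """The point closest to the query point."""
    return min(points, key=lambda p: math.dist(p, q))

def distance_scan(points, q):
    """Yield points in increasing distance from the query point."""
    heap = [(math.dist(p, q), p) for p in points]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)[1]

def intersection_query(rects, q):
    """All rectangles intersecting the query rectangle."""
    return [r for r in rects
            if r[0] <= q[2] and q[0] <= r[2] and r[1] <= q[3] and q[1] <= r[3]]

def containment_query(rects, q):
    """All rectangles completely within the query rectangle."""
    return [r for r in rects
            if q[0] <= r[0] and q[1] <= r[1] and r[2] <= q[2] and r[3] <= q[3]]
```

`distance_scan` is written as a generator so that a caller can stop after the first few results, which is the usual way a distance scan is consumed.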
System Design and Architecture:
The system architecture should integrate tools to support spatial data types, spatial
relationships and spatial indexing. The standard DBMS should be extended to allow for:
- Representations for the data types of a spatial algebra.
- Procedures for the atomic operations.
- Spatial index structures.
- Access operations for spatial indices.
- Filter and refine techniques.
- Spatial join algorithms.
- Spatial data types and operations within data definition and query language.
- User interface extensions to handle graphical representation and input of SDT
(spatial data type) values.
Research has been carried out to create a relational database management system
that includes the points described above. In addition, the user of the extended database
should find no difference between spatial and standard data as regards data types, operations,
relationships, querying and indexing principles. The new system architecture (the integrated
architecture) should treat them in the same way.
Building a spatial database from scratch is complex and expensive. Building an
extended relational database is more efficient because it allows functionality to be added
and is open to continuous development and research.
Example of system architecture
Spatial data is stored in a SQL Server 7.0 database.
ArcGIS updates spatial data.
ArcGIS performs spatial analysis, and develops and plots maps as needed.
CadClient facilitates the exchange of data between the spatial database and AutoCAD.
ArcIMS deploys spatial information over the web.
ArcSDE holds all the pieces together.
Tools and standards:
The best known and most widely used ones are:
ArcSDE: ArcSDE is a geographic application server that uses a DBMS to store
vector, raster and survey data. ArcSDE is able to manage large spatial data sets with
common standards and data models. ArcSDE supports the ArcGIS geodatabase model and is
integrated with the functions of spatially enabled databases such as IBM's DB2 Spatial
Extender, IBM Informix's Spatial DataBlade, Oracle Locator and Oracle Spatial. It
allows the geodatabase to use the extended spatial types of a spatially enabled DBMS to
store and manage feature geometry. ArcSDE works with IBM DB2, IBM Informix,
Microsoft SQL Server and Oracle.
Oracle Spatial and Oracle Locator: Oracle Locator is a feature of
Oracle Database 10g Standard and Enterprise Editions that provides the core location
functionality needed by most customer applications to support a variety of location-based
services (LBS) and third-party GIS solutions. Oracle Spatial is an option for Oracle
Enterprise Edition that provides advanced spatial features to support high-end GIS and
LBS solutions. Oracle MapViewer is an Oracle Application Server Java component and
JDeveloper extension used for map rendering and viewing geospatial data managed by
Oracle Spatial or Locator.
Query languages used:
Extended SQL: GeoSQL. GeoSQL works with all major GIS formats, including ESRI,
MapInfo and Autodesk, as well as Oracle and ODBC. It can capture large
amounts of data and turn them into usable information.
Graphical language: GEO-SAL, also a query language for spatial data analysis.
Spatial databases follow OpenGIS standards; for instance, any contract requiring
internet deployment of spatial data in the form of maps should be undertaken using OpenGIS
standards. These standards for geospatial information are developed and
administered by the Open GIS Consortium (OGC), which supports interoperable
geospatial data integration and services over the web.
Geo-Database or GIS data in Morocco:
GIS is a technology that is still evolving in Morocco. However, free data is not
available, which makes it difficult for developers to design an efficient and complete GIS
application.
GIS data about Morocco can be bought from this site:
http://data.geocomm.com/catalog/MO/
or from:
http://www.geo-strategies.com/digitaldata/availability/world/morocco.htm
Summary
In this paper, I have tried to present the major concepts of spatial databases and spatial
database systems. A spatial database extends the relational database model: it provides
spatial data types, spatial data models and operators; it offers methods for efficient
processing of spatial queries; and it supports spatial indexing, spatial selection and spatial
join. Nowadays, spatial databases are widely used for web applications, data hosting,
data management for desktop applications and custom applications. I conclude
that without a spatial database, GIS as well as other applications cannot be
designed efficiently.