GIS DATABASES
an overview
Contents
– the basics of data storage
– overview of databases
• the database approach
• types of databases
• databases in GIS
– design considerations
– development of an ARC/INFO database
Conceptual, logical and physical ...
Conceptual
Logical
Physical
A storage hierarchy ...
– files/tables
• records
• fields(types …)
– databases
– information systems
– decision support systems (DSS)
increasing complexity
– approaches to storage
• application/file based
• databases
Application based approach
– Tax/Rates Assessment: Assessment Data
– Permits: Permit Data
– Sewer Maintenance: Sewer Data
Applications using data stored as application-specific data
Database approach
– applications: Tax/Rates Assessment, Permits, Sewer Maintenance
– all applications access data through the Database Management System
– shared data: Assessment Data, Permit Data, Sewer Data
Database approach and use of shared data: implications for GIS
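The contrast between the two approaches can be sketched in a few lines. This is a minimal illustration only, assuming Python's built-in sqlite3 module stands in for the DBMS; the table, the column names and the two application functions are hypothetical and do not come from the slides.

```python
import sqlite3

# One shared store stands in for the DBMS; every application goes through it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel ("
    " parcel_id INTEGER PRIMARY KEY,"
    " assessed_value REAL,"
    " sewer_zone TEXT)"
)
conn.execute("INSERT INTO parcel VALUES (1, 120000.0, 'A')")
conn.execute("INSERT INTO parcel VALUES (2, 95000.0, 'B')")

def assessment_total(db):
    """Tax/Rates Assessment application: reads assessed values."""
    return db.execute("SELECT SUM(assessed_value) FROM parcel").fetchone()[0]

def sewer_parcels(db, zone):
    """Sewer Maintenance application: reads the same parcel records."""
    rows = db.execute(
        "SELECT parcel_id FROM parcel WHERE sewer_zone = ? ORDER BY parcel_id",
        (zone,),
    )
    return [r[0] for r in rows]

# An update made once is visible to every application; no second copy needs fixing.
conn.execute("UPDATE parcel SET sewer_zone = 'A' WHERE parcel_id = 2")
print(assessment_total(conn))
print(sewer_parcels(conn, "A"))
```

In the application based approach each function would instead read its own file, so the zone change would have to be repeated in every copy.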
Database … a definition
• A collection of interrelated data stored
together with controlled redundancy to
serve one or more applications in an optimal
fashion.
• A common and controlled approach is used
in adding new data and modifying and
retrieving existing data within the database.
Databases… objectives/advantages
– centralised data storage and management …
global view of data … data dictionary
• standardisation of all aspects of data management
• reduced duplication
• multiple access / retrieval flexibility
• integrity constraints … validation enforced
• ...
– database management system (DBMS)
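As an illustration of "integrity constraints … validation enforced", the sketch below lets the DBMS reject invalid rows. It assumes Python's built-in sqlite3 module; the permit table, its status codes and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Validation rules live in the schema, so every application is held to them.
conn.execute(
    "CREATE TABLE permit ("
    " permit_id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL CHECK (status IN ('open', 'approved', 'rejected')),"
    " fee REAL CHECK (fee >= 0))"
)

def try_insert(db, permit_id, status, fee):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        db.execute("INSERT INTO permit VALUES (?, ?, ?)", (permit_id, status, fee))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(conn, 1, "open", 50.0))      # valid row
print(try_insert(conn, 2, "pending", 50.0))   # status not in the permitted set
print(try_insert(conn, 3, "approved", -5.0))  # negative fee
```

Because the rule is stored with the data, no application can bypass it by accident.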
Database/s… data dictionary
– the most critical (?) element of a database
– data about data… metadata
– essential for system development
– uses include
• design - entities and data relationships
• data capture - entry/validation
• operations - program documentation
• maintenance (impact assessment of proposed
changes, est. of effort, cost …)
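A data dictionary can be queried like any other data. The sketch below is one possible illustration, assuming SQLite (through Python's sqlite3 module) as the DBMS; the sewer_line table and the describe helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sewer_line ("
    " line_id INTEGER PRIMARY KEY,"
    " material TEXT NOT NULL,"
    " diameter_mm REAL)"
)

def describe(db, table):
    """Return (column name, declared type, required?) for each column of a table."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = db.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, bool(notnull)) for _, name, col_type, notnull, _, _ in rows]

for entry in describe(conn, "sewer_line"):
    print(entry)
```

A listing like this supports the uses named above: it documents the design, drives entry validation and shows what a proposed change would touch.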
Data dictionary…
types of information (general)
GIS Metadata
DBMS … key modules
– a data description/definition module
• defines/creates/restructures
• enforces rules
– a query module
• retrieval for queries, ad-hoc queries, simple reports
– a report writing program
– a high level language interface
– ...
Database… stages of development
– information systems plan for organisation
– system specification … user needs analysis
– conceptual design … data modelling
• hardware and software independent
– physical design … database design
– database implementation
– monitoring/audit
Organisational strategy and IT
Land Information System (LIS) (i)
– Problems/issues:
• rationalisation of land related information in
government agencies
• the removal/reduction of duplication
• introduction of economies in data capture,
maintenance and storage
• better (and wider) access to data
solutions ...
Organisational strategy and IT
Land Information System (LIS) (ii)
– Solutions:
• better data distribution mechanism (data format and
location transparent to user)
• knowledge of data distribution built into the data
dictionary
• reduction of data duplication
• uniform query language (SQL)
• coding and data interchange standardisation ( …
SDTS)
Database types - a history
Evolution of database technology
Database types - hierarchical (i)
– lends itself to GIS use as data are often
hierarchical in structure e.g. municipality x
province x country
– records divided into logically related fields …
connected in a tree-like arrangement
– master field in each group of records …
pointers … updates require pointers to be
modified
– fast preset queries … ad hoc queries difficult or
impossible
Database types - hierarchical (ii)
COUNTRY (USA)
States
Counties
Boundaries
Nodes
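The tree above can be sketched as records joined by parent-to-child pointers. This is an illustrative model only, not how any particular hierarchical DBMS stores data; the Record class, the two query functions and the sample names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """One record in a hierarchy: a single parent, any number of children."""
    name: str
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

def follow(root, path):
    """Preset query: walk parent-to-child pointers along a known path."""
    node = root
    for name in path:
        node = next(c for c in node.children if c.name == name)
    return node

def find_anywhere(root, name):
    """Ad hoc query: no pointer helps, so every record may have to be visited."""
    stack = [root]
    visited = 0
    while stack:
        node = stack.pop()
        visited += 1
        if node.name == name:
            return node, visited
        stack.extend(node.children)
    return None, visited

usa = Record("USA")
state = usa.add(Record("State 1"))
usa.add(Record("State 2"))
county = state.add(Record("County 1"))
county.add(Record("Boundary 1"))

print(follow(usa, ["State 1", "County 1"]).name)
print(find_anywhere(usa, "Boundary 1")[1])
```

The preset query touches only the records on its path, while the ad hoc search has no choice but to scan, which is the trade-off noted for hierarchical databases.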
Hierarchical structure for a cadastral database
Database types - network (i)
– similar to hierarchical but have multiple
connections between files to accommodate
many to many (M:M) relationships
– access to a particular file without searching the
entire hierarchy above that file
– linked records … quick preset searches … large
overhead in pointer management
– modification after creation difficult
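A small sketch of the network idea, under the assumption that each relationship is held as a pair of pointers. The Node class, the helper functions and the owner/parcel names are hypothetical.

```python
class Node:
    """A record in a network database; links may point to any number of records."""
    def __init__(self, name):
        self.name = name
        self.links = set()

def connect(a, b):
    # Each relationship is stored as a pair of pointers, one in each record.
    a.links.add(b)
    b.links.add(a)

def disconnect_all(node):
    # Removing a record means repairing the pointers held by every linked record.
    for other in list(node.links):
        other.links.discard(node)
    node.links.clear()

owner_a, owner_b = Node("Owner A"), Node("Owner B")
parcel_1, parcel_2 = Node("Parcel 1"), Node("Parcel 2")

# Many-to-many: an owner holds several parcels, a parcel has several owners.
connect(owner_a, parcel_1)
connect(owner_a, parcel_2)
connect(owner_b, parcel_1)

print(sorted(n.name for n in parcel_1.links))
disconnect_all(owner_a)
print(sorted(n.name for n in parcel_1.links))
```

Following a link is immediate, but every change has to keep both ends of every pointer consistent, which is the pointer-management overhead mentioned above.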
Database types - network (ii)
Database types - relational (i)
– model developed from mathematics
– records and fields in a 2-dimensional table
– no pointers etc … any field can be used to link
one table to another
– normalisation … redundancy/stable structure
– ad hoc queries (SQL) … modifications easy
– not very efficient for GIS … SQL3
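The point that any field can link one table to another is easiest to see in a join. The sketch below assumes Python's built-in sqlite3 module; the parcel and zoning tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, zone_code TEXT);
    CREATE TABLE zoning (zone_code TEXT PRIMARY KEY, description TEXT);
    INSERT INTO parcel VALUES (1, 'R1'), (2, 'C1'), (3, 'R1');
    INSERT INTO zoning VALUES ('R1', 'Residential'), ('C1', 'Commercial');
    """
)

# An ad hoc query: the link is made on matching field values, not stored pointers.
rows = conn.execute(
    "SELECT p.parcel_id, z.description"
    " FROM parcel AS p JOIN zoning AS z ON p.zone_code = z.zone_code"
    " WHERE z.description = 'Residential'"
    " ORDER BY p.parcel_id"
).fetchall()
print(rows)
```

Nothing in either table had to anticipate this question, which is why ad hoc queries and later modifications are easy in the relational model.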
Database types - relational (ii)
Database types - relational (iii)
Hierarchical structure
Network structure
Relational structure
(part…)
Centralised vs distributed
– a database does not necessarily mean a
centralised arrangement i.e. all data in one
physical place
GIS and distributed databases
...
– trend towards open systems ...
• special hardware and software can be used widely
… specific applications optimised
• system/network communication is easier
– modular implementation from an overall design
… incremental change
– unlimited capacity (nodes) … lower risks
Approaches to GIS system design
– develop a proprietary system
– develop a hybrid system: proprietary graphics +
commercial DBMS for attribute data (e.g.
ARC/INFO)
– use commercial DBMS and develop spatial
functions and graphics display used in
geographic analysis (e.g. siroDBMS, System9)
– develop a spatial DBMS from scratch
Approaches to GIS system design
(1) Separate spatial and attribute data (software linkages)
(2) Integrated spatial and attribute data
GIS databases … some problems (i)
– centralised risk
• centralisation demands better quality control; otherwise higher
potential for disaster
– cost
• large DBMSs are expensive to design, implement and operate
• piecemeal design is difficult
– complexity
• need to keep track of complex hardware and software
• need to keep track of graphical as well as attribute data and the
links
GIS databases … some problems (ii)
Cascading effects of change in a GIS database (ESRI 1989)
GIS Design
GIS database design guide
Objectives of design
– a good design results in a database which:
• contains necessary data but no redundant data
• organises data so that different users access the same
data
• accommodates different views of the data
• distinguishes applications which maintain data from
those that use it
• appropriately represents, codes and organises
geographic features
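One common way to accommodate different views of the same data is a database view. The sketch below is an assumption about how that objective might be met, using Python's built-in sqlite3 module; the parcel table and planning_view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        owner_name TEXT,
        assessed_value REAL,
        zone_code TEXT
    );
    INSERT INTO parcel VALUES (1, 'Owner A', 120000.0, 'R1');
    INSERT INTO parcel VALUES (2, 'Owner B', 95000.0, 'C1');

    -- Planning users see zoning only; the assessor's figures stay out of their view.
    CREATE VIEW planning_view AS SELECT parcel_id, zone_code FROM parcel;
    """
)

cursor = conn.execute("SELECT * FROM planning_view ORDER BY parcel_id")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

Both groups work from one stored copy of the parcel records, so the data is neither duplicated nor split.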
Design methodology (for ARC/INFO)
– conceptual model
• model the users’ view
• define entities and their relationships
– logical model
• identify representation of entities
• match to ARC/INFO data model
• organise into geographic data sets
– physical model
Design methodology (for ARC/INFO)
1. Model the users’ view
2. Define entities and their relationships
3. Identify representation of entities
4. Match to ARC/INFO data model
5. Organise into geographic data sets
1. Model the users’ view
– create a model of work performed by users for
which ‘location’ is a factor
• identify organisational functions
• identify the data which supports the functions
– organise data into sets of geographic features
• data function matrix
– high level classification of data
– interdependence of data and function
– difference between users and creators of data
Land development management function
Data function matrix …an example
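The matrix shown on the original slide is not reproduced in this transcript. As a stand-in, the sketch below shows the general shape of a data function matrix; every function name, data set and C/U entry in it is a hypothetical example, not the slide's content.

```python
# Rows are organisational functions, columns are data sets.
# "C" marks a function that creates or maintains the data, "U" one that only uses it.
MATRIX = {
    "Land development management": {"Parcels": "U", "Zoning": "C", "Permits": "C"},
    "Tax assessment": {"Parcels": "C", "Zoning": "U"},
    "Sewer maintenance": {"Parcels": "U", "Sewer lines": "C"},
}

def functions_with(data_set, role):
    """List the functions that have the given role ("C" or "U") for a data set."""
    return sorted(f for f, row in MATRIX.items() if row.get(data_set) == role)

def shared_data_sets():
    """Data sets touched by more than one function: candidates for shared storage."""
    counts = {}
    for row in MATRIX.values():
        for data_set in row:
            counts[data_set] = counts.get(data_set, 0) + 1
    return sorted(d for d, n in counts.items() if n > 1)

print(functions_with("Parcels", "C"))
print(functions_with("Parcels", "U"))
print(shared_data_sets())
```

Reading down a column separates the creators of a data set from its users, and reading across a row shows which data a function depends on.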
2. Define entities and their relationships
– entities: distinguishable objects which have a
common set of properties
• identify and describe entities
• identify and describe the relationship among these
entities
• document the process
– diagrams
– data dictionary
• Normalise the data
Entity/relationship definition
Diagramming … entities
Normalisation
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
ASR - Assessor
Underlying entities...
Parcel, Zoning, Owner, Ownership
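The worked example on the original slides is missing from this transcript. The sketch below shows how a flat assessor listing might decompose into the four underlying entities; all attribute names and values are hypothetical.

```python
# A flat, unnormalised listing: owner and zoning details repeat on every row.
flat = [
    {"parcel_id": 1, "zone_code": "R1", "zone_desc": "Residential",
     "owner_id": 10, "owner_name": "Owner A"},
    {"parcel_id": 1, "zone_code": "R1", "zone_desc": "Residential",
     "owner_id": 11, "owner_name": "Owner B"},
    {"parcel_id": 2, "zone_code": "C1", "zone_desc": "Commercial",
     "owner_id": 10, "owner_name": "Owner A"},
]

# Each fact is stored once, keyed by the entity it describes.
parcel = {r["parcel_id"]: r["zone_code"] for r in flat}   # Parcel(parcel_id, zone_code)
zoning = {r["zone_code"]: r["zone_desc"] for r in flat}   # Zoning(zone_code, zone_desc)
owner = {r["owner_id"]: r["owner_name"] for r in flat}    # Owner(owner_id, owner_name)
# Ownership(parcel_id, owner_id) carries the many-to-many link and nothing else.
ownership = sorted({(r["parcel_id"], r["owner_id"]) for r in flat})

print(parcel, zoning, owner, ownership)
```

In the flat listing the owner name depends only on owner_id, which is part of the key, and the zone description depends on zone_code rather than on the parcel. Moving them to their own tables removes those partial and transitive dependencies, which is what 2NF and 3NF ask for.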
3. Identify representation of entities
– determine the most effective spatial
representation for geographic features
– consider whether:
• a feature might be represented on a map
• the shape of a feature might be significant in
performing geographic analysis
• the feature will have different representations at
different map scales
• textual attributes of the feature will be displayed on
map products
• ...
4. Match to ARC/INFO data model
– determine the appropriate ARC/INFO
representation for entities
• points, lines, polygons
– ensure complex feature classes are supported
• a route is composed of sections, which in turn are based
on arcs
• a region is composed of polygons
• an event is a point or a line which occurs along a route
– others (e.g. GRID, TIN)
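The idea of an event located along a route can be sketched as a distance measured along a polyline. This illustrates the general concept only and is not ARC/INFO code; the locate function and the sample coordinates are hypothetical.

```python
import math

def locate(route, measure):
    """Return the (x, y) point lying `measure` units along a polyline route."""
    if measure < 0:
        raise ValueError("measure must not be negative")
    remaining = measure
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if length > 0 and remaining <= length:
            t = remaining / length
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= length
    raise ValueError("measure lies beyond the end of the route")

# A route built from two straight arcs: 4 units east, then 3 units north.
route = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
print(locate(route, 2.0))  # a point event on the first arc
print(locate(route, 5.0))  # a point event on the second arc
```

Storing an event as a route identifier plus a measure means the event needs no geometry of its own; a line event would simply carry a start and an end measure.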
Matching to ARC/INFO data model
Entity | Spatial type | ARC/INFO | Related to | Coverage | Attribute files | Anno. | LUT
5. Organise into geographic data sets
– to identify and name the geographic data sets
that will contain the various entities:
• define the contents of geographic data sets
(coverages, grids etc)
• name workspaces, geographic data sets, entities and
attributes
• complete entity definitions
• add cartographic text and lookup tables
5 (i) Define the content of geographic data sets
– data sets supported: coverage, grid, tin, image
and drawing
– coverages: several entities can be grouped into a
single coverage
– DBMS : stored in a separate database
management system
5 (ii) Geographic datasets, entities and attributes
– coverage definitions
• high level summary of the data physically stored in
the database
• required for defining the coverage structure
– file naming conventions in ARC/INFO
5 (iii) Complete entity definitions
– background information: coverage name, data
source, agency, number of records etc.
– attribute definition
• attribute name, type, field width
• validation rules/ permitted values
5 (iv) Cartographic text & code tables
– annotation (text, placing rules etc)
– look up tables
• pre defined set of values
• description/ labels
• means of creating displays based on attribute values
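A lookup table can be sketched as a simple mapping from code to label and symbol. The codes, labels and colours below are hypothetical and only illustrate the three roles listed above.

```python
# Hypothetical lookup table: each predefined code carries a label and a display colour.
LAND_USE_LUT = {
    "R1": {"label": "Residential", "colour": "yellow"},
    "C1": {"label": "Commercial", "colour": "red"},
    "OS": {"label": "Open space", "colour": "green"},
}

UNKNOWN = {"label": "Unclassified", "colour": "grey"}

def symbolise(code):
    """Return the label and colour used to draw a feature with this attribute value."""
    return LAND_USE_LUT.get(code, UNKNOWN)

def is_valid(code):
    """Only codes in the predefined set are acceptable attribute values."""
    return code in LAND_USE_LUT

parcels = [(1, "R1"), (2, "C1"), (3, "XX")]
legend = [(pid, symbolise(code)["colour"]) for pid, code in parcels]
print(legend)
```

The same table serves data entry (is_valid) and display (symbolise), so a new land-use class is added in one place.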
Robinson (Ch 14): Scale and GIS databases
– (past) map’s scale greatly influenced map
content and data resolution
– GIS data are ‘scaleless’ … scale is still a critical
factor with digital databases - because of the
ways in which we create digital databases
– scale and resolution (Tab 14.1)
Robinson (Ch 14): Scale and resolution issues
– symbolisation and display problems
– handling databases of different scales
• join problems (e.g. urban/rural)
• merge problems (different themes)
• scale levels
– in general
– large scale data (AM/FM etc.)
Robinson (Ch 15): Managing large GIS
– Data organisation
• partitioning
• spatial indexes
• metadata
– data compression
• run length encoding (RLE)
• quadtree encoding
• others ...
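Run length encoding is simple enough to show in full. The sketch below encodes one raster row as (value, run length) pairs; the function names and the sample row are illustrative choices, not taken from Robinson.

```python
from itertools import groupby

def rle_encode(row):
    """Collapse each run of equal cell values into a (value, run length) pair."""
    return [(value, len(list(run))) for value, run in groupby(row)]

def rle_decode(pairs):
    """Expand (value, run length) pairs back into the original row."""
    return [value for value, count in pairs for _ in range(count)]

row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 2]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

The saving depends on the data: long runs of identical cells compress well, while a row in which every cell differs from its neighbour grows, because each cell becomes a pair.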