1. GLOWA-Volta Database Workshop
September 5, 2006, ZEF Uni Bonn
Welcome to the 1st GLOWA-Volta Database Workshop
Antonio Rogmann (ZEFc)
Agenda
• Aims of the workshop
• Deficits in the data stocks and data management of the GVP
• Data management
• Lifecycle of data
• Conclusions for the GVP
• Need to integrate the data users into database development
• Role of the disciplines in data management
• Steps toward optimized data management
Aims of the workshop
• Initiation of a dialogue with the GVP members about their requirements for efficient data management. This dialogue is a process in which the following items should be discussed:
• data
• use and access
• database structure
• metadatabase
• web presence
• database team and division of work
Aims of the workshop
These points should be discussed within the working groups as far as possible. In this workshop we focus on the following items:
- data (data flow)
- data use and access
- setting up a "database team" and dividing the work
The technical implementation, the structure and type of the databases, and the ways of access should be developed by a team of department members, computer scientists, and project leaders!
Current deficits in the data stocks and data management of the GVP
The current situation
Data server
• the data stock is incomplete
• searching for data by criteria is not possible
• the arrangement of the data is unclear
• the relation of the data to the project is unclear
• there are no rules for uploading data (location, topic, etc.)
The current situation
Data media
• what is their content?
• to which project/thesis do they belong?
The current situation
Metadatabase
• the representation of the data stock is incomplete
The current situation
Metadatabase
• if you are looking for data, you have to ask colleagues inside and outside of ZEF!
• the contact person may not be available
The current situation
Metadatabase
• broken (dead) links
The current situation
Datasets
• inconsistency
The current situation
Datasets
• lack of data description
• which methods were used?
• are the values correct?
Data management
• To avoid such problems, data management is necessary
• Definition (by the "Data Management Association"):
"Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise"
• Normally, the data management processes should be implemented when a project starts!
Lifecycle of Data and Aspects of its Management
Procurement
- Own investigation
- Own processing
- Supply from other institution/project
Structuring (data modeling) and Storing
- Categorisation
- Sorting
- Description
Administration
- User access rights
- Security
Use and Processing
- Data processing
- Content management
- Quality assurance
- Data preparation (for others)
Distribution
- Access
- Delivery
Disposal
- Update or
- Erasure
Lifecycle of Data: Procurement of Data
can happen from
• own investigations
• other institutions
• other (sub-)projects within the main project
serves
• for providing the operating processes with input data
needs
• certain data sources and formats
• quality
• application interfaces (import)
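The needs listed above (known formats, quality, an import interface) can be illustrated with a small import routine. This is only a sketch under assumptions: the column names, the rainfall example, and the rule that negative or non-numeric values are rejected are hypothetical and not taken from the GVP.

```python
import csv
import io

# Hypothetical agreed format for one kind of input data.
REQUIRED_COLUMNS = {"station", "date", "rainfall_mm"}


def import_rows(text):
    """Parse CSV text and split the rows into accepted and rejected ones."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    accepted, rejected = [], []
    for row in reader:
        try:
            value = float(row["rainfall_mm"])
        except (TypeError, ValueError):
            # Not a number (or the cell is absent): fails the quality check.
            rejected.append(row)
            continue
        # Assumed plausibility rule: rainfall cannot be negative.
        (accepted if value >= 0 else rejected).append(row)
    return accepted, rejected


sample = (
    "station,date,rainfall_mm\n"
    "S1,2006-09-01,12.5\n"
    "S1,2006-09-02,-4\n"
    "S1,2006-09-03,n/a\n"
)
accepted, rejected = import_rows(sample)
```

Rejected rows are kept rather than dropped so that the data producer can be asked about them.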
Lifecycle of Data: Structuring and Storing
means
• sorting of data related to a classification schema
• by themes
• by projects/subprojects
• by formats
• by applications
• by spatial research area
• .....
Lifecycle of Data: Structuring and Storing
and/or
• by a conceptual data model
• it contains the data entities and their relationships within the scope of a system
• the entities have properties (attributes)
• it is independent of the storage in a database and of other technical requirements
• it can be designed in different forms (relational, network, hierarchical)
• the target system for data storage can be a relational database as well as a file system
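A conceptual model of this kind can be written down without reference to any database. The following Python sketch is an illustration only: the two entities, their attributes, and the example dataset are hypothetical; only the subproject code E1 and its title appear in the workshop material.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Entity: a dataset, described by its attributes."""
    name: str
    theme: str
    file_format: str


@dataclass
class Project:
    """Entity: a (sub)project."""
    code: str
    title: str
    # Relationship: one project produces many datasets.
    datasets: List[Dataset] = field(default_factory=list)

    def produce(self, dataset: Dataset) -> None:
        """Record that this project produced the given dataset."""
        self.datasets.append(dataset)


e1 = Project(code="E1", title="Automated Classification of Remotely Sensed Imagery")
# Hypothetical dataset, for illustration only.
e1.produce(Dataset(name="landcover_2005", theme="land use", file_format="GeoTIFF"))
themes = sorted({d.theme for d in e1.datasets})
```

Nothing here says whether the data ends up in a relational database or in a file system; that decision belongs to the physical model.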
Lifecycle of Data: Structuring and Storing
serves for
• easy search, find and use of data
needs
• consensus among data producers and users
within an organization about
• conceptual data model
• data needed and not needed
• rules about data updating and archival storage
• standards for metadata-content
• control of compliance with the structuring criteria
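Once a metadata standard has been agreed, compliance can be checked automatically. The sketch below assumes a hypothetical list of required fields; the actual list would have to come from the consensus among data producers and users.

```python
# Hypothetical metadata standard: fields every record must fill in.
REQUIRED_FIELDS = ("title", "project", "theme", "format", "contact", "method")


def missing_metadata(record):
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical record with an empty contact and no method description.
record = {
    "title": "Household survey 2005",
    "project": "E4",
    "theme": "socio-economics",
    "format": "SPSS",
    "contact": "",
}
problems = missing_metadata(record)
```

A check like this could run on every upload, so that incomplete descriptions are caught while the data producer is still available.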
Lifecycle of Data: Structuring and Storing
means
• the physical storing of data
needs
• storage places for the databases (central/distributed)
• physical data model
• derived from the conceptual/logical data model
• takes into account the facilities and constraints of a given
database management system
• database management system with
• interfaces for applications
• query and search services
• backup and security functions
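To make the step from a conceptual to a physical model concrete, the sketch below uses SQLite through Python's standard `sqlite3` module. The choice of SQLite, the table and column names, and the sample rows are assumptions for illustration; the slides do not name a database management system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE project (
    code  TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE dataset (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    theme       TEXT NOT NULL,
    file_format TEXT NOT NULL,
    project     TEXT NOT NULL REFERENCES project(code)
);
CREATE INDEX idx_dataset_theme ON dataset(theme);
""")

# Hypothetical sample content.
conn.execute("INSERT INTO project VALUES (?, ?)", ("E4", "Household survey"))
conn.executemany(
    "INSERT INTO dataset (name, theme, file_format, project) VALUES (?, ?, ?, ?)",
    [
        ("survey_2005", "socio-economics", "SPSS", "E4"),
        ("survey_codes", "socio-economics", "CSV", "E4"),
    ],
)

# Query service: search the data stock by criteria.
rows = conn.execute(
    "SELECT name FROM dataset WHERE theme = ? AND file_format = ? ORDER BY name",
    ("socio-economics", "SPSS"),
).fetchall()
```

The index and the foreign-key pragma are examples of the "facilities and constraints" of one particular system; another database management system would lead to a different physical model for the same conceptual model.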
Lifecycle of data: Administration
means
• on the technical side
• installation and maintenance of the database system
(database + database management system)
• user access constraints (rights)
• backup and archiving tasks
• security
• performance
• on the content side
• integrity: verifying or helping to verify
• control of data delivery
• control of data input
• metadata
Lifecycle of data: Administration
needs
• cooperation between data producers/users and administrators for
• maintaining and upgrading the database (schema)
• defining the authorization concept for database access (read only, read/write, database schema modification, etc.)
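An authorization concept with these three levels can be expressed as a small permission table. The role names below are hypothetical; the actual roles would be defined jointly by data producers, users, and administrators.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()
    WRITE = auto()
    MODIFY_SCHEMA = auto()


# Hypothetical roles mapped to the access levels named on the slide.
ROLES = {
    "guest": Permission.READ,
    "project_member": Permission.READ | Permission.WRITE,
    "administrator": Permission.READ | Permission.WRITE | Permission.MODIFY_SCHEMA,
}


def is_allowed(role, needed):
    """Return True if the role holds every permission in `needed`."""
    granted = ROLES.get(role, Permission(0))  # unknown roles get no rights
    return (granted & needed) == needed
```

For example, `is_allowed("guest", Permission.WRITE)` is false, while `is_allowed("project_member", Permission.READ | Permission.WRITE)` is true.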
Lifecycle of data: Use and Processing
means
• use of data for analysis
• processing of data inside and outside of models
• production of new or modified (output-)data
• control of data accuracy
• preparation of data for other processes/projects
Lifecycle of data: Distribution
means
• delivery of data
• inside an organization/project
• by storing in a database (accessed by the receiving party)
• by transfer on portable media
• by publishing the metadata
• outside an institution/project
• by direct access to a database
• Web-Services
• publishing the metadata
• data extract service from a database
• data downloads
• Map Services (geodata)
Lifecycle of data: Distribution
serves
• inside an organization/project
• for providing work processes with adjusted data
• outside an organization/project
• for providing work processes with adjusted data
• for providing data for public information about the projects
needs
• knowledge of what the recipients require concerning
• further use of data
• formats
• clients
• ...
Lifecycle of data: Disposal
means
• updating the data
• selecting and deleting or archiving data that is
• out of date
• no longer in use
serves
• to prevent data overflow in the databases
• to maintain the quality of the data
needs
• cooperation between the data producers/users and the
database administrator
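Selecting disposal candidates can be partly automated, with the final decision left to data producers and the administrator. The sketch below is an assumption throughout: the catalog fields `superseded` and `last_used`, the two-year idle limit, and the sample entries are all hypothetical.

```python
from datetime import date, timedelta


def disposal_candidates(datasets, today, max_idle_days=730):
    """Return names of datasets that are out of date or unused for too long."""
    limit = today - timedelta(days=max_idle_days)
    return [
        d["name"]
        for d in datasets
        if d["superseded"] or d["last_used"] < limit
    ]


# Hypothetical catalog entries.
catalog = [
    {"name": "rainfall_v1", "superseded": True, "last_used": date(2006, 6, 1)},
    {"name": "rainfall_v2", "superseded": False, "last_used": date(2006, 8, 1)},
    {"name": "soil_pilot", "superseded": False, "last_used": date(2003, 1, 15)},
]
to_review = disposal_candidates(catalog, today=date(2006, 9, 5))
```

The function only proposes candidates for review; whether a dataset is updated, archived, or erased remains a joint decision.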
Conclusions for the GVP
• Conditions
• GVP is divided into a range of projects and subprojects
• e.g. in Phase II "Land Use" with subprojects L1, L2, etc.
• e.g. in Phase III "Analysis of Long-Term Environmental Change" with the subprojects E1, E2, etc.
• with their own processing, models, input and output data (formats), data flows, and data storage
• with specific integrations and dependencies among each other and within "use case" frameworks
• Projects and their models are also provided with data from different scientific disciplines such as Hydrology, Pedology, Social Economy, Ecology, etc.
Conclusions for the GVP
• Conditions
• in Phase III the main objective is the "Integration of Phase I and II research results, knowledge, data and tools"*
• in Phase III the DSS will be realized as the GVP's primary output
The subprojects are connected by data flow (transfer).
The data flow should be adjusted to the GVP and DSS requirements. This means there must be transparent management that is centralized and standardized.
* GVP Phase III Proposal, p. 8
Need for integration of the data users during development and setup of GVP data management
• Each researcher (or, on a higher level, each project) is a kind of data manager in his own work space. He has
• his own (local) database
• his own input and output data and data procurement requirements
• his own usage and processing
• his own distribution of data (to other users/projects)
• and therefore his own (short) lifecycle of data
• and is integrated into the data flow between the projects and their lifecycle of data
Data flow
[Figure: data flows between Projects 1 to 4 and the Central Database]
Role of the disciplines in developing data management concepts
Project members ...
• have to decide, together with other project members and the database developers, which data should be stored centrally in order to share it, and which can be stored locally or elsewhere
• have to decide which storage structure is most convenient for optimal use
• have to provide information about their data (create metadata)
and
• are responsible for the data management in their own work area before it is coordinated across disciplines by the database administrator
Role of the disciplines in developing data management concepts
Developers of a database ...
• are responsible for consulting the project members about the requirements of data management
• have to organize the data flow with regard to the (technical) means of data storage and access. The activities must be adjusted to the operating processes/projects and their interfaces
• have to develop the data management standards together with the project members
Steps toward optimized data management
(within this workshop)
My request to you
Step 1: analyze the data stock (data dictionary)
Step 2: analyze the data flows
Step 3: develop the logical data model for data storage
Basis for the working groups
Data flow model combined with data dictionary
Notation:
• Terminator: data producers (data source) or users (data sink) outside the system (external partners, public)
• Process: transformation of input data into output data, e.g. by algorithms
• Data storage unit "A": data pool (not local). Building time differs from using time. "A" refers to the dictionary
• Data flow "a": direction for dataset "a". "a" refers to the dictionary
• Data flow: relay in two directions (processes)
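The notation can also be captured in a small data structure, which makes it possible to check that every data flow is documented in the data dictionary. The sketch below is illustrative only: the dictionary entries, the direction of the two flows, and the undocumented dataset "raw" are hypothetical.

```python
from dataclasses import dataclass

# Data dictionary: one entry per dataset or store (hypothetical content).
DATA_DICTIONARY = {
    "a": "classified land-cover raster",
    "A": "central store of land-cover data",
}


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "terminator", "process" or "store"


@dataclass(frozen=True)
class Flow:
    source: Node
    target: Node
    dataset: str  # key into DATA_DICTIONARY


vendor = Node("Vendor of remote sensing data", "terminator")
e1 = Node("E1", "process")
store = Node("A", "store")

# Assumed flows, for illustration only.
flows = [Flow(vendor, e1, "raw"), Flow(e1, store, "a")]


def undocumented(flows):
    """Return dataset labels used in flows but missing from the dictionary."""
    return sorted({f.dataset for f in flows if f.dataset not in DATA_DICTIONARY})


gaps = undocumented(flows)
```

Here `gaps` lists the flow labels that still need a dictionary entry, which ties Step 1 (data dictionary) to Step 2 (data flows).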
Basis for the working groups
Context diagram
[Figure: context diagram with GLOWA-Volta as the system and the terminators External Partner (two), Decision Makers, and Public]
Basis for the working groups
Diagram 1: GLOWA-Volta
[Figure: data flow diagram with the processes Water Demand and Management, Water Supply and Distribution, DSS, and Analysis of Long-Term Env. Change, and two External Partner terminators]
Basis for the working groups
Diagram 2: Analysis of Long-Term Environmental Change
[Figure: data flow diagram with the elements Vendor of remote sensing data, Automated Classification of Remotely Sensed Imagery (E1), Cellular automata (E2), GVP LUDAS (E3), Land-use Change Predictions and LU Policy, and S1]
Basis for the working groups
Diagram 3: GVP-LUDAS
[Figure: data flow diagram with the working groups natural scientists and social economists, the elements E1, E4, Elicitation Ghana, GVP-LUDAS, and Evaluation of Elicitation Results (Household Survey), the data store A, and the data flows a to g]
To Do
• Please try to draw a general overview of the data flows and stocks
• And relate data management options to the particular data flows or stores
In the afternoon I would like to discuss the requirements for a data management system from your point of view.
Take it all as a form of brainstorming!
Thank you!!
How to organize (sort) the data in the database?
Central Database
Project 1:
- theme 1
  - format 1
  - format 2
- theme 2
....
Formats
- SPSS
  - project 1
  - project 2
- remote sensing
....
Region 1
- Project
  - subproject
    - theme
      - format
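The alternatives above differ only in the order of the classification keys. A minimal sketch, assuming that each dataset carries all keys as metadata (the key names and values are hypothetical), shows how the same record lands in different places depending on the chosen order:

```python
def storage_path(record, order):
    """Build a storage path from the classification keys in the given order."""
    return "/".join(record[key] for key in order)


# Hypothetical dataset description.
record = {
    "region": "region1",
    "project": "project1",
    "subproject": "subproject1",
    "theme": "theme1",
    "format": "SPSS",
}

by_project = storage_path(record, ("project", "theme", "format"))
by_format = storage_path(record, ("format", "project"))
by_region = storage_path(
    record, ("region", "project", "subproject", "theme", "format")
)
```

If the keys are stored as metadata instead of being encoded only in the folder structure, the order becomes a presentation choice, and the data can be browsed by project, format, or region alike.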
Basis for the working groups
Data flow model combined with data dictionary
Notation II:
• Data flow: relay in two directions (processes)
• Data flow: "a" originates from "b" and "c"
• Data flow: updating of data in a storage
• Data flow: division of dataset "a" into datasets "b" and "c"