Download curator database presentation

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Clusterpoint wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Relational model wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Database model wikipedia , lookup

Transcript
“curator” DB design
Curator meeting, GFDL, Sep 20
Why RDBMS

A lot of information:








Model metadata
Experiments metadata
Institution/user metadata
Data metadata
Mostly it’s in textual form
Information is internally linked tightly that can be easy to
express by means of relational databases.
Relational databases have well developed means for searching
and extracting procedures (SQL query language and program
interfaces for any language) as for local as well as for remote
user.
Very reliable, safety technology.
2
Curator meeting, GFDL, Sep 20
Desirable Features of Model Data Factory

Relational Database storing metadata, containing
description of










XML as data exchange format





model components and model configuration
scenarios
postprocessing (model output and CMOR) directives
experiments
variables
formalized rules of Quality Control
data locations
task scheduler
users and groups accounts
for compliance with FRE
working format of existing third party software
good fitted for hierarchical metadata description
prevalent in world, easy to exchange with others Data Portals
Model Builder (FMS Runtime Environment in GFDL)






checks out available model components from DB
chooses model datasets from DB
sets postprocessing directives
checks components and configurations compatibility
builds executable application and runs it
write metadata about experiment into DB (model configuration, scenario, project,
organization/user, postprocessing)
3
Curator meeting, GFDL, Sep 20
Desirable Features of Model Data Factory
(continue)

Climate Model Output Rewriter (CMOR)
subsystem
 prepares data consistently with specific project requirements

Data Publisher
 transfer data to Data Portal storage in accordance to settings from DB

Data Portal Software Package






Configuration Manager (configures Aggregation Server and Data Portal Interface)
Search Catalog Engine
Data Subsampling Engine
Data Computation Engine
Data Visualization
Data Delivery Manager
4
Curator meeting, GFDL, Sep 20
Standard scenario of functioning Model Data Factory
(ideal picture)
 Scientist builds model in FRE using available model components, datasets
and forcing scenario.
 FRE puts metadata about built model, scenario, experiment into “curator”
DB and runs experiment;
 Postprocessing subsystem extracts metadata about postprocessing plan
from “curator” DB and executes it, and on finish puts metadata about
processed experiment back into DB.
 Data Publisher (DP) regularly checks “curator” DB for new experiments
marked as “public” and if finds any invokes CMOR.
 CMOR goes to “curator” DB for metadata and processes needed data
following metadata instructions.
 DP calls QAC and then transfers data to Data Portal storage.
 Configuration Manager configures Aggregation Server and Data Portal
Interface and puts records about new public data in “curator” DB.
 End of process, data is ready to go.
5
Curator meeting, GFDL, Sep 20
Common functionality schema of
‘Model Data Factory’
6
Curator meeting, GFDL, Sep 20
Database ‘curator ’ design
Database Compartments:

Model Metadata Compartment
contains models’ descriptions, allows to build coupled model of needed configuration

Variables Compartment
List of all related physical variables

Workflow Compartment
contains scenarios, experiments, institutions, projects and users info

Postprocessing Compartment
defines postprocessing plan for conducting experiment

Data Portal Compartment
contains info about experiments data
7
Curator meeting, GFDL, Sep 20
MySQL DB CURATOR
8
Curator meeting, GFDL, Sep 20
Model Metadata Compartment
(in development)
Coupled_Models
Component_Medias
Model_List
Models
Workflow Compartment
Experiments
Variables Compartment
Variables
9
Curator meeting, GFDL, Sep 20
Data Samples from Model Compartment
Coupled_Models
Components_Medias
Model_List
Models
10
Curator meeting, GFDL, Sep 20
Variables Compartment
Variables
Variable_Bundles
Variable_Lists
Variable_List_Contents
Workflow Compartment
Proj_Var_Names
Projects
11
Curator meeting, GFDL, Sep 20
Data Sample from Variables Compartment
Proj_Var_Names
Variable_Lists
Variables
Variable_List_Contents
Variable_Bundles
12
Curator meeting, GFDL, Sep 20
Workflow Compartment
(in development)
Institutions
GFDL_USERS
Experiment_Status
Experiments
Projects
Scenarios
Realization
13
Curator meeting, GFDL, Sep 20
Data Samples from Workflow Compartment
Scenarios
Experiments
14
Curator meeting, GFDL, Sep 20
Postprocessing Compartment
Post_Proc
PP_Units
Coupled_Models
Projects
GFDL_USERS
PP_Content
Average_Periods
Variable_Lists
Data Samples from Postprocessing Compartment
PP_Units
PP_Content
15
Curator meeting, GFDL, Sep 20
Data Portal Compartment
Data_Files
Data_Grids
MissedData_Descriptors
Coupled_Models
Variable_Bundles
Experiments
Variables
Curator meeting, GFDL, Sep 20
16
Data Samples from Data Portal Compartments
Data_Files
MissedData_Descriptors
Data_Grids
17
Curator meeting, GFDL, Sep 20
“curator” DB is in use now:


CM2.0
CM2.1
18
Curator meeting, GFDL, Sep 20
Future Development



Bring DB terms to conventional terminology.
Set up model metadata schema standards and create
tables in “curator” DB following this schema.
Fill these tables with real metadata extracted from models
of GFDL, CCSM, MIT and from ESMF Component Database.

Implement tables for observation data metadata.

Implement DODS aggregated data support.

Build XML bridge for XML transcoding DB input/output
19
Curator meeting, GFDL, Sep 20
END
Questions?
Suggestions?
Objections?
Thanks!
20
Curator meeting, GFDL, Sep 20