Earth Cube Discussion 7/29
DSS perspective

Data Life Cycle, ACCI Report, and many others
RDA functions are:
• Document
• Organize
• Protect
• Access
Already enables reuse/repurposing and provides RoI.

DSS Starting Points
• Large and small diverse data collections
  – Diverse in: content, formats, parameter consistency, resolutions, etc.
• Uniform metadata standard managed in a DB framework (GCMD)
  – Facilitates complete cross-RDA discovery
  – Harvest content metadata for most data files
• Provides user-selectable choices => files that satisfy constraints, plus drives “fast/efficient” data extraction

DSS Starting Points
• Large and/or high-impact science data categories
  – TIGGE, Operational NWP Analyses, Seven Reanalyses
• Serve both “big” and “small” science
• Measure re-use for individual RDA entities
• Scalable archiving and curation (RDA-DAMS)
• Broad service environment: web, DAV, HPC
  – Foundation for data-centric CI services
• Leverage CISL resources, DAV, HPC, GLADE, HPSS

DSS Starting Points
ACCI Task Force Report, pg. 24, Data Management Guidelines
“What does ‘good’ look like?”
  – Develop standards for data management policy
CISL RDA knows exactly what good looks like.
• Track record for 40+ years
• Transitioned many IT and service implementations
  – Media migration, servers, networking, metadata, etc.

DSS – what is next?
• ACCI impact (pg. 15): improved NSF-sponsored infrastructure to accelerate scientific start-up and get greater return on investment
  – Less infrastructure required by individual researchers at universities
  – Less duplication of effort
• Dissolve the data format barrier
  – Conversion tool library (everyone receives the formats and resolutions they want)
• Factor in multi-disciplinary metadata vocabulary translation
• Near-immediate user-selected data extraction from TB+ datasets (research to operations)
  – HPC, parallel processing and I/O, fast storage
  – Better support for both “big” and “small” science

DSS – what is next?
• Create an API and web service protocol standard with other climate data centers, in collaboration with industry application developers
• Step beyond NCAR home-grown data portals
• Easier to serve a larger community
• Community-designed applications that draw on RDA assets
• Allows for broader cross-disciplinary scalability
  – Metadata DB interoperability
  – Application-driven interoperability

DSS – what is next?
• Create a generalized archiving tool for initial research data dumps by individual scientists (e.g. raw model output or observations that may not be suitable for long-term curation)
  – Standardized storage structure
  – Basic descriptive information stored in metadata databases
  – Allows others to discover/identify data without the need for “inside knowledge”, i.e. open access
  – Mitigates the problem of orphaned data
  – Could be implemented with cloud storage services
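
The generalized archiving tool described on the last slide could be sketched roughly as follows. This is only an illustration under stated assumptions, not the RDA design: the `<owner>/<project>/<filename>` layout, the `deposits` table, and the function names `storage_path`, `open_catalog`, `register`, and `discover` are all hypothetical, and SQLite stands in for whatever metadata database would actually be used.

```python
import sqlite3
from pathlib import PurePosixPath


def storage_path(owner, project, filename):
    """Build a path in a hypothetical standardized layout: owner/project/filename."""
    return str(PurePosixPath(owner) / project / filename)


def open_catalog(db_path=":memory:"):
    """Open (or create) the metadata catalog; the schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS deposits (
               path        TEXT PRIMARY KEY,
               owner       TEXT NOT NULL,
               title       TEXT NOT NULL,
               description TEXT NOT NULL
           )"""
    )
    return conn


def register(conn, owner, project, filename, title, description):
    """Record basic descriptive information for one deposited file."""
    path = storage_path(owner, project, filename)
    conn.execute(
        "INSERT OR REPLACE INTO deposits VALUES (?, ?, ?, ?)",
        (path, owner, title, description),
    )
    conn.commit()
    return path


def discover(conn, keyword):
    """Find deposits by keyword, with no inside knowledge of the directory layout."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT path FROM deposits "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY path",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = open_catalog()
    register(catalog, "jdoe", "run01", "out.nc",
             "Raw model output", "Unprocessed output from a test run")
    print(discover(catalog, "model"))
```

Keeping the descriptive record separate from the file itself is what lets the storage backend change (local disk, HPSS, or a cloud object store) without affecting discovery, since users search the catalog rather than the directory tree.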