SeaDataNet and EMODNET Vocabularies

Roy Lowry
Adam Leadbetter
British Oceanographic Data Centre
Overview
• Automated parameter aggregation (P35) vocabulary status
• EMODNET chemical filter
• P01 semantic model exposure status
• Management of concept deprecations
P35 Status
• P35 is a vocabulary of parameters for EMODNET chemistry lot products
• EMODNET data parameters are marked up using the P01 vocabulary
• P01 is much finer grained than P35
• Therefore aggregation of P01 parameters into P35 parameters is required
• To date this has been done through a lot of painful manual work in ODV
P35 Status
• However, within NVS it is possible to maintain and serve a mapping between P01 and P35
• Each P35 concept has a URL that resolves to an XML document
• This document can be used to drive automated parameter aggregation by identifying all P01 codes that may be incorporated into a P35 code
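
A minimal sketch of how such a document might drive aggregation. It assumes the P35 concept document is SKOS RDF/XML and that the mapped P01 concepts appear as skos:narrower links; the predicate, the host vocab.example.org and all codes are placeholders for illustration, not taken from NVS.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical stand-in for the XML document a P35 concept URL resolves to.
SAMPLE_P35_DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://vocab.example.org/collection/P35/current/AGG0001/">
    <skos:prefLabel>Example aggregated parameter</skos:prefLabel>
    <skos:narrower rdf:resource="http://vocab.example.org/collection/P01/current/CODEAAAA/"/>
    <skos:narrower rdf:resource="http://vocab.example.org/collection/P01/current/CODEBBBB/"/>
    <skos:narrower rdf:resource="http://vocab.example.org/collection/P06/current/UNITXXXX/"/>
  </skos:Concept>
</rdf:RDF>
""".strip()


def p01_codes_for(p35_xml: str, predicate: str = "narrower") -> set[str]:
    """Return the P01 codes linked from a P35 concept document.

    The mapping predicate is an assumption; pass a different SKOS
    property name if the served document uses another one.
    """
    root = ET.fromstring(p35_xml)
    codes = set()
    for link in root.iter(f"{{{SKOS}}}{predicate}"):
        uri = link.get(f"{{{RDF}}}resource", "")
        parts = uri.rstrip("/").split("/")
        # Keep only links into the P01 collection; the code is the last path segment.
        if "P01" in parts:
            codes.add(parts[-1])
    return codes


if __name__ == "__main__":
    print(sorted(p01_codes_for(SAMPLE_P35_DOC)))
```

An aggregation tool could then treat any data column whose P01 code is in the returned set as a contributor to the P35 parameter.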
P35 Status
• P35 presents design issues
  • P35 granularity (e.g. should there be separate products for unfiltered and filtered samples?)
  • Which P01 terms should map to a given P35 term?
• Design issues need governance: domain experts who can make these decisions
• Governance is now established, based on experts from EMODNET partners communicating by list-server email and Webex
P35 Status
• Example P35 concept (ITS90 temperature) set up in October
• Just over 100 additional entries considered by governance and loaded this month
• These cover
  • Salinity
  • Dissolved oxygen
  • Nutrients
  • Metals in the water column
• Next target is metals in sediments and biota (900-1000)
• P35 could easily reach several thousand entries
EMODNET Chemical Filter
• Need to consider what is required here
• One approach is to specify a list of P02 codes that cover the themes included in the EU legislation
• This comes with risks
  • Some data outside the intended scope will be captured (e.g. methylated arsenic in a trawl designed for organotins)
  • Easy to overlook consequences of any P02 rationalisation
• P02 list can be tested against P35 (both map to P01)
EMODNET Chemical Filter
• Alternative approach
  • Capture P01 codes through data mining
  • Translate P35 into a list of P01 codes
  • Do the chemical filter on the basis of P01 rather than P02
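
A minimal sketch of the P01-based filter, and of testing a P02 list against P35 through their shared P01 mappings. Every code and mapping below is a made-up placeholder, and the function names are hypothetical.

```python
# Placeholder mappings for illustration only; real ones would come from NVS.
P35_TO_P01 = {
    "AGG_A": {"P01_1", "P01_2"},
    "AGG_B": {"P01_3"},
}
P02_TO_P01 = {
    "GRP_X": {"P01_1", "P01_2", "P01_9"},
    "GRP_Y": {"P01_3"},
}


def p01_filter(p35_to_p01: dict[str, set[str]]) -> set[str]:
    """Translate P35 into the flat set of P01 codes it covers."""
    return set().union(*p35_to_p01.values())


def in_scope(record_codes: list[str], allowed: set[str]) -> list[str]:
    """Apply the chemical filter on the basis of P01 codes."""
    return [code for code in record_codes if code in allowed]


def compare_p02_with_p35(p02_codes, p02_to_p01, p35_to_p01):
    """Test a candidate P02 list against P35 via the shared P01 mapping."""
    via_p02 = set().union(*(p02_to_p01[c] for c in p02_codes))
    via_p35 = p01_filter(p35_to_p01)
    return {
        "out_of_scope": via_p02 - via_p35,  # captured by P02 but not wanted by P35
        "missed": via_p35 - via_p02,        # wanted by P35 but not reached by P02
    }


if __name__ == "__main__":
    print(in_scope(["P01_1", "P01_9"], p01_filter(P35_TO_P01)))
    print(compare_p02_with_p35(["GRP_X"], P02_TO_P01, P35_TO_P01))
```

The "out_of_scope" set corresponds to the risk noted for the P02 approach: data outside the intended scope being captured.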
P01 Model Exposure Status
• Both ODIP and EMODNET require access to the factored semantic model that underpins P01
• Strong pressure from ODIP (primarily Simon Cox) for this to be delivered through RDF-XML
• For this, every element of the factorisation requires a URI
• This requires every element to be covered by a controlled vocabulary
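
A minimal sketch of the constraint, assuming a factored P01 concept can be represented as a list of elements that each either have a controlled-vocabulary URI or do not yet. The Facet class, the role names, the labels and the example.org URIs are illustrative assumptions, not the actual P01 model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Facet:
    role: str            # hypothetical role name within the factorisation
    label: str           # human-readable text of the element
    uri: Optional[str]   # None until a controlled vocabulary covers it


def facets_without_uri(facets: list[Facet]) -> list[str]:
    """Roles that still block RDF-XML delivery because they lack a URI."""
    return [f.role for f in facets if not f.uri]


# Placeholder factorisation of one concept; labels and URIs are invented.
example = [
    Facet("biological_entity", "example taxon", "http://vocab.example.org/collection/S25/current/BE000001/"),
    Facet("parameter_matrix_relationship", "per unit volume of the", "http://vocab.example.org/collection/S02/current/S001/"),
    Facet("matrix", "example matrix phrase", None),
    Facet("parameter", "Concentration", None),
]

if __name__ == "__main__":
    print(facets_without_uri(example))
```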
P01 Model Exposure Status
• The biological entity in the factorisation is already covered (S25 vocabulary)
• Parameter-matrix relationship already covered (S02 vocabulary)
• Currently working on the matrix entity
  • Concepts like 'water body particulate >0.2um phase'
  • Taking longer than expected (part-time working, EMODNET, IMOS vocabulary demands, past misdemeanours)
  • But getting very close
• Then we just need the parameter entity
Concept Deprecation
• Many SeaDataNet vocabularies have evolved, with concepts added to satisfy specific demands
• Governance explicitly prohibits deletion
• This leads to issues
  • Unintended duplicates
    • Cause confusion
    • Unnecessarily complicate aggregation
  • Variable granularity
    • Discovery made more difficult (too many terms)
    • Patchy domain coverage
    • Unnecessarily complicates metadata markup
Concept Deprecation
• NVS 1.0 handled deprecation poorly (URI changed)
• Issues addressed in NVS 2.0
• All payload documents include
  • skos:note element set to 'accepted' or 'deprecated'
  • owl:deprecated boolean element
• Deprecated concept documents also have a dc:isReplacedBy element
• Full controlled vocabulary requests may be designated 'accepted', 'deprecated' or 'all' (default)
Concept Deprecation
• Concept deprecation causes issues for the SeaDataNet architecture
  • Deprecated concepts contained in SeaDataNet metadatabases
  • Deprecated concepts in SeaDataNet filestock
• Consequently, much-needed vocabulary improvements (P03, P02) are held back due to concern about the consequences
Concept Deprecation
• The following deprecation support is needed:
  • Deprecation handling within the SeaDataNet vocabulary client, which could either
    • Only display accepted concepts (easy to implement)
    • Flag the deprecated concepts (more work but a better result)
  • Automatic parameter substitution in metadatabase file and data file import tools
  • Metadatabase sweepers (run regularly to clean up any concepts that have been deprecated since ingestion)
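
A minimal sketch of the three kinds of support listed above, assuming the deprecated-to-replacement pairs have already been harvested into a plain dictionary. The record layout, function names and codes are hypothetical, not part of any SeaDataNet tool.

```python
def resolve(code: str, replaced_by: dict[str, str]) -> str:
    """Follow replacement links until an accepted code is reached."""
    seen = set()
    while code in replaced_by:
        if code in seen:
            raise ValueError(f"replacement cycle involving {code}")
        seen.add(code)
        code = replaced_by[code]
    return code


def sweep(records: list[dict], replaced_by: dict[str, str]) -> int:
    """Substitute deprecated parameter codes in place; return how many records changed.

    The same substitution could be applied by an import tool before ingestion.
    """
    changed = 0
    for record in records:
        current = resolve(record["parameter"], replaced_by)
        if current != record["parameter"]:
            record["parameter"] = current
            changed += 1
    return changed


def client_view(codes: list[str], deprecated: set[str], mode: str = "flag") -> list[str]:
    """Either hide deprecated concepts ('hide') or label them ('flag')."""
    if mode == "hide":
        return [c for c in codes if c not in deprecated]
    return [f"{c} (deprecated)" if c in deprecated else c for c in codes]


if __name__ == "__main__":
    replaced_by = {"OLD1": "OLD2", "OLD2": "NEW"}  # placeholder codes
    records = [{"id": 1, "parameter": "OLD1"}, {"id": 2, "parameter": "KEEP"}]
    print(sweep(records, replaced_by), records)
    print(client_view(["NEW", "OLD1"], {"OLD1", "OLD2"}))
```

Following the replacement chain to its end matters when a replacement concept has itself been deprecated since the record was ingested.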