Linked Data in SDI
Simon Cox (1), Sven Schade (1) and Clemens Portele (2)
(1) European Commission - Joint Research Centre,
Institute for Environment and Sustainability, Ispra, Italy
{sven.schade, simon.cox}@jrc.ec.europa.eu
(2) interactive instruments, Bonn, Germany
[email protected]

Spatial Data Infrastructures (SDIs) such as INSPIRE are being planned and
built using discovery, access and processing components based on a services model (INSPIRE 2008). While the principle of distribution and delegation using the internet is a major step forward from traditional data warehouses and private collections, the query-oriented interaction paradigm is
merely evolutionary compared with traditional access systems designed for
expert users.
In contrast, the success and scalability of the World Wide Web have been
based on hypertext, in which browsing is the key mode of interaction,
supported by Uniform Resource Identifiers (URIs) (W3C 2001). Linked Data
has been proposed as the bridge from the browsable web to the deep web
of technical data (Bizer 2009). Linked Data still relies on web pages
(usually HTML) for user interaction, but is supported by the Resource
Description Framework (RDF) (Lassila 1999) for richer link semantics. The
graph structure that underlies RDF can be encoded in multiple ways, of
which the XML encoding (RDF/XML) is the most common. Links can resolve to
datasets in legacy file formats, which thus serve as leaf nodes, but can
also be part of the web of resources.
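
The point that one RDF graph can be written down in more than one encoding can be illustrated with a minimal sketch. The resource and property URIs under example.org are made up for illustration, and the two serializers below are simplified assumptions rather than complete implementations of N-Triples or RDF/XML (they handle URI-valued objects only).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical graph: a single triple linking two made-up resources.
TRIPLES = [
    ("http://example.org/id/river/1",
     "http://example.org/def/flowsInto",
     "http://example.org/id/lake/7"),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def to_rdfxml(triples):
    """Serialize the same triples as RDF/XML, one rdf:Description per triple."""
    ET.register_namespace("rdf", RDF_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for s, p, o in triples:
        desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                             {f"{{{RDF_NS}}}about": s})
        # Split the predicate URI into namespace and local name
        # at the last '/' or '#'.
        cut = max(p.rfind("/"), p.rfind("#")) + 1
        namespace, local_name = p[:cut], p[cut:]
        ET.SubElement(desc, f"{{{namespace}}}{local_name}",
                      {f"{{{RDF_NS}}}resource": o})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    print(to_rdfxml(TRIPLES))
```

Both outputs describe the same subject, predicate and object; only the surface syntax differs.
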
Key standards used in SDI were designed on Linked Data principles, even
before the name was coined. For instance, Geography Markup Language
(GML) (OGC 2007) is essentially an RDF/XML application. Thus, in
principle, SDIs should integrate seamlessly into the web of linked data
(Schade 2010, submitted). There are, however, a number of issues to
consider or resolve in order to bring this about:


- While GML is almost identical to RDF/XML, the deviations (xlink:href
  instead of rdf:resource; gml:identifier instead of rdf:about) are a
  barrier to use by mainstream web clients, including crawlers and
  indexers. In order to be integrated into the broader web, SDI services
  must make various representations of their (geospatial) resources
  available to clients that require them, including HTML, KML, GeoRSS,
  and Geo SiteMap. In most cases this can be accomplished by relatively
  simple transformations and aggregations of GML.
- A similar story applies to metadata. The SDI community has paid a
  great deal of attention to the preparation of rich, provider-driven
  metadata for both geospatial data and services. Again, INSPIRE serves
  as one example (INSPIRE 2008b). However, the benefit of this metadata
  in the context of the mainstream web is undermined by the use of
  arcane formats with almost no client applications. Nevertheless, the
  formalization is compatible with RDF, so making alternative
  representations available should be straightforward.
- While GML supports linking to property values and between features
  (footnote 1), few spatial data sets and, as a result, few applications
  generating GML make use of these links. De-normalized data in
  standalone records is the most common form. Without leveraging the
  linking capability, geospatial data and metadata will not be any more
  useful than other opaque pre-web file formats.
- A prerequisite for routine use of links is stable identifiers for
  items within datasets, to be used as link targets in normalized
  representations. While most statutory data custodians have identifier
  schemes for feature instances, these are not generally mapped into the
  URI syntax required by the web environment. Furthermore, other key
  resources that should logically be referenced by links (definitions,
  reference systems, organizations, etc.) often do not have
  well-governed identifiers at all.
- Link traversal in the web environment relies on the HTTP URI scheme
  (W3C 2001). SDI standards have until now used a mixture of HTTP URIs
  and other URI types, in particular URNs. For compatibility with the
  broader web, and in particular to leverage the Domain Name System
  (DNS) (Mockapetris 1987) and HTTP instead of having to create and
  maintain a separate resolver system, SDI deployments must move to
  ubiquitous use of HTTP URIs for resource identification.
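
The "relatively simple transformations" mentioned in the first item can be sketched for the two deviations named there. The sample feature, its ex: namespace and all example.org URIs are hypothetical, and the function is an illustrative assumption about how such a rewrite might look, not a complete GML-to-RDF/XML converter (it does not, for example, wrap the result in an rdf:RDF element).

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
GML = "http://www.opengis.net/gml/3.2"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical feature; the ex: vocabulary and identifiers are made up.
SAMPLE = f"""
<ex:River xmlns:ex="http://example.org/def/"
          xmlns:gml="{GML}" xmlns:xlink="{XLINK}">
  <gml:identifier codeSpace="http://example.org/id/">http://example.org/id/river/1</gml:identifier>
  <ex:flowsInto xlink:href="http://example.org/id/lake/7"/>
</ex:River>
"""


def gml_to_rdf_style(xml_text):
    """Rewrite GML linking constructs into their RDF/XML counterparts."""
    root = ET.fromstring(xml_text)
    # Iterate over a snapshot, because the loop removes child elements.
    for elem in list(root.iter()):
        # xlink:href on a property element becomes rdf:resource.
        href = elem.attrib.pop(f"{{{XLINK}}}href", None)
        if href is not None:
            elem.set(f"{{{RDF}}}resource", href)
        # A gml:identifier child becomes rdf:about on its parent element.
        ident = elem.find(f"{{{GML}}}identifier")
        if ident is not None and ident.text:
            elem.set(f"{{{RDF}}}about", ident.text.strip())
            elem.remove(ident)
    return root


if __name__ == "__main__":
    ET.register_namespace("rdf", RDF)
    print(ET.tostring(gml_to_rdf_style(SAMPLE), encoding="unicode"))
```

The rewrite touches only attributes and one child element, which is consistent with the claim that the deviations between the two encodings are small.
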



1 We follow the ISO definition of a ‘feature’ as an abstraction of real-world phenomena.

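
The move from URNs to HTTP URIs raised in the list above can be sketched as a mechanical mapping. The rule used here (drop the "urn" scheme and the namespace identifier, join the remaining components with "/", and write an empty component as "0") and the base address http://example.org are assumptions chosen for illustration; an actual deployment would follow whatever URI policy its governing body defines.

```python
from urllib.parse import quote


def urn_to_http_uri(urn, base="http://example.org"):
    """Map a colon-delimited URN onto an HTTP URI under `base`.

    Illustrative rule only: the 'urn' scheme and the namespace
    identifier are dropped, the remaining components become path
    segments, and an empty component is written as '0'.
    """
    parts = urn.split(":")
    if len(parts) < 3 or parts[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    # Percent-encode each component so it is safe as a path segment.
    segments = [quote(p, safe="") if p else "0" for p in parts[2:]]
    return base.rstrip("/") + "/" + "/".join(segments)


if __name__ == "__main__":
    print(urn_to_http_uri("urn:ogc:def:crs:EPSG::4326"))
```

Because the result is an ordinary HTTP URI, a client can dereference it through DNS and HTTP without any URN-specific resolver.
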
Making SDI resources compatible with the mainstream web using linked
data does not mean abandoning specialized geospatial processing
applications where they are required. Rather, it augments them, taking
advantage of tools and audiences coming from the mainstream. In turn, the
head start that we have through our early adoption of Linked Data
principles in GML positions the SDI community as exemplary amongst
technical data custodians. But we must be willing to adapt our SDI
architectures and technologies to harmonize with the mainstream.
Currently, we address the listed issues bottom-up, by promoting the use of
HTTP URIs in the context of GML and by highlighting the importance of
URI governance. Furthermore, we plan to address the issue of multiple geospatial data representations using the technique of content negotiation
(Holtman 1998).
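
How content negotiation could select among several representations of one geospatial resource can be sketched as follows. This is plain server-driven negotiation on the HTTP Accept header, which is simpler than the transparent negotiation described by Holtman (1998). The list of available media types is an example, and the matching is simplified: for each representation it takes the highest q-value among all matching patterns rather than the most specific one.

```python
# Example set of representations a service might offer for one resource.
AVAILABLE = [
    "text/html",
    "application/gml+xml",
    "application/rdf+xml",
    "application/vnd.google-earth.kml+xml",
]


def parse_accept(header):
    """Parse an HTTP Accept header into (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((fields[0].lower(), q))
    return prefs


def matches(media_range, media_type):
    """Check whether a media range such as 'text/*' covers a media type."""
    if media_range in ("*/*", media_type):
        return True
    range_type, _, range_subtype = media_range.partition("/")
    return range_subtype == "*" and media_type.startswith(range_type + "/")


def negotiate(header, available=AVAILABLE):
    """Return the available media type the client prefers, or None."""
    prefs = parse_accept(header)
    best, best_q = None, 0.0
    for media_type in available:
        q = max((q for rng, q in prefs if matches(rng, media_type)),
                default=0.0)
        if q > best_q:  # ties keep the earlier entry in `available`
            best, best_q = media_type, q
    return best


if __name__ == "__main__":
    print(negotiate("application/rdf+xml;q=0.9, text/html;q=0.8, */*;q=0.1"))
```

With this approach a browser and an RDF-aware crawler can request the same HTTP URI and each receive the representation it can process.
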
References
Bizer, C., The Emerging Web of Linked Data. IEEE Intelligent Systems, 24(5): 87-92, 2009.
Holtman, K. and A. Mutz, Transparent Content Negotiation in HTTP. Internet Engineering Task Force (IETF) Memo – RFC 2295, 1998.
INSPIRE, Network Services Architecture – Version 3.0. 2008.
INSPIRE, INSPIRE Metadata Regulation. 2008b.
Lassila, O. and R. Swick, Resource Description Framework (RDF) Model and Syntax Specification. Retrieved (01.02.2010) from http://www.w3.org/TR/PR-rdf-syntax/, 1999.
Mockapetris, P., Domain Names – Implementation and Specification. Retrieved (01.02.2010) from http://tools.ietf.org/html/rfc1035, 1987.
OGC, OpenGIS Geography Markup Language (GML) Encoding Standard Version 3.2.1. The Open Geospatial Consortium, 2007.
Schade, S. and S. Cox, Linked Data in SDI or How GML is not about Trees. Submitted, 2010.
W3C, URIs, URLs, and URNs: Clarifications and Recommendations 1.0. Retrieved (01.02.2010) from http://www.w3.org/TR/uri-clarification/, 2001.