Linked Data in SDI

Simon Cox¹, Sven Schade¹ and Clemens Portele²

¹ European Commission - Joint Research Centre, Institute for Environment and Sustainability, Ispra, Italy
{sven.schade, simon.cox}@jrc.ec.europa.eu
² interactive instruments, Bonn, Germany
[email protected]

Spatial Data Infrastructures (SDIs) such as INSPIRE are being planned and built using discovery, access and processing components based on a services model (INSPIRE 2008). While the principle of distribution and delegation using the internet is a major step forward from traditional data warehouses and private collections, the query-oriented interaction paradigm is merely evolutionary compared with traditional access systems designed for expert users. In contrast, the success and scalability of the World Wide Web has been based on hypertext, in which browsing is the key mode of interaction, supported by Uniform Resource Identifiers (URIs) (W3C 2001).

Linked Data has been proposed as the bridge from the browsable web to the deep web of technical data (Bizer 2009). Linked Data is still based on web pages (usually HTML) for user interaction, but is supported by the Resource Description Framework (RDF) (Lassilia 1999) for richer link semantics. The graph structure that underlies RDF can be encoded in multiple ways, of which the XML encoding (RDF/XML) is the most common. Links can resolve to datasets in legacy file formats, which then serve as leaf nodes, or to resources that are themselves part of the web of resources. Key standards used in SDI were designed on Linked Data principles, even before the name was coined. For instance, Geography Markup Language (GML) (OGC 2007) is essentially an RDF/XML application. Thus, in principle, SDIs should integrate seamlessly into the web of linked data (Schade submitted, 2010).
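The structural closeness of GML to RDF can be illustrated with a short sketch. The Python below reads a single GML-style feature and emits N-Triples-like lines, taking the gml:identifier value as the subject and each xlink:href as a link to another resource. The feature, the `ex` namespace, all example.org URIs and the function names are hypothetical and chosen only for illustration; this is a minimal sketch under those assumptions, not a complete GML-to-RDF mapping.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"
XLINK_NS = "http://www.w3.org/1999/xlink"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical feature: the "ex" schema and all example.org URIs are invented.
SAMPLE = """
<ex:Station xmlns:ex="http://example.org/schema#"
            xmlns:gml="http://www.opengis.net/gml/3.2"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            gml:id="st42">
  <gml:identifier codeSpace="http://example.org/id/">http://example.org/id/station/42</gml:identifier>
  <ex:name>Ispra</ex:name>
  <ex:operator xlink:href="http://example.org/id/org/jrc"/>
</ex:Station>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if not tag.startswith("{"):
        return "", tag
    namespace, _, local = tag[1:].partition("}")
    return namespace, local


def feature_to_ntriples(xml_text):
    """Return N-Triples-like lines for one feature (literal escaping omitted)."""
    feature = ET.fromstring(xml_text)
    identifier_tag = f"{{{GML_NS}}}identifier"
    # Assumption: gml:identifier holds an HTTP URI and plays the role of rdf:about.
    subject = feature.find(identifier_tag).text.strip()
    namespace, local = split_tag(feature.tag)
    lines = [f"<{subject}> <{RDF_TYPE}> <{namespace}{local}> ."]
    for prop in feature:
        if prop.tag == identifier_tag:
            continue
        namespace, local = split_tag(prop.tag)
        predicate = namespace + local
        href = prop.get(f"{{{XLINK_NS}}}href")
        if href is not None:
            # xlink:href plays the role of rdf:resource: the object is a resource.
            lines.append(f"<{subject}> <{predicate}> <{href}> .")
        else:
            # Inline text content becomes a plain literal.
            text = (prop.text or "").strip()
            lines.append(f'<{subject}> <{predicate}> "{text}" .')
    return lines


for line in feature_to_ntriples(SAMPLE):
    print(line)
```

Each child element of the feature becomes a predicate, which mirrors the alternating object/property pattern shared by GML and RDF/XML. Nested features and geometry values are deliberately left out of the sketch.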
There are, however, a number of issues to consider or resolve in order to bring this about:

While GML is almost identical to RDF/XML, the deviations (xlink:href instead of rdf:resource; gml:identifier instead of rdf:about) are a barrier to use by mainstream web clients, including crawlers and indexers.

In order to be integrated into the broader web, SDI services must make various representations of their (geospatial) resources available for clients that require them, including HTML, KML, GeoRSS, and Geo SiteMap. In most cases this can be accomplished by relatively simple transformations and aggregations of GML.

A similar story applies to metadata. The SDI community has paid a great deal of attention to the preparation of rich, provider-driven metadata for both geospatial data and services. INSPIRE again serves as an example (INSPIRE 2008b). However, the benefit of this metadata in the context of the mainstream web is undermined by the use of arcane formats that have almost no client applications. Nevertheless, the formalization is compatible with RDF, so making alternative representations available should be straightforward.

While GML supports linking to property values and between features¹, few spatial data sets and, as a result, few applications generating GML make use of these links. De-normalized data in standalone records is the most common form. Without leveraging the linking capability, geospatial data and metadata will not be any more useful than other opaque pre-web file formats.

A prerequisite for routine use of links is the availability of stable identifiers for items within datasets, to be used as link targets in normalized representations. While most statutory data custodians have identifier schemes for feature instances, these are not generally mapped into the URI syntax required by the web environment. Furthermore, other key resources that should logically be link targets (definitions, reference systems, organizations, etc.) often do not have well-governed identifiers at all.
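Mapping an existing custodian identifier into URI syntax can be sketched as follows. The base address `http://example.org/id`, the `base/type/local-id` pattern and the function name `mint_uri` are assumptions made for illustration only; the text does not prescribe any particular URI pattern, and a real deployment would settle this through its own URI governance.

```python
from urllib.parse import quote

# Hypothetical base address under the control of the data custodian.
BASE = "http://example.org/id"


def mint_uri(feature_type, local_id, base=BASE):
    """Build an HTTP URI from a feature type and a custodian's local identifier.

    Both parts are percent-encoded in full (safe=""), so characters such as
    "/" or spaces inside a local identifier cannot alter the path structure.
    """
    if not feature_type or not local_id:
        raise ValueError("feature_type and local_id must be non-empty")
    return "/".join([
        base.rstrip("/"),
        quote(feature_type.lower(), safe=""),
        quote(local_id, safe=""),
    ])


print(mint_uri("Parcel", "AB/123 45"))
# prints http://example.org/id/parcel/AB%2F123%2045
```

Because the mapping is deterministic, the same local identifier always yields the same URI, which is what allows it to serve as a stable link target.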
Link traversal in the web environment relies on the HTTP URI scheme (W3C 2001). SDI standards have until now used a mixture of HTTP URIs and other URI types, in particular URNs. For compatibility with the broader web, and in particular to leverage the Domain Name System (DNS) and HTTP instead of having to create and maintain a separate resolver system (Mockapetris 1987), SDI deployments must move to ubiquitous use of HTTP URIs for resource identification.

Making SDI resources compatible with the mainstream web using linked data does not mean abandoning specialized geospatial processing applications where they are required. Rather, it augments them in order to take advantage of tools and audiences coming from the mainstream. In turn, the head start that we have through our early adoption of Linked Data principles in GML positions the SDI community as exemplary amongst technical data custodians. But we must be willing to adapt our SDI architectures and technologies to harmonize with the mainstream.

Currently, we address the listed issues bottom-up, by promoting the use of HTTP URIs in the context of GML and by highlighting the importance of URI governance. Furthermore, we plan to address the issue of multiple geospatial data representations using the technique of content negotiation (Holtman 1998).

¹ We follow the ISO definition of a 'feature' as an abstraction of real-world phenomena.

References

Bizer, C., The Emerging Web of Linked Data. IEEE Intelligent Systems, 24(5): 87-92, 2009.
Holtman, K. and A. Mutz, Transparent Content Negotiation in HTTP. Internet Engineering Task Force (IETF) Memo – RFC 2295, 1998.
INSPIRE, Network Services Architecture – Version 3.0. 2008.
INSPIRE, INSPIRE Metadata Regulation. 2008b.
Lassilia, O. and R. Swick, Resource Description Framework (RDF) Model and Syntax Specification. Retrieved (01.02.2010) from http://www.w3.org/TR/PR-rdf-syntax/, 1999.
Mockapetris, P., Domain Names – Implementation and Specification. Retrieved (01.02.2010) from http://tools.ietf.org/html/rfc1035, 1987.
OGC, OpenGIS Geography Markup Language (GML) Encoding Standard Version 3.2.1. The Open Geospatial Consortium, 2007.
Schade, S. and S. Cox, Linked Data in SDI or How GML is not about Trees. Submitted, 2010.
W3C, URIs, URLs, and URNs: Clarifications and Recommendations 1.0. Retrieved (01.02.2010) from http://www.w3.org/TR/uri-clarification/, 2001.