Approaches To The Validation Of Dublin Core Metadata Embedded In (X)HTML Documents

Background

Dublin Core is the name given to a standard set of core metadata elements used for resource discovery. Metadata has an important role to play in many digital library applications, and the Dublin Core standard has been widely adopted by them.

The Problem

Lack of compliance with standards is a well-known problem in Web applications, particularly with HTML. Although a range of HTML validation tools is available, they do not appear to be widely used, and many Web authors appear to check their documents simply by viewing them in Web browsers. There is a danger that Dublin Core metadata embedded in HTML will fail to comply with standards, a possibility made more likely by the lack of a visual display of Dublin Core metadata.

A centre of expertise in digital information management

A Simple Approach To Validation

Use of DC-dot

DC-dot is a popular Web-based tool for creating and managing Dublin Core metadata. DC-dot can also be used to carry out simple validation of Dublin Core embedded in HTML resources.

Limitations of DC-dot

DC-dot has several limitations:
• It only performs basic validation
• It was not designed primarily as a validation tool
• It cannot be easily extended (e.g. for use with other application profiles)

Findings

Use of DC-dot across a digital library programme showed that the entry points contained various errors in the representation of Dublin Core:
• Use of DC.Author rather than DC.Creator
• Incorrect format of date field
• Incorrect use of delimiters

Using An RDF Validator

Use of An RDF Validator

An alternative tested was to make use of W3C's online Dublin Core to RDF XSLT transformation service and the RDF validator.
This approach made use of several online services which were chained together:
• Tidy to convert the project home page to XHTML format
• Dublin Core to RDF XSLT transformation service to convert embedded Dublin Core elements to RDF format
• RDF validation service to validate the RDF format

Comments

This approach helped by providing a visual display of the Dublin Core metadata. It was noticed, for example, that one page contained an invalid identifier: http:/www.foo.ac.uk/... rather than http://www.foo.ac.uk/... However, since the RDF validation service has no understanding of the semantics of the Dublin Core metadata, this approach has its limitations.

dcmeta: An XSLT Approach

Use of XSLT

We have pioneered use of XSLT to provide validation of Dublin Core metadata embedded in HTML resources. The XSLT approach:
• Creates a report on DC metadata embedded in an XHTML document
• Is designed with knowledge of the Dublin Core semantics by checking against an application profile of the DC Metadata Element Set

The profile is a set of rules which specify:
• Permitted DC properties (e.g. only the 15 core DC elements are allowed)
• Minimum/maximum permitted occurrences of a specified property (e.g. only one occurrence of DC.Title permitted)
• Permitted encoding schemes (e.g. DC.Subject properties should have the scheme "LCSH")
• Permitted values (e.g. DC.Publisher must have the value "UKOLN")

Conclusions

Summary

This poster summarises a number of approaches to validating Dublin Core metadata embedded in HTML resources. The poster also describes initial work in the development of an XSLT-based tool for validation.

Future Work

The XSLT stylesheet is available as open source, and we invite interested parties to develop this work further.
Areas in which the tool could be developed include:
• Development of the Web interface to the tool
• Allowing local rules to be included
• Deploying the tool as a bookmarklet
• Deploying the tool as a "Web Service"

Contact Details

For further information please contact Pete Johnston, UKOLN by sending email to <[email protected]>

Implementation

The service is available at <http://www.ukoln.ac.uk/metadata/dcmeta/>
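
The four kinds of application profile rule listed under "dcmeta: An XSLT Approach" can be illustrated with a small sketch. This is not the dcmeta stylesheet, which is written in XSLT; it is a Python stand-in that applies the same kinds of check to meta elements. The profile contents, the names PROFILE, DCMetaCollector and validate, and the sample page are all assumptions made for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical profile; the four rule types mirror those listed in the poster.
PROFILE = {
    "permitted": {"DC.Title", "DC.Creator", "DC.Subject", "DC.Publisher", "DC.Date"},
    "occurrences": {"DC.Title": (1, 1)},   # (minimum, maximum)
    "schemes": {"DC.Subject": {"LCSH"}},
    "values": {"DC.Publisher": {"UKOLN"}},
}


class DCMetaCollector(HTMLParser):
    """Collect (name, scheme, content) from meta elements named DC.*"""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.upper().startswith("DC."):
            self.records.append(
                (name, attributes.get("scheme"), attributes.get("content") or "")
            )


def validate(html, profile=PROFILE):
    """Return a list of human-readable problems found in the page."""
    collector = DCMetaCollector()
    collector.feed(html)
    collector.close()
    problems = []
    counts = Counter(name for name, _, _ in collector.records)
    for name, scheme, content in collector.records:
        if name not in profile["permitted"]:
            problems.append(f"{name}: property not permitted")
            continue
        allowed_schemes = profile["schemes"].get(name)
        if allowed_schemes and scheme not in allowed_schemes:
            problems.append(f"{name}: scheme {scheme!r} not permitted")
        allowed_values = profile["values"].get(name)
        if allowed_values and content not in allowed_values:
            problems.append(f"{name}: value {content!r} not permitted")
    for name, (low, high) in profile["occurrences"].items():
        if not low <= counts[name] <= high:
            problems.append(
                f"{name}: {counts[name]} occurrence(s), expected {low} to {high}"
            )
    return problems


# Invented sample page: uses DC.Author and omits the scheme on DC.Subject.
SAMPLE = """<html><head>
<meta name="DC.Title" content="Project home page" />
<meta name="DC.Author" content="A. N. Other" />
<meta name="DC.Subject" content="Metadata" />
<meta name="DC.Publisher" content="UKOLN" />
</head><body></body></html>"""

for problem in validate(SAMPLE):
    print(problem)
```

Run against the sample page, the sketch reports the DC.Author property (one of the errors noted under Findings) and the missing LCSH scheme on DC.Subject. Keeping the rules in a data structure separate from the checking code is one way "local rules" could be swapped in, as suggested under Future Work.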