http://www.ukoln.ac.uk/
A centre of expertise in digital information management
Metadata-based Discovery:
Experience in Crystallography
Monica Duke
[email protected]
UKOLN, University of Bath, UK
UKOLN is supported by:
Discovery in the curation life-cycle
“Digital Curation itself is the active management
of data over the life-cycle of scholarly and
scientific interest; it is the key to reproducibility
and re-use. Metadata for resource discovery
and retrieval are important, with mark-up on
time/place referencing as well as subject
description and linkage to discipline based
ontologies providing key research foci.”
Chris Rusbridge et al.
http://www.dcc.ac.uk/docs/publications/DCC_Sardinia_paper_final.pdf
Digital Library Infrastructures

Historically, cross-search and discovery protocols have been an area of interest and research

Z39.50 perceived to have barriers/limitations

OAI-PMH developed using a harvesting model

http://www.openarchives.org/
The OAI-PMH
[Diagram: data providers and service providers, linked by harvesting based on OAI-PMH]
The OAI-PMH





OAI Protocol for Metadata Harvesting
simple protocol for sharing metadata records between applications
currently at version 2.0
based on HTTP, XML, XML Schema and XML namespaces
allows a harvester to ask a remote repository for some or all of its metadata records

where ‘some’ is based on date-stamps, sets, metadata formats
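A minimal sketch of how a harvester might ask for "some" records, assuming the standard OAI-PMH 2.0 ListRecords verb and its from, until, set and metadataPrefix arguments. The repository endpoint and the set name below are hypothetical and only illustrate the shape of the request; no network call is made.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     until_date: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH 2.0 ListRecords request URL for selective harvesting."""
    # verb and metadataPrefix are required for a first ListRecords request.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting arguments: date-stamps and sets.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, used only for illustration.
EXAMPLE_BASE_URL = "http://repository.example.org/oai"
example_url = list_records_url(EXAMPLE_BASE_URL,
                               from_date="2006-01-01",
                               set_spec="crystallography")
```

Omitting the optional arguments asks for all records in the chosen metadata format; adding them narrows the harvest by date-stamp or set.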
Metadata in the eBank UK project

Simple Dublin Core www.dublincore.org

Intended for resource discovery

Compatible with OAI-PMH

Qualified to specify ‘vocabularies’

Refinements: aid interpretation of element value

E.g. <dc:subject xml:lang="en">seafood</dc:subject>

“Dumbing-down” principle applies
Metadata terms

Creator

Rights

Date

Type

Identifier

Subject
InChI
ChemicalFormula
<dc:subject xsi:type="ebankterms:CompoundClass">Organic</dc:subject>
Specified using XML schema and documented using an Application Profile
http://www.rdn.ac.uk/oai/ebank/20060310/ebank_dc.xsd
http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/
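The "dumbing-down" principle can be sketched as follows: a client that does not understand a qualifier ignores it and still reads the value as simple Dublin Core. This is an illustrative sketch, not the eBank UK implementation. The dumb_down helper is hypothetical, and the xsi:type value is treated as an opaque string because the ebankterms namespace URI is not given here.

```python
import copy
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_TYPE = f"{{{XSI_NS}}}type"


def dumb_down(element: ET.Element) -> ET.Element:
    """Return a copy of the element with xsi:type qualifiers removed.

    Hypothetical helper: the element name and value are kept, so the
    result can still be read as simple Dublin Core.
    """
    simple = copy.deepcopy(element)
    for node in simple.iter():
        node.attrib.pop(XSI_TYPE, None)
    return simple


# Qualified element, mirroring the dc:subject example on the slide.
qualified = ET.Element(f"{{{DC_NS}}}subject",
                       {XSI_TYPE: "ebankterms:CompoundClass"})
qualified.text = "Organic"

# Simple DC view: same element and value, qualifier dropped.
simple = dumb_down(qualified)
```

The qualified element is left untouched, so a service that does understand the eBank terms can still use the qualifier.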
Information sources for Crystallography


Cross-discipline sources

OAIster

DAREnet
Discipline-specific

ChemRefer (texts/publications, chemistry general)

Chemistry Central (texts/publications, chemistry general)

Crystallography Open Database (data, crystallography)

Reciprocal Net (data, crystallography)
The discovery landscape



Some within OAI-PMH infrastructure (metadata-based)
Variety of (human) search interfaces (simple to advanced)
Well established sources

Cambridge Structural Database

Protein Data Bank
OAIster


An OAI-PMH aggregator
Wide-ranging and inclusive: any repository, all content types

Metadata from 675 institutions

Limit by resource type inc. datasets (5 results)

Pointers to collections of data

2000+ records for ‘crystallography’

Results spread across several sources
OAIster
http://www.oaister.org/
DAREnet


www.darenet.nl
Worldwide access to Dutch academic research results

Simple search: “crystallography” (40 results)

General advanced search (author, year)
ChemRefer



http://www.chemrefer.com
Access to full text chemical, pharmaceutical literature index
Simple search interface
ChemRefer display of results
Chemistry Central

No search feature (through BioMed Central)
Crystallography Open Database (COD)

www.crystallography.net

Promotes open data

Allows submission

‘REF’ format also used

40K entries
Reciprocal Net

A distributed crystallography network for researchers, students and the general public

Search engine http://www.reciprocalnet.org/recipnet/search.jsp

Crystallography-specific search interface
Reciprocal Net Search Interface
Dataset result in Reciprocal Net
Joining up the landscape

Technical infrastructure differences can be overcome

Agreement on common APIs, metadata sets

Hide API differences from user

Survey in one application area – how similar are other disciplines?
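One way to picture "agreement on common APIs" and "hide API differences from user" is a thin adapter layer: each source is wrapped behind the same search call, and a cross-search function merges the results. The sketch below is purely illustrative. The Result fields, the adapter signature and the in-memory stub sources are assumptions and do not correspond to the real interfaces of any service named in these slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Result:
    source: str      # which service returned the record
    title: str
    identifier: str


# Each adapter hides one source's native API behind the same signature.
SearchAdapter = Callable[[str], List[Result]]


def make_stub_adapter(source: str,
                      records: List[Tuple[str, str]]) -> SearchAdapter:
    """Hypothetical stand-in for a real source: searches an in-memory list."""
    def search(query: str) -> List[Result]:
        q = query.lower()
        return [Result(source, title, ident)
                for ident, title in records if q in title.lower()]
    return search


def cross_search(query: str,
                 adapters: Dict[str, SearchAdapter]) -> List[Result]:
    """Run one query against every adapter and merge the results."""
    merged: List[Result] = []
    for search in adapters.values():
        try:
            merged.extend(search(query))
        except Exception:
            # A failing source is skipped so the others still answer.
            continue
    return merged


adapters = {
    "literature": make_stub_adapter("literature", [
        ("lit:1", "Crystallography of organic salts"),
        ("lit:2", "Polymer synthesis"),
    ]),
    "datasets": make_stub_adapter("datasets", [
        ("data:1", "Crystallography dataset: compound A"),
    ]),
}
hits = cross_search("crystallography", adapters)
```

The user sees a single result list; which source answered is carried in each result rather than in the interface.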
Issues with cross-search


Audiences

Who are the user groups?

What are their information needs?
Selection


Identifying subsets of interest
Human Interface design

Search options

Presentation of heterogeneous information