http://www.ukoln.ac.uk/
A centre of expertise in digital information management

Metadata-based Discovery: Experience in Crystallography
Monica Duke
[email protected]
UKOLN, University of Bath, UK
UKOLN is supported by:

Discovery in the curation life-cycle
  “Digital Curation itself is the active management of data over the life-cycle of scholarly and scientific interest; it is the key to reproducibility and re-use. Metadata for resource discovery and retrieval are important, with mark-up on time/place referencing as well as subject description and linkage to discipline based ontologies providing key research foci.”
  Chris Rusbridge et al.
  http://www.dcc.ac.uk/docs/publications/DCC_Sardinia_paper_final.pdf

Digital Library Infrastructures
  Historically, cross-search and discovery protocols have been an area of interest and research
  Z39.50 perceived to have barriers/limitations
  OAI-PMH developed using a harvesting model
  http://www.openarchives.org/

The OAI-PMH
  Data providers
  Harvesting based on OAI-PMH
  Service providers

The OAI-PMH
  OAI Protocol for Metadata Harvesting
  Simple protocol for sharing metadata records between applications
  Currently at version 2.0
  Based on HTTP, XML, XML Schema and XML namespaces
  Allows a harvester to ask a remote repository for some or all of its metadata records
    where ‘some’ is based on date-stamps, sets, metadata formats

Metadata in the eBank UK project
  Simple Dublin Core (www.dublincore.org)
    Intended for resource discovery
    Compatible with OAI-PMH
  Qualified
    To specify ‘vocabularies’
    Refinements: aid interpretation of element value, e.g.
    <dc:subject xml:lang="en">seafood</dc:subject>
  “Dumbing-down” principle applies

Metadata terms
  Creator
  Rights
  Date
  Type
  Identifier
  Subject
  InChI
  ChemicalFormula
  <dc:subject xsi:type="ebankterms:CompoundClass">Organic</dc:subject>
  Specified using XML schema and documented using an Application Profile
  http://www.rdn.ac.uk/oai/ebank/20060310/ebank_dc.xsd
  http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/

Information sources for Crystallography
  Cross-discipline sources
    OAIster
    DAREnet
  Discipline-specific
    Texts/publications, chemistry general
      ChemRefer
      Chemistry Central
    Data, crystallography
      Crystallography Open Database
      Reciprocal Net

The discovery landscape
  Some within OAI-PMH infrastructure (metadata-based)
  Variety of (human) search interfaces (simple to advanced)
  Well established sources
    Cambridge Structural Database
    Protein Data Bank

OAIster
  An OAI-PMH aggregator
  Wide-ranging and inclusive: any repository, all content types
  Metadata from 675 institutions
  Limit by resource type, including
    datasets (5 results)
  Pointers to collections of data
  2000+ records for ‘crystallography’
  Results spread across several sources

OAIster
  http://www.oaister.org/
  [Slide image: OAIster]

DAREnet
  www.darenet.nl
  Worldwide access to Dutch academic research results
  Simple search: “crystallography” (40 results)
  General advanced search (author, year)
  [Slide images: DAREnet]

ChemRefer
  http://www.chemrefer.com
  Access to full text chemical, pharmaceutical literature
  Index
  Simple search interface
  [Slide images: ChemRefer; ChemRefer display of results]

Chemistry Central
  No search feature (through BioMed Central)

Crystallography Open Database (COD)
  www.crystallography.net
  Promotes open data
  Allows submission
  ‘REF’ format also used
  40K entries
  [Slide image: COD]

Reciprocal Net
  A distributed crystallography network for researchers, students and the general public
  Search engine: http://www.reciprocalnet.org/recipnet/search.jsp
  Crystallography-specific search interface
  [Slide images: Reciprocal Net Search Interface; Dataset result in Reciprocal Net]

Joining up the landscape
  Technical infrastructure differences can be overcome
    Agreement on common APIs, metadata sets
    Hide API differences from user
  Survey in one application area – how similar are other disciplines?

Issues with cross-search
  Audiences
    Who are the user groups?
    What are their information needs?
  Selection
    Identifying subsets of interest
  Human interface design
    Search options
    Presentation of heterogeneous information
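
Sketch: selective harvesting over a common API

As one illustration of the "agreement on common APIs, metadata sets" point, the selective harvesting that OAI-PMH allows (by date-stamp, set, and metadata format) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the eBank UK implementation: the base URL, the set name, the ebankterms namespace URI, and the trimmed sample response are all hypothetical placeholders, and the function names are invented for illustration. Only the OAI-PMH request arguments (verb, metadataPrefix, from, until, set) and the Dublin Core and XML Schema instance namespaces come from the published standards.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords URL; optional arguments narrow the harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # date-stamp lower bound
    if until_date:
        params["until"] = until_date    # date-stamp upper bound
    if set_spec:
        params["set"] = set_spec        # repository-defined set
    return base_url + "?" + urlencode(params)


def extract_subjects(xml_text):
    """Return (xsi:type or None, text) for every dc:subject in a response."""
    root = ET.fromstring(xml_text)
    subjects = []
    for elem in root.iter("{%s}subject" % DC_NS):
        qualifier = elem.get("{%s}type" % XSI_NS)
        subjects.append((qualifier, (elem.text or "").strip()))
    return subjects


# Hypothetical, heavily trimmed response; the record content is made up and
# the ebankterms namespace URI is a placeholder, not the project's real one.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2006-03-10</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns:ebankterms="http://example.org/ebankterms/">
          <dc:subject xsi:type="ebankterms:CompoundClass">Organic</dc:subject>
          <dc:subject>crystal structure</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


if __name__ == "__main__":
    # Hypothetical repository and set name, for illustration only.
    print(build_list_records_url("https://repository.example.org/oai",
                                 from_date="2006-01-01",
                                 set_spec="crystallography"))
    # A qualifier-aware client can use the xsi:type; a simple one can
    # "dumb down" by ignoring it and keeping only the text.
    print(extract_subjects(SAMPLE_RESPONSE))
    print([text for _, text in extract_subjects(SAMPLE_RESPONSE)])
```

The last line shows the "dumbing-down" principle in practice: a service provider that does not understand the ebankterms qualifier can still treat every value as a plain Dublin Core subject, which is what lets a cross-discipline aggregator and a discipline-specific service share the same harvested records.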