CS 502: Computing Methods for Digital Libraries
Lecture 4
Identifiers and Reference Links

Desirable Properties of Identifiers
• Location independent name
• Globally unique
• Persistent across time
• Choice of human generated or automatic generation
• Fast resolution
• Decentralized administration
• Supported from standard user interfaces

Syntax of Handles
Syntax
  <naming_authority>/<locally_unique_string>
  or
  hdl:<naming_authority>/<locally_unique_string>
Examples
  10.1234/1995.02.12.16.42.21;9   (date-time stamp)
  cornell.cs/cstr-94.45           (mnemonic name)
  loc/a43v-8940cgr                (random string)

Examples of DOIs
  Publisher ID                  Item ID
  (assigned by DOI Agency)      (assigned by Publisher)
  10.1048                    /  872
  10.156                     /  catalog-96
  10.1532                    /  PII
  10.18698                   /  SICI

Elements of the Handle System
• Handle services:
  – global handle service
  – local handle services
  – caching services
• Clients:
  – client libraries
  – browser extension
  – WWW proxy servers
• Handle administration
• System utilities

Hierarchy of Naming Authorities
  loc
    loc.cords
  10
    10.1234
  cornell
    cornell.cs
      cornell.cs.d
    cornell.temp

Handle Servers and Handle Service
• The Global Handle Service provides central coordination for all handle services.
• Each naming authority has a home handle service (which may be Global) where its handles are maintained.
• Each handle service may be implemented as several handle servers.
• A hashing algorithm determines the server used to store a given handle.

Handle Record for a Digital Object
  Handle: cnri.dlib/arms-09
  Adm   Admin Data
  Adm   Admin Data
  URL   http://www.cnri/xyz
  RAP   merlin.dlib.org
  NEW   orb:#cornell[]norb

Address Rules
• The Global Handle Service stores:
  – a record for each naming authority
  – a record for each local handle service
• The record for each naming authority includes:
  – the home handle service for that naming authority
• For each handle, the home handle service stores:
  – the handle record

Resolving a Handle Without Caches
Handle cnri.dlib/wya in Global
[Diagram: the Client sends the query for cnri.dlib/wya to the Global service (G), which holds cnri.dlib/wya and returns the handle data.]

Resolving a Handle Without Caches
Handle cnri.dlib/wya in Home Service abc
[Diagram: the Client sends the query for cnri.dlib/wya to the Global service (G) and receives a pointer to abc, the home handle service for cnri.dlib. The Client then sends the query to abc, which holds cnri.dlib/wya and returns the handle data.]

Caching Handle Service
[Diagram: Client, Caching Server, Hash, Hash table, Cache]

Handle Servers
Replication
All data is replicated at several sites for performance and reliability.
  Los Angeles, CA
  Washington, DC

Applications of Identifiers
The challenges:
• Persistent, unique identifiers
• Eliminate broken links
• Control duplicates
Applications:
• On-line publication
• Registration
• Citation (reference links)
• Collection management
• Archives

DOIs and URNs in Action
[Diagram: User, DOI, Publisher, Handle System]

Flexibility for Publisher
Every publisher can have a different system.
[Diagram: three DOIs pointing to Database, Warehouse, Repository]

Reorganization by Publisher
The publisher can create a new system.
[Diagram: three DOIs, Database, Repositories]

Change of Publisher
[Diagram: User, DOI, Halfmoon, Millenium, Handle System]

Citation
[Diagram: User 1 and User 2, each with a DOI, Publisher, Handle System]

Catalogs and Indexes
[Diagram: User, DOI, Search System, Publisher, Handle System]

Copyright Registration
[Diagram: User, Copyright Registry, DOI, Halfmoon, Handle System]

Multiple Copies
[Diagram: User, DOI, Halfmoon Europe, Halfmoon USA, Handle System]

Archives
[Diagram: User, DOI, Archive, Handle System]

Reference Linking: The Problem
Generic
  Given the information in a standard citation, how does one get to the thing to which the citation refers?
Specific
  Given the information in a citation to a journal article, how does a user get from the citation to an appropriate copy of the article?
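
The handle syntax, the address rules, and the two "Resolving a Handle Without Caches" cases from the earlier slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Handle System's actual implementation: the class names, the `resolve` function, the use of SHA-256 as the hashing algorithm, and the example record contents are all hypothetical.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]  # hypothetical stand-in for a handle record


def parse_handle(handle: str) -> Tuple[str, str]:
    """Split '<naming_authority>/<locally_unique_string>', allowing an 'hdl:' prefix."""
    if handle.startswith("hdl:"):
        handle = handle[len("hdl:"):]
    authority, sep, local = handle.partition("/")
    if not (sep and authority and local):
        raise ValueError(f"not a valid handle: {handle!r}")
    return authority, local


class HandleService:
    """A handle service implemented as several handle servers (one dict per server)."""

    def __init__(self, name: str, server_count: int = 1) -> None:
        self.name = name
        self.servers: List[Dict[str, Record]] = [{} for _ in range(server_count)]

    def _server_for(self, handle: str) -> Dict[str, Record]:
        # Assumption: SHA-256 stands in for the unspecified hashing algorithm.
        digest = hashlib.sha256(handle.encode("utf-8")).digest()
        return self.servers[int.from_bytes(digest[:4], "big") % len(self.servers)]

    def store(self, handle: str, record: Record) -> None:
        self._server_for(handle)[handle] = record

    def lookup(self, handle: str) -> Optional[Record]:
        return self._server_for(handle).get(handle)


class GlobalHandleService(HandleService):
    """Also records, for each naming authority, its home handle service."""

    def __init__(self, server_count: int = 1) -> None:
        super().__init__("Global", server_count)
        self.home_of: Dict[str, HandleService] = {}

    def register_authority(self, authority: str, home: HandleService) -> None:
        self.home_of[authority] = home


def resolve(global_service: GlobalHandleService, handle: str) -> Optional[Record]:
    """Ask Global for the home service of the naming authority, then ask that service."""
    authority, local = parse_handle(handle)
    home = global_service.home_of.get(authority)  # may be Global itself
    if home is None:
        return None
    return home.lookup(f"{authority}/{local}")


if __name__ == "__main__":
    g = GlobalHandleService(server_count=3)
    abc = HandleService("abc", server_count=2)
    g.register_authority("cnri.dlib", abc)
    abc.store("cnri.dlib/wya", {"URL": "http://example.org/wya"})  # hypothetical data
    print(resolve(g, "hdl:cnri.dlib/wya"))
```

When the home service of a naming authority is Global itself, the same `resolve` call covers the first case on the slides; a caching server could be added as a dictionary consulted before the Global lookup.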

The General Model
[Diagram, built up over five slides. Components: Publisher, Reference database, Location database, Client. Labels added on successive slides: "Publisher places information in databases"; Citation and Identifiers (Reference database); Identifier and URLs (Location database); URL and Content (Publisher).]

Target of Citations
IFLA model
• Work
• Expression
• Manifestation
• Item
Citations can refer to any specific creation but for journals usually refer to the work.

Identifiers
• Are identifiers necessary?
  – Persistence
  – Flexible targets
• Examples:
  – PubMed ID, BibCode, DOI, etc.

How are Identifiers Obtained?
Often the client knows the citation, but not the identifier.
• In the general model identifiers are obtained by searching the reference database.
• In limited domains, identifiers can be calculated from metadata.
• The identifier may be embedded in the citation.

Reference Database Lookup
• Static: Reference links are established once for all time.
  – Current model in journal publishing
  – Not suitable for general user queries
• Dynamic: Reference links are established on demand.
  – Provides link based on most recent information
  – Success can not be guaranteed
Quality of metadata in reference database(s) is crucial.
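
The lookup chain of the general model (citation to identifier via the reference database, then identifier to URLs via the location database) can be sketched as two dictionary lookups. This is a minimal sketch of dynamic reference linking under assumptions: the `Citation` fields, the normalization in `citation_key`, the journal name, the DOI, and the URLs are all hypothetical, and a real reference database would need far more tolerant matching.

```python
from typing import Dict, List, NamedTuple, Tuple


class Citation(NamedTuple):
    """Hypothetical subset of citation metadata used as a lookup key."""
    journal: str
    volume: str
    first_page: str
    year: int


def citation_key(c: Citation) -> Tuple[str, str, str, int]:
    # Normalize case and whitespace so small differences in the citation still match.
    return (c.journal.strip().lower(), c.volume.strip(), c.first_page.strip(), c.year)


# Reference database: citation metadata -> identifier (hypothetical DOI).
reference_db: Dict[Tuple[str, str, str, int], str] = {
    citation_key(Citation("Journal of Examples", "12", "345", 1999)): "10.9999/example-1",
}

# Location database: identifier -> URLs of copies (hypothetical URLs).
location_db: Dict[str, List[str]] = {
    "10.9999/example-1": [
        "http://publisher.example/articles/1",
        "http://mirror.example/articles/1",
    ],
}


def follow_citation(citation: Citation) -> List[str]:
    """Resolve a citation on demand; an empty list means no link could be made."""
    identifier = reference_db.get(citation_key(citation))
    if identifier is None:
        return []  # dynamic lookup: success is not guaranteed
    return location_db.get(identifier, [])


if __name__ == "__main__":
    print(follow_citation(Citation("  journal of examples ", "12", "345", 1999)))
    print(follow_citation(Citation("Unknown Journal", "1", "1", 2000)))
```

Because the links are computed at request time, updating either dictionary changes the result of the next lookup, which is the property the slides attribute to the dynamic approach.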

Metadata in Reference Database
• Existing schemes
  – Considerable agreement on minimal elements
  – Considerable differences in details and syntax

Minimal Metadata Elements for Journal Article
• Title of journal article
• Creator(s)
• Journal title
• Date of publication
• Enumeration (e.g., volume and issue)
• Location (e.g., page or article number)
• Type (e.g., "journal article")

Resolution of Identifier
• Choice of resolver (distributed resolution)
  – Simple model: identifier determines resolver
• Selection from multiple copies (selective resolution)
  – Performance criteria
  – Economic and related criteria
  – User requirements

Interoperability
Several reference linking services under development:
• PubMed
• Astrophysics Data Center
• DOI reference service
• Los Alamos National Laboratory internal reference service
What levels of agreement and tools are needed for crosslinking?
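
The seven minimal metadata elements and the idea of selective resolution from the slides above can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the field names, the `Copy` attributes, the priority order of the selection criteria, and the example URLs and latencies are hypothetical, and the slides do not prescribe how the criteria should be weighed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ArticleMetadata:
    """One field per minimal element listed on the slide."""
    title: Optional[str] = None
    creators: Optional[List[str]] = None
    journal_title: Optional[str] = None
    date: Optional[str] = None
    enumeration: Optional[str] = None  # e.g., volume and issue
    location: Optional[str] = None     # e.g., page or article number
    type: Optional[str] = None         # e.g., "journal article"


def missing_elements(record: ArticleMetadata) -> List[str]:
    """Names of minimal elements that are absent or empty in a record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


@dataclass
class Copy:
    """Hypothetical description of one copy of the same item."""
    url: str
    region: str
    latency_ms: float
    free_to_user: bool


def select_copy(copies: List[Copy], user_region: str) -> Copy:
    """Pick one copy: economic criterion first, then user region, then performance."""
    return min(
        copies,
        key=lambda c: (not c.free_to_user, c.region != user_region, c.latency_ms),
    )


if __name__ == "__main__":
    record = ArticleMetadata(title="An Example Article", journal_title="Journal of Examples")
    print(missing_elements(record))

    copies = [
        Copy("http://europe.halfmoon.example/item", "EU", 20.0, True),
        Copy("http://usa.halfmoon.example/item", "US", 80.0, True),
    ]
    print(select_copy(copies, "US").url)
```

The tuple used as the sort key makes the priority order explicit, so changing how the three kinds of criteria are weighed only requires reordering its entries.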