PREMIS Data Dictionary
and the Future of Preservation Metadata
Brian Lavoie
Research Scientist
OCLC Research
[email protected]
Society of American Archivists
Washington, D.C.
August 5, 2006
Preservation Metadata
“Information that supports and documents the digital preservation process”
• Provenance: Who has had custody/ownership of the digital object?
• Authenticity: Is the digital object what it purports to be?
• Preservation Activity: What has been done to preserve the digital object?
• Technical Environment: What is needed to render and use the digital object?
• Rights Management: What IPR must be observed?
Why is preservation metadata important?

Digital objects are technology-dependent …
• Complex technical environment between content and user
• Means to access and use archived object must be documented
• Technical metadata especially important

Digital objects are mutable …
• Can be easily altered, impacting look, feel, functionality
• Changes to object must be documented/validated
• Provenance/authenticity metadata especially important

Digital objects are bound by intellectual property rights …
• Preservation often proceeds while copyright still in effect
• May constrain preservation activities and access policies
• Rights management metadata especially important

Preservation metadata makes digital objects self-documenting across time
PREMIS Working Group

“Early days” … various preservation metadata element sets released
• Different scopes, purposes, underlying models/assumptions
• No international standard; little consolidation of expertise/best practice

June 2003: OCLC, RLG sponsored international working group:
• PREMIS: Preservation Metadata: Implementation Strategies

Objective:
• Define implementable, core preservation metadata, with guidelines/recommendations for management and use

Membership:
• > 30 experts from 5 countries, libraries, museums, archives, government agencies, private sector
PREMIS Data Dictionary

May 2005: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group

237-page report includes:
• PREMIS Data Dictionary 1.0
• Context/assumptions, data model, usage examples

Set of XML schemas to support implementation

Data Dictionary:
• Comprehensive view of information needed to support digital preservation
• Based on deep pool of institutional experiences in setting up and managing operational capacity for digital preservation

http://www.oclc.org/research/projects/pmwg/premis-final.pdf

2005 British Conservation Awards: Digital Preservation Award
2006 Society of American Archivists Preservation Publication Award
Some guiding principles …

“Implementable, core, preservation metadata”:
• “Preservation metadata”: maintain viability, renderability, understandability, authenticity, identity in a preservation context
• “Core”: What most preservation repositories need to know to preserve digital materials over the long term
• “Implementable”: rigorously defined; supported by usage guidelines/recommendations; emphasis on automated workflows

“Technical neutrality”:
• Digital archiving system: no assumptions about specific archiving technology, system/DB architectures, preservation strategy
• Metadata management: no assumptions about whether metadata is stored locally or in external registry; recorded explicitly or known implicitly; instantiated in one metadata element or multiple elements
• Promotes flexibility, applicability in wide range of contexts
Sample Data Dictionary entry

Semantic unit: size
Semantic components: None
Definition: The size in bytes of the file or bitstream stored in the repository.
Rationale: Size is useful for ensuring the correct number of bytes from storage have been retrieved and that an application has enough room to move or process files. It might also be used when billing for storage.
Data constraint: Integer
Object category: Representation / File / Bitstream
Applicability: Not applicable / Applicable / Applicable
Examples: 2038927
Repeatability (File / Bitstream): Not repeatable / Not repeatable
Obligation (File / Bitstream): Optional / Optional
Creation/Maintenance notes: Automatically obtained by the repository.
Usage notes: Defining this semantic unit as size in bytes makes it unnecessary to record a unit of measurement. However, for the purpose of data exchange the unit of measurement should be stated or understood by both partners.
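The creation/maintenance note for size says the value is obtained automatically by the repository. A minimal sketch of what that could look like in Python follows. The helper names `size_semantic_unit` and `is_valid_size` are hypothetical, and the bare `<size>` element is chosen for illustration only; it is not taken from the PREMIS XML schema.

```python
import os
import xml.etree.ElementTree as ET


def size_semantic_unit(path):
    """Build a <size> element for the file at `path` (hypothetical helper).

    The value is the size in bytes, recorded as a plain integer with no
    unit of measurement, following the data constraint in the entry.
    """
    element = ET.Element("size")  # element name is illustrative only
    element.text = str(os.path.getsize(path))
    return element


def is_valid_size(text):
    """Check the 'Integer' data constraint: ASCII digits only."""
    return text.isascii() and text.isdigit()
```

Under these assumptions, a file of 2038927 bytes would serialize through `ET.tostring(element, encoding="unicode")` as `<size>2038927</size>`, and a value such as "2 MB" would fail the check because the entry defines size as a bare byte count.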
PREMIS Maintenance Activity

Web site:
• Permanent Web presence, hosted by Library of Congress
• Central destination for PREMIS-related info, announcements, resources
• Home of the PREMIS Implementers’ Group (PIG) discussion list

PREMIS Editorial Committee:
• Set directions/priorities for PREMIS development
• Coordinate future revisions of Data Dictionary and XML schema
• Membership: Library of Congress, OCLC, FCLA, National Archives of Scotland, British Library, National Library of Australia, U. of Goettingen, LANL (two more seats still TBD)
• Will convene late August/early September

http://www.loc.gov/standards/premis/
Current activities

Documenting errata and proposed revisions to Data Dictionary (feedback through PIG list)
• http://www.loc.gov/standards/premis/changes.html

PREMIS Implementers’ Registry
• http://www.loc.gov/standards/premis/premis-registry.html

Consultancies:
• Rights issues for digital preservation (Karen Coyle)
• PREMIS implementation guidelines and recommendations (Deborah Woodyard-Robinson)

PREMIS Workshops:
• 2-day tutorial on Data Dictionary and implementation issues
• Digital Curation Centre PREMIS workshop (July 17-18, Glasgow)
• DLF Forum (Boston, early November)
Looking to the Future …

Basic questions (“what type”, “how much”) still unsettled …
• Digital preservation processes still not fully tested/understood
• Hard to judge effectiveness a priori
• Important to document and share practical experience

Workflows for preservation metadata …
• Tools to support automatic generation of preservation metadata (JHOVE, NLNZ tools)
• Tools should support formal metadata schemas (like PREMIS)
• Registries (PRONOM, GDFR)

Harmonization with other initiatives …
• Integrate PREMIS with other standards, technologies, best practices
• E.g., Z39.87, METS
• Not just standards, but integrated solutions

Division of labor …
• Efficient strategies for collecting preservation metadata: i.e., WHO and WHEN (Automatic Exposure project)