Constructing Data Curation Profiles
Michael Witt, Jacob Carlson, D. Scott Brandt
Purdue University
Melissa H. Cragin
University of Illinois at Urbana-Champaign
5th International Digital Curation Conference, London: December 4, 2009
http://datacurationprofiles.org
The Data Curation Profile: In a Nutshell
The Data Curation Profile is an instrument that can be used to
provide concise but detailed information on particular data
forms that might be curated by an academic library. These
data forms are presented in the context of the related sub-disciplinary research area, and they provide the flow of the
research process from which these data are generated. The
profiles also represent the needs for data curation from the
perspective of individual data producers, using their own
language. As such, they support the exploration of data
curation across different research domains in real and
practical terms.
Subjects
Purdue
• Biology
• Horticulture
• Civil Engineering
• Electrical & Computer Engineering
• Biochemistry
• Food Science
• Earth & Atmospheric Science
• Agronomy
• Agronomy
• Agronomy
• Agronomy
Illinois
• Kinesiology
• Atmospheric Sciences
• Speech & Hearing
• Soil Science
• Anthropology
• Anthropology
• Anthropology
• Geology
• Geology
• Geology
Interviews
• IRB approval
• Identified subjects
• Pre-interview worksheet
• Initial interviews: open-ended Interview Guide asking about their
– Demographics
– Research data lifecycle
– Data management
– Disposition of data
– Making their data available
– Re-use of data
– Thoughts on roles for librarians/libraries in data curation
• Follow-up interviews: examined common themes and gaps from
initial transcripts and followed up with a highly structured interview
Follow-up Interviews
• Asked the subject to identify and describe a
specific exemplar dataset and workflow(s)
• Utilized a “requirements worksheet” that was
provided to the subject in advance
• We tried to ask the “tough questions” and to
quantify answers
A few of the “tough questions”
• How many years should this dataset be preserved?
• Is your manner of description/organization sufficient for another
person with similar expertise to understand and properly use your
dataset?
• With whom would you share your dataset (nobody, immediate
collaborators, anybody, etc.) and when would you be willing to
share it with them (raw data, corrected data, processed data,
before publication of paper, after publication, etc.)?
• What are your priorities for [17 different] potential data services
(e.g., providing citations for your dataset)?
Witt, M. (2009). Eliciting faculty requirements for research data repositories. 4th
International Conference on Open Repositories. http://hdl.handle.net/1853/28509
Developing the Data Curation Profile
• Created three profiles as a proof of concept by mining the literature
on data projects in astronomy, ecology, and crystallography
• Transcription of interviews
• Transcripts coded using qualitative analysis software (NVivo)
• Two draft Data Curation Profiles were created from coded
transcripts
• Draft profiles provided to six external reviewers for comments
• Feedback incorporated into the final template; began creating Data
Curation Profiles
Current Work
• Currently 7 profiles are complete; another 12 are being
produced and will be posted in the coming months
• The Data Curation Profile template will be posted and
instructions will be given to encourage the creation and
contribution of new profiles by others—maybe even by
you?
• Would like to enable more collaboration features on the
wiki (e.g., annotation, threaded discussions) and foster a
growing and continuing resource/venue
• Revise and improve the Data Curation Profile based on
feedback from the community
Example Data Curation Profile:
Traffic Flow
Some questions for discussion:
What uses do you see for Data Curation Profiles?
Are they easy to understand?
What is missing from them?
How can Data Curation Profiles be leveraged to complement other
research and practice taking place in the area of digital curation?
Project Team & Acknowledgements
PI: D. Scott Brandt
Co-PIs: Jacob Carlson, Melissa Cragin, Carole Palmer, Sarah Shreeves,
Michael Witt.
Research assistants: Marina Kogan, Deborah Leiter
External reviewers: Leslie Delserone, Michael Grady, Ron Jantz, Ardys
Kozbial, Reagan Moore, and Brian Westra
Supported by an IMLS National Leadership Grant, LG-06-07-0032-07.