UCAR Workshop Review –
“Bridging Data Lifecycles:
Tracking Data Use via Data Citations”
Matt Mayernik
Research Data Service Specialist
NCAR Library/Integrated Information Services (IIS)
National Center for Atmospheric Research (NCAR)
University Corporation for Atmospheric Research (UCAR)
BESSIG, April 18, 2012
Workshop
• April 5-6, at UCAR Center Green Campus
• Funded by NOAA through the UCAR JOSS program
• ~80 attendees
– Academic librarians
– Data management professionals
– Software engineers
– Scientists
• Agenda and presentations posted at
http://library.ucar.edu/data_workshop/
What is a Data Citation?
Citation to journal article
Citation to data set
From:
Patil, S. and M. Stieglitz. 2011. Hydrologic similarity among catchments under variable flow
conditions. Hydrology and Earth System Sciences, 15, 989–997. doi: 10.5194/hess-15-989-2011
Interest in Data Citations
NSF GEO issued a “Dear Colleague Letter” on March 29
NCAR/UCAR Data
• Climate model output data
• Longitudinal time-series data
• Observational data from field studies
All images: copyright University Corporation for Atmospheric Research
Motivation for Data Citations
• Understand use and impact of data
– Measurements of data use
– Give scientists and data centers credit for producing,
managing, and curating data
– Metrics requirements as an FFRDC
• Connecting data and scholarship
• Increase transparency of data and science
Mark Parsons
Data Citation Practices
• Most data users don’t cite data
• Ex. “MODIS snow cover data” from NSIDC
From:
Parsons, M. A., Duerr, R., and Minster, J.-B. 2010. Data Citation and Peer Review. Eos
Transactions, AGU, 91(34): 297-298. http://dx.doi.org/10.1029/2010EO340001
Mark Parsons
Hypothesis: ~80% of citation scenarios for 80% of ESS data
Joan Starr
EZID: long-term identifiers made easy
Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation
Primary Functions
1. Create persistent identifiers
2. Manage identifiers over time
3. Manage associated metadata over time
Joan Starr
DOIs vs ARKs
DOIs
• Established brand in publishing
• Indexed by major A&I citation databases
• Cannot be deleted
• More costly
• Ex. http://dx.doi.org/10.5065/D6WD3XH5
ARKs
• Case sensitive
• Special feature supports granularity
• Informative
• Less costly
• Ex. http://n2t.net/ark:/b5065/d6wd3xh5
Both resolve to:
http://www.ncl.ucar.edu
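The two example identifiers differ in more than their resolver prefix: the slide lists ARKs as case sensitive, whereas DOI names are matched without regard to case. A minimal Python sketch of building the two resolver URLs and comparing identifiers is given here. The function names are hypothetical, and the resolver prefixes are simply those that appear in the slide's example URLs.

```python
# Resolver prefixes taken from the two example URLs on the slide.
DOI_RESOLVER = "http://dx.doi.org/"
ARK_RESOLVER = "http://n2t.net/"


def doi_url(doi_name: str) -> str:
    """Return the resolver URL for a DOI name such as '10.5065/D6WD3XH5'."""
    return DOI_RESOLVER + doi_name


def ark_url(ark: str) -> str:
    """Return the resolver URL for an ARK such as 'ark:/b5065/d6wd3xh5'."""
    return ARK_RESOLVER + ark


def same_doi(a: str, b: str) -> bool:
    """DOI names are case-insensitive, so compare them case-folded."""
    return a.casefold() == b.casefold()


def same_ark(a: str, b: str) -> bool:
    """ARKs are case sensitive (per the slide), so compare them exactly."""
    return a == b


if __name__ == "__main__":
    print(doi_url("10.5065/D6WD3XH5"))
    print(ark_url("ark:/b5065/d6wd3xh5"))
    print(same_doi("10.5065/D6WD3XH5", "10.5065/d6wd3xh5"))
    print(same_ark("ark:/b5065/d6wd3xh5", "ark:/b5065/D6WD3XH5"))
```

In practice this means a citation-counting script can safely lowercase DOIs before matching them, but should leave ARKs untouched.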
Bill Cook
Excerpts from existing AGU policy –
Citing Data
...data cited in AGU publications must be permanently
archived in a data center or centers that meet the
following conditions:
• are open to scientists throughout the world.
• are committed to archiving data sets indefinitely.
• provide services at reasonable costs.
Data sets that are available only from the author,
through miscellaneous public network services, or
academic, government or commercial institutions not
chartered specifically for archiving data, may not be cited
in AGU publications.
Bill Cook
Excerpts from existing AGU policy –
Preserving/Archiving Data
AGU does not expect to archive data sets subject to this
policy, except on a for-fee basis and for sets of a small size.
It is not AGU's intention to serve as an archive for large
data sets that should be housed in data centers.
AGU maintains a deposit service for supplementary
material of different types in order to provide long-term
access to small supporting data sets and graphics files
that are published concurrently with, and are an
electronic component of, some AGU journal articles.
NCAR Data Citation Initiatives
1. Technical
2. Policy/procedural
Image copyright University Corporation for
Atmospheric Research
Citation Challenges
1. Diversity
2. Granularity
3. Version Control
4. Maintenance Over Time
Mike Daniels
What granularity for EOL DOIs and when are they issued?
• Given a large project with aircraft, soundings, radars, model output and satellite data, do we:
– Assign a DOI for each data file?
– Assign one DOI for all datasets for the project?
– Assign separate DOIs for datasets from each major platform?
– What about ancillary data? Do we assign DOIs or does the providing institution?
• We are considering assigning DOIs to the data from each major platform associated with the project (e.g. C-130, S-Pol), to outside datasets that we have “value-added”, and to data for which no DOI exists
• It may be beneficial to issue DOIs only when processed data are released, so that publications do not reference preliminary data
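The granularity options in this slide can be made concrete with a small Python sketch that groups a project's files into the units that would each receive one DOI. Everything in it is an assumption made for illustration: the file inventory, the file names, and the `citable_units` helper are hypothetical, and no DOI is actually minted.

```python
from collections import defaultdict

# Hypothetical file inventory for one field project: (platform, file name).
FILES = [
    ("C-130", "c130_flight01.nc"),
    ("C-130", "c130_flight02.nc"),
    ("S-Pol", "spol_scan_0001.nc"),
    ("soundings", "sounding_0001.txt"),
]


def citable_units(files, granularity):
    """Group files into the units that would each be assigned one DOI.

    granularity is one of "file", "platform", or "project".
    """
    units = defaultdict(list)
    for platform, name in files:
        if granularity == "file":
            key = name          # one DOI per data file
        elif granularity == "platform":
            key = platform      # one DOI per major platform
        elif granularity == "project":
            key = "project"     # one DOI for the whole project
        else:
            raise ValueError(f"unknown granularity: {granularity!r}")
        units[key].append(name)
    return dict(units)


if __name__ == "__main__":
    for level in ("file", "platform", "project"):
        print(level, len(citable_units(FILES, level)))
```

The count of units per level is the number of identifiers that would have to be created and maintained over time, which is one way to weigh the options listed in the slide.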
Gary Strand
Data QC
Nicole Kaplan
The LTER NIS 2000
K.S. Baker, B.J. Benson, D.L. Henshaw, D. Blodgett, J.H. Porter, S.G. Stafford. (2000)
Evolution of a Multisite Network Information System: The LTER Information Management
Paradigm. BioScience. 50(11) 963-978.
Nicole Kaplan, CSU - Long-Term Management of Ecological Data - April 2012, UCAR
Nicole Kaplan
The LTER NIS 2011
Barb Losoff
Results of CU Faculty Survey About Data Curation
• Many researchers had curation plans for their data
• Many had orphan data without curation plans
• Few departments had a procedure for data preservation; some participated in discipline-based repositories supporting long-term storage
• Receptivity to a library role in data curation fell more in line with the researchers' disciplinary culture or philosophy regarding data sharing and collaborative projects
Ruth Duerr
Lynn Yarmey
Ted Habermann
Citations in the Bigger Picture
Ted Habermann,
NOAA/NESDIS/NGDC, NASA/ESDIS
Data preservation is communicating with the future
Ted Habermann
Metadata Types and Sharing
[Diagram: users, a discovery portal, and community metadata collections; metadata supports discovery, use/mashup, and understanding]
More documentation is required for understanding data than
discovering or using it.
Tim Killeen
Steve Worley
Current Practices @ NCAR’s Research Data Archive
Metrics Usage - Sample
• 37% of users are from the US
• Now exporting 25+ TB monthly
• Track user activity: who accessed what and when
• Subsetting, in general, is 500+ requests/month
Bridging Data Lifecycles, April 5-6, 2012
Dan Kowal
Annual Reporting Example
- 294,337 visits (browser/user only)
- 14,658 unique visitors
- 9.27 pages/visit
- 6:45 avg. duration
Most Accessed out of 28 Data Sets:
* SPIDR NODES
Dan Kowal, Data Administrator
Leonard Sitongia
NCAR Mauna Loa Solar Observatory Pubs.
Steve Worley
Dataset Family Tree Example
Global and Regional Atmospheric and Ocean Re-analyses:
NCEP/NCAR, NARR, ERA-40, ERA-Interim, 20CR, OARCA
• NOC Surf. Flux (1973-2009)
• WASwind (1950-2009)
• Ocean Clouds (1900-2010)
• JMA SST (1871-2011)
• HadSLP (1871-2011)
• HadISST (1871-2011)
• NOAA OI SST (1981-2011)
• NOAA ERSST (1854-2011)
• Etc.
International Comprehensive Ocean Atmosphere Data Set (ICOADS):
Global marine surface observations (1662-2011)
How to Get Started
• Know what you want to achieve
• Know your identifier options
• Engage stakeholders
• Start with well-bounded cases
• Plan for the long-term implications
– How to maintain
– How to count
Thank You
Workshop agenda and presentations:
http://library.ucar.edu/data_workshop/
Email:
[email protected]