Data Sharing: Perspective from the National Institutes of Health
Dr Belinda Seto, Deputy Director, National Institute of Biomedical Imaging and
Bioengineering, National Institutes of Health, USA
The National Institutes of Health (NIH) implemented a policy on data sharing in 2003. The
policy reaffirmed the principle that data should be made as widely and freely available as
possible, while safeguarding the privacy of research participants and protecting confidential
and proprietary data. Restricted availability of unique resources on which further studies are
dependent can impede the advancement of research and the delivery of medical care.
Therefore, research data supported with NIH funds should be made readily available for
research purposes to qualified individuals within the scientific community.
The NIH data-sharing policy expects timely release and sharing of final research data for use
by other researchers. Grant applicants are expected to include a plan for sharing data or to
state why data sharing is not possible, especially if $500,000 or more of direct cost is
requested from the NIH in any single year. Generally, data are shared in the form of
publications. While it does not specify a timeline for sharing data, the NIH policy expects
researchers to share data no later than the acceptance for publication of the main findings
from the dataset. Data can also be shared under a data use agreement or by placement in
public archives and, for sensitive data, by placement in restricted-access data centres or
data enclaves.
How can this policy be reconciled with privacy laws and concerns?
The NIH strongly upholds the importance of individual privacy and data confidentiality and
sets two conditions for sharing data involving human research participants: 1) keep data
secure, and 2) de-identify data. The current U.S. medical privacy rule (HIPAA) lists 18
identifying data elements, such as name, social security number, and health plan beneficiary
number. To de-identify health information, researchers may either completely eliminate all
18 identifying elements or statistically de-identify the information so that there is only a
very small risk that it could be used to identify the subject.
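The first de-identification route, eliminating the identifying elements outright, can be illustrated with a minimal Python sketch. It assumes records are simple dictionaries; the field names, the sample record, and the deidentify helper are hypothetical, and only the three elements named above are included. A real list would need to cover all 18 elements defined in the privacy rule, and statistical de-identification is not shown.

```python
# Hypothetical field names for illustration only. Just the three elements
# named in the text are listed; the full rule defines 18.
IDENTIFYING_FIELDS = {
    "name",
    "social_security_number",
    "health_plan_beneficiary_number",
}


def deidentify(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of `record` with the identifying fields removed."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }


# Made-up sample record (not real data).
record = {
    "name": "Jane Doe",
    "social_security_number": "000-00-0000",
    "health_plan_beneficiary_number": "H123",
    "diagnosis": "fracture",
    "age_group": "40-49",
}

clean = deidentify(record)
```

The function returns a new dictionary rather than modifying the original, so the identified record can remain in a secured store while only the reduced copy is shared.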
To advance the data-sharing policy, the NIH has launched initiatives to develop informatics
tools that overcome barriers arising from fundamental differences among databases and
informatics infrastructures. To the extent that commonalities can be implemented and data and tools
shared, subsequent studies and secondary analyses can be initiated more quickly.
Furthermore, access to databases and data mining require more user-friendly informatics
tools. Approaches that combine imaging, genomic, and gene expression data with patient
medical records will ultimately deliver patient-specific information at the time and place
where clinical decisions are made regarding risk, diagnosis, treatment, and follow-up. The overall strategy
involves the development and standardised validation of application-specific software for
integrating heterogeneous, clinically relevant data and extracting knowledge from them.