Request for Information (RFI): Metrics to Assess Value of Biomedical Digital Repositories

Notice Number: NOT-OD-16-133

Key Dates
Release Date: August 12, 2016
Response Date: September 30, 2016

Related Announcements
NOT-OD-17-015
NOT-OD-16-153

Issued by
National Institutes of Health (NIH)

Purpose

This Request for Information (RFI) is to solicit input on metrics to assess the value and impact of biomedical digital data repositories that may provide a basis for technical and science policy actions required to support the long-term sustainability of repositories.

Introduction
Increasing access to digital research data presents significant scientific opportunities to enhance return on investment, expand accountability, and accelerate discovery and progress. To seize these opportunities, data must be managed and shared appropriately; shared data must be citable to make clear their origin and allow the authors of the data to accrue recognition; and the importance of infrastructure, such as data repositories, must be appreciated. Data often must be considered in conjunction with other related digital objects including experimental and analytical workflows, standards, data annotations, and software that act on data. As such, shared data should conform to the FAIR principles, i.e., findable, accessible, interoperable, and reusable (http://www.nature.com/articles/sdata201618).

The goal for NIH data management and sharing is to make publicly-funded data broadly accessible to support reuse, reproducibility and discovery while simultaneously balancing the costs and benefits. The many aspects of the data landscape must be considered in implementation of the new NIH data sharing policies. In addition to the current RFI, an RFI on NIH Data Sharing Strategies will be released in the near future to collect the community's input on these topics.

Background
As put forth in the NIH Data and Informatics Working Group (DIWG) report to the Advisory Committee to the Director, colossal changes in biomedical research technologies and methods have shifted the bottleneck in scientific productivity from data production to data management, communication, and interpretation. The DIWG report also states that modern interdisciplinary team science requires an infrastructure and set of incentives to promote data sharing, and it needs an environment that fosters the development, dissemination, and effective use of computational tools for the analysis of datasets whose size and complexity have grown by orders of magnitude in recent years. In response to the opportunities and challenges presented by the era of "Big Data" in biomedical research, the NIH launched the Big Data to Knowledge (BD2K) initiative as a trans-NIH initiative to cultivate the digital research enterprise within biomedicine, to facilitate discovery and support new knowledge, and to maximize community engagement.

Digital data repositories represent a common mechanism for managing and storing biomedical content. The repositories enable specific communities to manage and preserve relevant data with the goal of ensuring continued existence and access to the data within the repository for the larger biomedical community. While there is a spectrum of models for content intake and management, biomedical digital data repositories can be thought of in two general categories: 1) Deposition repositories, which support primary research data submitted by the data producers; and 2) Knowledgebases, which provide curated findings derived from the aggregation or analysis of experimental data.

The increasing size and volume of biomedical data have led to increasing demand on biomedical data repositories. As research institutions begin to implement federal policies requiring them to share research data that have been gathered with the support of public funds, data repositories are growing in number, scale and complexity. In this context, it is critical to understand and measure the value that these data repositories, and the individual data types and data sets that they contain, are providing to the research community. This information will support: 1) the ability of repository owners to prioritize activities related to the management of these repositories; 2) decisions by funding agencies that support biomedical data repositories; and 3) communication about the usage and value of these repositories.

Information Requested
With this Request for Information (RFI) Notice, the NIH invites interested and knowledgeable persons to inform NIH about existing and desired approaches for measuring and assessing the value of biomedical data repositories.

The NIH is seeking information on qualitative and quantitative metrics such as those that describe:

  • Utilization at multiple levels (repository, dataset, data item). In addition to the frequency of access and number of downloads, this might include:
    • Size and measured demand of the community served, placed in the context of the overall field.
    • The ongoing rate of data deposition and data access or download
  • Indicators of data repository quality and impact. Examples include but are not limited to:
    • Publications from the data
    • Data citations
    • Altmetrics
    • Patents
    • Utilization of data sets in research studies
    • Outputs of those research studies, e.g. use in policies or guidelines
    • Enhanced data sharing and community collaboration around annotation/analysis of data sets
    • Economic measures such as investment and use value; efficiency impacts; return on investment
  • Quality of service. Examples may include but are not limited to:
    • Implementation of a rigorous quality assurance process
    • Use of community-recognized standards
    • User support and training
    • Ease of data deposition and retrieval
    • Technical indicators, e.g., uptime, response time
  • Infrastructure and governance. Examples may include but are not limited to:
    • Existence of an independent advisory board
    • Legal structure, e.g., access, security, licensing
    • Long-term sustainability plan
  • Qualitative metrics that may address many of the above categories, such as collection of use cases or case studies
  • Consideration of case studies demonstrating the value of the repository. For example, assessing the questions of:
    • If the repository weren't available, how would that impact your work?
    • What are the data sharing alternatives to the repository?
    • What are the implications of using these alternatives?

All stakeholders with an interest in approaches to measure and assess the value of biomedical data repositories and decisions driven by these metrics are invited to provide information. If you choose, you may categorize your area of expertise by including all that apply:

  • Biomedical science researcher
  • Bioinformatician
  • Data scientist
  • Standard developer or maintainer
  • Research data repository manager
  • Library or information scientist
  • Data curator
  • Funder
  • Publisher
  • Administrator (president, provost, dean or equivalents in academic or non-profit organizations)
  • Tool or services vendor
  • Other

Your response may also include your detailed roles within industry, government, or academia and the history and experiences of relevant repositories.

Submitting a Response
All responses must be submitted to NIH_Repository_Metrics_RFI@mail.nih.gov by September 30, 2016. Please include the Notice number NOT-OD-16-133 in the subject line. Responders are free to address any or all of the categories listed above. The submitted information will be reviewed by NIH staff.

Responses to this RFI are voluntary. Please do not include any proprietary, classified, confidential, or sensitive information in your response. NIH will use information submitted in response to this RFI at its discretion and will not provide comments to any responder's submission. The collected information will be reviewed by NIH staff, may appear in reports, and may be shared publicly on an NIH website.

The Government reserves the right to use any non-proprietary technical information in summaries of the state of the science, and any resultant solicitation(s). The NIH may use information gathered by this RFI to inform development of future funding opportunity announcements.

This RFI is for information and planning purposes only and should not be construed as a solicitation or as an obligation on the part of the Federal Government, the National Institutes of Health (NIH), or individual NIH Institutes and Centers. NIH does not intend to make any awards based on responses to this RFI or to otherwise pay for preparation of any information submitted or for the Government's use of such information. No basis for claims against the U.S. Government shall arise as a result of a response to this request for information or from the Government’s use of such information.

Inquiries

Please direct all inquiries to:

Elizabeth Hsu, PhD, MPH
National Cancer Institute
Telephone: 240-276-5733
Email: NIH_Repository_Metrics_RFI@mail.nih.gov