Request for Information (RFI): Input on Information Resources for Data-Related Standards Widely Used in Biomedical Science

Notice Number: NOT-CA-14-054

Key Dates
Release Date: August 27, 2014
Response Date: September 30, 2014

Related Announcements
None

Issued by
National Cancer Institute (NCI)
National Institute of Allergy and Infectious Diseases (NIAID)
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
National Institute of Dental and Craniofacial Research (NIDCR)
National Institute on Drug Abuse (NIDA)
National Institute of General Medical Sciences (NIGMS)
National Institute on Minority Health and Health Disparities (NIMHD)
National Library of Medicine (NLM)
National Center for Advancing Translational Sciences (NCATS)
Office of Strategic Coordination (Common Fund)

Purpose

With this Request for Information (RFI) Notice, the NIH invites comments and ideas from interested persons to inform the consideration of an NIH Standards Information Resource (NSIR) that would collect, organize, and make available to the public trusted, systematically organized, and curated information about data-related standards. This resource would focus on those standards that are widely used in biomedical research and related activities. The main purpose of the NSIR would be to help a variety of biomedical users such as researchers, clinicians, data curators, and informaticians, among others, identify and choose data-related standards that are best suited to their needs. The NSIR is a potential initiative of the NIH Big Data to Knowledge (BD2K) program as part of efforts to facilitate the broad use of biomedical research data. It is envisioned that such data standards themselves may become specific, citable digital objects within the digital research ecosystem, or Commons, as envisioned by the BD2K (see http://www.open-bio.org/w/images/d/dd/Bosc2014-bioinform.pdf ).

Background

On September 25 and 26, 2013, an NIH BD2K-sponsored workshop was held at NIH on Frameworks for Community Based Standards Efforts. One of the themes that emerged from the workshop was that investigators or other users have to choose among the wide variety of sometimes overlapping standards. They require information about these data-related standards in order to choose those best suited to their research. Such information would be most useful if it were easy to find, systematically organized, and consistently presented in a curated, trusted, and publicly available resource.

Modern biomedical research generates huge volumes of heterogeneous data through a worldwide network of organizations. Extracting, storing, analyzing, sharing, harmonizing, and integrating the data are critical but present many challenges. The use of data-related standards is the first step in ensuring that data can be shared and understood. Standards applied to data facilitate the flow of data between various data resources. They provide a well-defined syntax along with semantics and definitions for methods, protocols, terminologies, common data elements, and specifications for the collection, exchange, storage, and retrieval of information associated with these data. The selection of a particular data-related standard determines whether and how data may be shared, integrated, and/or transformed into another format for use with other specific data, tools, and resources. In the past, standards for biomedical data were often selected without consideration of issues beyond those related to interacting with nearby systems and/or storing for legacy purposes. Now, data frequently need to be shared much more widely or reused for other purposes including data integration. As more biomedical research data become available in digital form and as the value of interconnecting heterogeneous data, tools, and resources becomes more integral to the science itself, investigators' choices of data-related standards will need to be more deliberate so as to best serve the needs of their research and potential secondary users.

Towards this goal, the NIH Big Data to Knowledge (BD2K) initiative is considering creating a publicly available web portal-based NIH Standards Information Resource (NSIR) about data-related standards that are widely used in biomedical research and related activities. This information resource would help investigators identify and choose data-related standards that are best suited to their needs. It would also connect to and interact with other NIH-supported information resources, such as a planned NIH BD2K Data Discovery Index, a software index under consideration, planned BD2K Centers of Excellence, the NIH RePORTER extramural funding information query tool, and the biomedical literature (PubMed). The standards documented in this resource would be included as citable digital objects within the digital research ecosystem, or Commons, as envisioned by the BD2K (see http://www.open-bio.org/w/images/d/dd/Bosc2014-bioinform.pdf ).

Information Requested

The NIH BD2K seeks input in two broad areas: (1) the useful and usable content of an NSIR (NSIR Content); and (2) the relevant existing efforts that could inform the development of and/or enhance (e.g., by synergizing with) an NSIR (Current Relevant Efforts). This information would help maximize the impact of such a community resource and facilitate its use by investigators with a broad range of expertise.

(1) NSIR Content - Metadata about standards
Please comment on the metadata about standards that would provide the most benefit to end-users within an NSIR and why, and on information that may be more effectively provided through links to other resources, e.g., detailed documentation, articles describing use of the standard, etc. Items of interest may include but are not limited to:

  • Purpose for which standard is used;
  • Examples of other standards, data sets, tools, and data resources with which the standard interoperates;
  • Whether the standard is endorsed, encouraged, or used by particular organizations, initiatives or projects;
  • Whether the standard is open or proprietary;
  • How to access the standard;
  • Published or available format(s);
  • Provenance and/or point of contact for the standard;
  • When the standard was last updated, and its version identifier;
  • Ease of use and support available to those who want to adopt it;
  • Location of documentation about the standard;
  • How this metadata about standards might be used in the context of the planned Research Commons referenced in the Purpose and Background above.
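
As an illustration only, the items listed above could be captured as a structured record per standard. The following is a minimal sketch in Python under stated assumptions: the class name, the field names, the helper function, and the example.org URL are all hypothetical and are not part of this Notice or of any planned NSIR design.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class StandardRecord:
    """Hypothetical metadata record for one data-related standard.

    Each field mirrors one item from the list above; none of these
    names is defined by the Notice.
    """
    name: str
    purpose: str                       # purpose for which the standard is used
    is_open: bool                      # True if open, False if proprietary
    access_url: str                    # how to access the standard
    version: str                       # version identifier
    last_updated: str                  # date of last update, e.g. "2014-01-01"
    formats: List[str] = field(default_factory=list)             # published formats
    interoperates_with: List[str] = field(default_factory=list)  # standards, tools, resources
    endorsed_by: List[str] = field(default_factory=list)         # organizations, projects
    contact: Optional[str] = None             # provenance or point of contact
    documentation_url: Optional[str] = None   # location of documentation
    support_notes: Optional[str] = None       # ease of use and available support


def missing_items(record: StandardRecord) -> List[str]:
    """Return the names of fields that are still empty in a record."""
    return [
        key
        for key, value in asdict(record).items()
        if value is None or value == "" or value == []
    ]


# A placeholder record with only the required fields filled in.
example = StandardRecord(
    name="Example Standard",
    purpose="Exchange of imaging data",
    is_open=True,
    access_url="https://example.org/standard",
    version="1.0",
    last_updated="2014-01-01",
)
print(missing_items(example))
```

A record of this kind makes it easy to see which items have been supplied for a given standard and which would instead need to be provided through links to other resources.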

Ideally the NSIR would contain a variety of widely used data-related standards, such as imaging or genomics standards, applicable to many areas of research. Comments on the standards and/or types of standards that you consider most critical to include (and why they are most critical) are invited. If applicable, comments about how these standards and/or types of standards have been produced, validated, and utilized in respondents' and/or collaborators' work are also welcome. This resource would contain information on widely used data-related standards, but wide use can have different meanings, e.g., broad use across many domains or intensive use in a single or small number of domains. Comments on the criteria that might be used to determine whether a data-related standard is "widely used" are invited.

(2) Current Relevant Efforts
Comments can include existing relevant resources about data standards and how they are currently useful to end users with respect to the goals stated above, as well as their limitations. In addition to mentioning organization(s) leading and participating in the resources or related efforts, respondents are encouraged to include any lessons learned from these efforts by the creators or users. Include links or references, if possible.

Submitting a Response

All responses must be submitted electronically to [email protected] by September 30, 2014. Please include the Notice number NOT-CA-14-054 in the subject line. Responses to this RFI Notice are voluntary. Responders are free to address any or all of the categories listed above. The submitted information will be reviewed by the NIH staff. Submitted information will not be considered confidential.

To facilitate analysis of the responses, please indicate at the beginning of the response which of the following viewpoints you are representing, as well as other relevant information such as your organization, scientific or clinical domain, and your role with respect to creation or use of data-related standards:

  • End-user biomedical researcher;
  • Library or information scientist;
  • Bioinformatician;
  • Data curator;
  • Research database manager;
  • Standard developer or maintainer;
  • Publisher;
  • Biotech company employee;
  • Tool or services vendor;
  • Other.

This request is for information and planning purposes only and should not be construed as a solicitation or as an obligation on the part of the Federal Government or the NIH. The NIH does not intend to make any awards based on responses to this RFI Notice or to otherwise pay for the preparation of any information submitted or for the Government's use of such information. The NIH will use the information submitted in response to this RFI Notice at its discretion and will not provide comments to any responder's submission. The information provided will be analyzed and may appear in reports. Respondents are advised that the Government is under no obligation to acknowledge receipt of the information received or provide feedback to respondents with respect to any information submitted. No proprietary, classified, confidential, and/or sensitive information should be included in a response. The NIH and the Government reserve the right to use any non-proprietary technical information in any future solicitation(s).

Inquiries

Please direct all inquiries to:

Sherri de Coronado
National Cancer Institute (NCI)
Telephone: 301-480-7172
Email: [email protected]