Notice of Special Interest (NOSI): Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data
Notice Number:

Key Dates

Release Date:

March 6, 2023

First Available Due Date:
May 16, 2023
Expiration Date:
May 17, 2023

Related Announcements

PA-20-272 - Administrative Supplements to Existing NIH Grants and Cooperative Agreements (Parent Admin Supp Clinical Trial Optional)

Reissue of NOT-OD-22-067: Notice of Special Interest (NOSI): Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data

Issued by

Office of Data Science Strategy (ODSS)

National Eye Institute (NEI)

National Heart, Lung, and Blood Institute (NHLBI)

National Human Genome Research Institute (NHGRI)

National Institute of Allergy and Infectious Diseases (NIAID)

National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)

National Institute of Biomedical Imaging and Bioengineering (NIBIB)

Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)

National Institute on Deafness and Other Communication Disorders (NIDCD)

National Institute of Dental and Craniofacial Research (NIDCR)

National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)

National Institute on Drug Abuse (NIDA)

National Institute of General Medical Sciences (NIGMS)

National Institute of Mental Health (NIMH)

National Institute of Neurological Disorders and Stroke (NINDS)

National Institute of Nursing Research (NINR)

National Institute on Minority Health and Health Disparities (NIMHD)

National Library of Medicine (NLM)

Fogarty International Center (FIC)

National Center for Complementary and Integrative Health (NCCIH)

Office of Strategic Coordination (Common Fund)

National Cancer Institute (NCI)

All applications to this funding opportunity announcement should fall within the mission of the Institutes/Centers. The following NIH Offices may co-fund applications assigned to those Institutes/Centers.

Division of Program Coordination, Planning and Strategic Initiatives, Office of Research Infrastructure Programs (ORIP)


This Notice announces the availability of supplements to active grants which are intended to support collaborations that bring together expertise in biomedicine, data management, and artificial intelligence and machine learning (AI/ML) to make NIH-supported data useful and usable for AI/ML analytics. This initiative is aligned with the NIH Strategic Plan for Data Science, which describes actions aimed at modernizing the biomedical research data ecosystem and making data FAIR (Findable, Accessible, Interoperable, and Reusable) with high impact for open science. For the purposes of this Notice, AI/ML is inclusive of machine learning (ML), deep learning (DL), and neural networks (NN).


Artificial intelligence and machine learning (AI/ML) are a collection of data-driven technologies with the potential to significantly advance biomedical research. NIH makes a wealth of biomedical data available and reusable to research communities; however, not all of these data can be used efficiently and effectively by AI/ML applications. The goal of this Notice is to make the data generated through NIH-funded research AI/ML-ready and shared through repositories, knowledgebases, or other data sharing resources.

Making data AI/ML-ready is not simply formulaic. It requires engagement with and feedback from AI/ML applications. Furthermore, feedback from AI/ML applications can improve the understanding of the data to improve future re-use.

Some aspects of AI/ML-readiness are better understood than others. For example, data to be analyzed by AI/ML tools, such as PyTorch and TensorFlow, which are used to build and deploy AI/ML applications, must conform to specific data formats. The FAIR principles, through the use of data and metadata standards (ontologies, taxonomies, terminologies), facilitate combining data from different sources to support biomedical AI/ML applications.
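As an illustration of how shared terminologies help combine data from different sources, the following minimal Python sketch maps free-text labels from two sources onto one controlled vocabulary. The vocabulary, its "EX:" identifiers, and the function name are hypothetical placeholders, not terms from any real ontology.

```python
from typing import Dict, List, Optional

# Hypothetical controlled vocabulary: the "EX:" identifiers are placeholders,
# not codes from any real ontology or terminology.
SYNONYM_TO_TERM: Dict[str, str] = {
    "heart attack": "EX:0001",
    "myocardial infarction": "EX:0001",
    "high blood pressure": "EX:0002",
    "hypertension": "EX:0002",
}


def harmonize(label: str) -> Optional[str]:
    """Map a free-text label to a vocabulary term, or None if it is unmapped."""
    return SYNONYM_TO_TERM.get(label.strip().lower())


# Two sources that describe the same conditions with different wording.
source_a: List[str] = ["Heart attack", "Hypertension"]
source_b: List[str] = ["myocardial infarction ", "unknown condition"]

# After harmonization, equivalent labels share one identifier; unmapped labels
# are kept as None so they can be reviewed rather than silently dropped.
combined = [harmonize(label) for label in source_a + source_b]
```

Keeping unmapped labels visible is a design choice in this sketch: it makes gaps in the vocabulary explicit for later curation.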

Some other aspects of what is needed to make data AI/ML ready must be discovered through iterative and exploratory testing. These might include how to best represent information for a particular AI/ML use-case, how to correct for noise, and what level of specificity or uncertainty of labels is tolerable for a desired AI/ML application.

For many AI/ML applications, the training dataset must be sufficiently large to be considered AI/ML ready. Thus, readying these data for computation necessitates knowledge of big data management practices, for example, how best to prepare data to be partitioned to enable computational feasibility.
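One simple form of partitioning is splitting records into fixed-size shards that can be processed independently. The sketch below is a minimal Python illustration under that assumption; the function name and the example record identifiers are hypothetical, and real projects may partition by subject, site, or modality instead.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def partition(records: Sequence[T], shard_size: int) -> Iterator[List[T]]:
    """Yield consecutive shards containing at most shard_size records each."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    for start in range(0, len(records), shard_size):
        yield list(records[start:start + shard_size])


# Hypothetical example: ten record identifiers split into shards of four.
record_ids = list(range(10))
shards = list(partition(record_ids, shard_size=4))
```

Every record lands in exactly one shard, so per-shard results can be combined without double counting.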

Decentralized machine learning (sometimes referred to as federated or distributed learning) is a paradigm of machine learning where a model is trained iteratively on data in multiple locations. This paradigm can facilitate the use of data that, for privacy or other reasons, cannot be aggregated or moved. Preparing data for decentralized ML requires harmonization and testing as well as capabilities for standardized access and, possibly, enhanced data and model governance to protect privacy.
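The iterative, multi-location training described above can be sketched with a federated-averaging-style loop. This is a minimal Python illustration, not a prescribed method: the one-parameter model y = weight * x, the learning rate, and the two example sites are assumptions chosen to keep the example small.

```python
from typing import List, Tuple

Site = List[Tuple[float, float]]  # (x, y) pairs held at one location


def local_update(weight: float, data: Site, lr: float = 0.1) -> float:
    """One gradient-descent step on mean squared error for y = weight * x."""
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad


def federated_round(global_weight: float, sites: List[Site], lr: float = 0.1) -> float:
    """Each site updates the model locally; only the updated weights are
    returned and averaged, weighted by the number of records at each site."""
    total = sum(len(data) for data in sites)
    return sum(len(data) * local_update(global_weight, data, lr) for data in sites) / total


# Hypothetical sites whose data follow y = 2x; raw records never leave a site.
sites: List[Site] = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]

weight = 0.0
for _ in range(50):
    weight = federated_round(weight, sites)
```

In this toy setting the shared weight approaches 2.0 even though no site ever sees another site's records. The sketch also assumes the sites already share a harmonized data format, which is the preparation step the paragraph above emphasizes.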

Furthermore, there are increasing expectations that AI/ML ready data be accompanied by documentation to include information about data provenance and bias to help researchers make more informed and ethical decisions about the selection of data and application of AI/ML-models. For example, imbalanced datasets can result in AI/ML algorithms that lead to biased clinical decisions and, potentially, a misalignment with NIH goals to improve minority health and reduce health disparities for marginalized populations. AI/ML-readiness should be guided by a concern for human and clinical impact and therefore requires attention to ethical, legal, and social implications of AI/ML including but not limited to (1) biases in datasets, algorithms, and applications; (2) issues related to identifiability and privacy; (3) impacts on disadvantaged or marginalized groups; (4) health disparities; and (5) unintended, adverse social, individual, and community consequences of research and development.
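A first step toward documenting imbalance is simply measuring how labels or groups are distributed in a dataset. The following Python sketch shows one way to do so; the group names and the 90/10 split are hypothetical, and the ratio used here is only one of several possible imbalance measures.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def label_proportions(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the fraction of records carrying each label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels: Sequence[Hashable]) -> float:
    """Return the size of the largest group divided by the smallest group."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical dataset in which one group is heavily underrepresented.
labels = ["group_a"] * 90 + ["group_b"] * 10
proportions = label_proportions(labels)
ratio = imbalance_ratio(labels)
```

Reporting such figures alongside the data lets re-users judge whether a dataset suits their intended application.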

It is the NIH vision to establish a modernized and integrated biomedical data ecosystem that adopts the latest data science technologies, including AI and ML, and best practice guidelines arising from community consensus, such as the FAIR principles, and open-source development. This effort is described in the NIH Data Science Strategic Plan and led by the NIH Office of Data Science Strategy (ODSS).

Research Objective

This opportunity is intended to support collaborations that bring together expertise in biomedicine, data management, and AI/ML to improve the AI/ML-readiness of data generated from NIH-funded research and shared through repositories, knowledgebases or other data sharing resources.

Applications submitted in response to this NOSI are strongly encouraged to include the following information:

  • Reference(s) to the data under consideration and reasons for this choice.
  • Description of the potential impact of scientific advances that could be made from AI/ML applications developed with the data.
  • Description of the challenges to be addressed and why the data are not currently AI/ML-ready.
  • Description of the proposed method for improving the data AI/ML-readiness.
  • Description of how the data will be made available to AI/ML applications and researchers, for example, through NIH repositories, NIH knowledgebases, or other data sharing resources including those appropriate for controlled access data.
  • Proposal to demonstrate the use of the transformed data in an AI/ML application.
  • Proposed timeline of activities and milestones for the 12-month supplementary funding period.
  • Description of the relevant expertise of the supported collaboration.
  • Description of how the ethical implications of data will be identified and addressed, including plans to develop and share documentation or datasheets that describe the motivation, composition, collection process and pre-processing, anticipated use cases, and other information relevant for ethical reuse.
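The datasheet elements named in the last item above could be captured in a machine-readable form. The Python sketch below is one hypothetical structure, not an NIH-specified format: the class name, field names, and placeholder text are assumptions derived from the elements listed in this Notice.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Datasheet:
    """Hypothetical datasheet record; fields mirror the elements listed above."""
    motivation: str
    composition: str
    collection_process: str
    preprocessing: str
    anticipated_use_cases: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Placeholder content only; a real datasheet would describe an actual dataset.
sheet = Datasheet(
    motivation="Placeholder: why the dataset was created.",
    composition="Placeholder: what the records represent.",
    collection_process="Placeholder: how the data were collected.",
    preprocessing="",
)

# Serialize for sharing alongside the data in a repository.
datasheet_json = json.dumps(asdict(sheet), indent=2)
```

Here `sheet.missing_fields()` flags the pre-processing description and the use cases as incomplete, which can serve as a simple completeness check before the data are shared.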

NIH is particularly interested in proposals that will advance the ethical development of AI-ready data, and transparent practices that enhance the ethical re-use of data for AI/ML applications.

These supplements may be used to support a variety of activities including, but not limited to, the following:

  • Identifying existing shortfalls in AI/ML-readiness and informing the preparation of data for AI/ML through, for example, AI/ML hackathons, mini AI/ML applications, citizen science challenges, or other engagements with the AI/ML community to better understand current gaps in AI/ML-readiness.
  • Activities for making data AI/ML-ready that are responsive to the gaps identified. These may include, for example, cleaning or filtering data; imputing missing metadata; data pre-processing; finding data representations to improve the computational efficiency of machine learning; removing spurious artifacts, for example from heterogeneous data sources, that affect learning or inference; data cleaning, wrangling, or filtering to provide a benchmark version of the data; adoption of ontologies or other standards to improve interoperability with other data; removing or characterizing biases and structures that may affect any AI/ML model trained on the data.
  • Discovering and identifying imbalances in the data, biases in data labels or metadata, or other attributes of the dataset that would help researchers make better, more ethical decisions when using the data for AI/ML.
  • Addressing specific challenges related to harmonizing distributed/federated data for distributed/federated learning.
  • Developing and sharing documentation, e.g. datasheets, that document the provenance, motivation, composition, collection process, recommended uses, and other relevant information for AI/ML re-users of the data, including feedback from AI/ML applications already using the data.
  • Preparation of social determinants of health (SDOH) information for use in AI/ML applications.
  • Preparing data for multi-modal multi-scale AI/ML applications.

These efforts are expected to be informed by best practices in data management and engagement with the AI/ML community.

Significant skills in data management and AI/ML are expected to be needed to identify and address gaps in AI-readiness. Thus, supplements are primarily intended to provide support for data management and AI/ML collaborators, engagement events such as hackathons, and computing and storage costs required to improve the AI-readiness of data.

The scope of each proposed project is defined by and limited to the aims of the funded project for which the supplement is being sought.

Applicants partnering with industry to test novel methods or infrastructures may be considered. The integration of causal models and causal inference in AI/ML is within scope.

A broad range of projects involving the management of data repositories or other shared data resources are eligible regardless of the scientific area of emphasis. Both open and controlled access data, including clinical data, are within scope.

Awardees should be willing to participate in virtual meetings organized by NIH.

Applications that are not appropriate and out of scope for this NOSI include:

  • Projects with no engagement with the AI/ML community, or no AI/ML expertise in the proposed collaboration.
  • Projects that do not intend to make data generated through NIH-supported research AI/ML-ready.
  • Proposals to provide supplemental funding to an award that received supplemental funding under NOT-OD-21-094 or NOT-OD-22-067 (Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data).
  • Proposals that do not explicitly meet all the requirements stated elsewhere in this NOSI.
  • Proposals that are out of scope of the parent award.
  • Proposals that do not intend to broadly share AI/ML-ready data by the end of the supplemental award period. Both open and controlled access data should be broadly shared, for example, through an NIH-supported repository, NIH-supported knowledge base, or other data sharing resource.
  • Proposals focused on the development and application of AI/ML algorithms that do not intend to make data AI/ML ready.

ICO Specific Considerations

Office of Strategic Coordination

The NIH Office of Strategic Coordination (Common Fund) supports multiple transformative research programs that generate new technologies, methods, and data. Many of these programs produced rich public data sets containing multi-dimensional molecular and phenotypic data from humans and model organisms. Established Common Fund data sets listed below are well-poised for increased community use:

OSC is interested in proposals that substantially leverage at least one of the above datasets. Substantial leverage is defined as use and citation of the dataset(s) in the envisioned research products of the proposed work (manuscripts, presentations, book chapters, portals, etc.).

Application and Submission Information


To be eligible, the parent award must be able to receive funds in FY2023 (Oct. 1, 2022 - Sept. 30, 2023). Applicants are strongly encouraged to contact the program officer of the parent award to confirm eligibility.

Funds must be used to meet increased costs that are within the scope of the approved award, but were unforeseen when the new or renewal application or grant progress report for non-competing continuation support was submitted.

Funds can be used to cover cost increases that are associated with achieving certain new research objectives, as long as the research objectives are within the original scope of the peer reviewed and approved project, or the cost increases are for unanticipated expenses within the original scope of the project.

One-time supplement budget requests cannot exceed $200,000 direct costs. The number of awards will be contingent on availability of funds and receipt of meritorious applications.

Eligible Activity Codes:

Additional funds may be awarded as supplements to parent awards using any Activity Code that is listed in PA-20-272 with the following exceptions: Small business activity codes (such as R41, R42, R43, R44, U43, U44, and Fast Track) are excluded, as well as S10, P60, R13, and U13 awards.

Note that not all participating NIH Institutes and Centers (ICs) support all the activity codes that may otherwise be allowed. Applicants are therefore strongly encouraged to consult the program officer of the parent grant to confirm eligibility.

For awards that are already primarily funded to deliver reusable data to the community, applicants should provide strong justification for why additional funds are needed to support AI/ML-readiness given that these activities could have been supported through the parent award. This award cannot be used to supplement the NIH Data Management and Sharing (DMS) costs in the parent award.

Additional Information

Applications for this initiative must be submitted using PA-20-272 - Administrative Supplements to Existing NIH Grants and Cooperative Agreements (Parent Admin Supp Clinical Trial Optional) or its subsequent reissued equivalent.

All instructions in the SF424 (R&R) Application Guide and PA-20-272 must be followed, with the following additions:

  • Application Due Date(s): May 16, 2023 by 5:00 PM local time of applicant organization.
  • For funding consideration, applicants must include "NOT-OD-23-082" (without quotation marks) in the Agency Routing Identifier field (box 4B) of the SF424 R&R form. Applications without this information in box 4B will not be considered for this initiative.
  • Requests may be for one year of support only.
  • The proposed project period cannot exceed the project period of the parent award.
  • The Project Summary should briefly summarize the parent grant and describe the goals of the supplement project.
  • The Research Strategy section of the application should be limited to 5 pages.

Administrative Evaluation Process

Submitted applications must follow the guidelines of the IC that funds the parent grant. Administrative Supplements do not receive peer review. Each IC will conduct administrative reviews of applications submitted to their IC separately. The most meritorious applications will be evaluated by a trans-NIH panel of NIH staff and supported based upon availability of funds. The criteria described below will be considered in the administrative evaluation process:

  1. What is the potential impact of the proposed work on the NIH mission? Does the potential of the AI/ML-ready data justify this additional support?
  2. Is the proposed project technically feasible within the supplement's funding period? Why or why not?
  3. Is the collective expertise of the proposed team adequate to achieve the proposed goals? Why or why not?
  4. Are the proposed timelines and milestones adequate and realistic?
  5. To what extent is the project likely to improve the utility of the data for AI/ML applications?
  6. Are the proposed activities appropriately guided by a concern for human and clinical impact as reflected by attention paid to the ethical, legal, and social implications of data collection, use, sharing, or application of AI/ML and NIH goals to improve minority health, and reduce health disparities for marginalized populations?
  7. To what extent will documentation or datasheets be shared that describe the motivation, composition, collection process and pre-processing, anticipated use cases, and other information relevant for ethical reuse?
  8. Are the proposed budget and staffing levels adequate to carry out the proposed work? Is the budget reasonable and appropriate for the proposed scope of work?

Other Information:

It is strongly recommended that the applicants contact their respective program officers at the Institute supporting the parent award in advance to:

  • Confirm that the supplement falls within scope of the parent award.
  • Request the requirements of the IC for submitting applications for administrative supplements.

Investigators planning to submit an application in response to this NOSI are also strongly encouraged to contact and discuss their proposed research/aims with the scientific contact listed on this NOSI in advance of the application receipt date.

Following submission, applicants are strongly encouraged to notify the program contact at the IC supporting the parent award that a request has been submitted in response to this NOSI in order to facilitate efficient processing of the request.

NIH is particularly interested in applications that enhance the participation of individuals from groups that are underrepresented in the biomedical, clinical, behavioral, and social sciences. (See NIH’s Interest in Diversity NOT-OD-20-031.)


For further information, please consult our Frequently Asked Questions page.

Please direct all inquiries to:

Fenglou Mao, PhD
Office of Data Science Strategy (ODSS)
Division of Program Coordination, Planning, and Strategic Initiatives
Office of the Director