Expired NOT-OD-22-067: Notice of Special Interest (NOSI): Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data

Notice Number: NOT-OD-22-067

Key Dates

Release Date: February 4, 2022

First Available Due Date: March 17, 2022

Expiration Date: March 18, 2022

Related Announcements

PA-20-272 - Administrative Supplements to Existing NIH Grants and Cooperative Agreements (Parent Administrative Supplement Clinical Trial Optional)

NOT-OD-21-094 - Notice of Special Interest (NOSI): Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data

Issued by

Office of the Director, National Institutes of Health (OD)

National Eye Institute (NEI)

National Heart, Lung, and Blood Institute (NHLBI)

National Human Genome Research Institute (NHGRI)

National Institute on Aging (NIA)

National Institute of Allergy and Infectious Diseases (NIAID)

National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)

National Institute of Biomedical Imaging and Bioengineering (NIBIB)

Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)

National Institute on Deafness and Other Communication Disorders (NIDCD)

National Institute of Dental and Craniofacial Research (NIDCR)

National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)

National Institute on Drug Abuse (NIDA)

National Institute of Environmental Health Sciences (NIEHS)

National Institute of General Medical Sciences (NIGMS)

National Institute of Mental Health (NIMH)

National Institute of Neurological Disorders and Stroke (NINDS)

National Institute of Nursing Research (NINR)

National Institute on Minority Health and Health Disparities (NIMHD)

National Library of Medicine (NLM)

Fogarty International Center (FIC)

National Center for Complementary and Integrative Health (NCCIH)

National Center for Advancing Translational Sciences (NCATS)

National Cancer Institute (NCI)

Purpose

This Notice announces the availability of supplements to active grants. The supplements are intended to support collaborations that bring together expertise in biomedicine, data management, and artificial intelligence and machine learning (AI/ML) to make NIH-supported data useful and usable for AI/ML analytics. This initiative is aligned with the NIH Strategic Plan for Data Science, which describes actions aimed at modernizing the biomedical research data ecosystem and making data FAIR (Findable, Accessible, Interoperable, and Reusable) with high impact for open science. For the purposes of this Notice, AI/ML is inclusive of machine learning (ML), deep learning (DL), and neural networks (NN).

Background

Artificial intelligence and machine learning (AI/ML) are a collection of data-driven technologies with the potential to significantly advance biomedical research. NIH makes a wealth of biomedical data available and reusable to research communities; however, not all of these data can be used efficiently and effectively by AI/ML applications. The goal of this Notice is to make the data generated through NIH-funded research AI/ML-ready and shared through repositories, knowledgebases, or other data sharing resources.

Making data AI/ML-ready is not simply formulaic. It requires engagement with and feedback from AI/ML applications. Furthermore, feedback from AI/ML applications can improve understanding of the data and so support future re-use.

Some aspects of AI/ML-readiness are better understood than others. For example, data to be analyzed by AI/ML tools, such as PyTorch and TensorFlow, which are used to build and deploy AI/ML applications, must conform to specific data formats. The FAIR principles, through the use of data and metadata standards (ontologies, taxonomies, terminologies), facilitate combining data from different sources to support biomedical AI/ML applications.

Some other aspects of what is needed to make data AI/ML ready must be discovered through iterative and exploratory testing. These might include how to best represent information for a particular AI/ML use-case, how to correct for noise, and what level of specificity or uncertainty of labels is tolerable for a desired AI/ML application.

For many AI/ML applications, a dataset must be sufficiently large to be considered AI/ML ready. Thus, readying these data for computation necessitates knowledge of big data management practices, for example how best to prepare data to be partitioned to enable computational feasibility.
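As one illustration of preparing data to be partitioned, the sketch below assigns records to a fixed number of partitions by hashing a record identifier, so that the same record always lands in the same partition. It is a minimal sketch under stated assumptions, not a method prescribed by this Notice: the `record_id` field, the partition count, and the function names are hypothetical.

```python
import hashlib


def assign_partition(record_id: str, num_partitions: int) -> int:
    """Map a record identifier to a partition index in a stable, reproducible way."""
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def partition_records(records, num_partitions, id_key="record_id"):
    """Group records (dicts) into lists keyed by partition index."""
    partitions = {index: [] for index in range(num_partitions)}
    for record in records:
        index = assign_partition(str(record[id_key]), num_partitions)
        partitions[index].append(record)
    return partitions


# Hypothetical records; a real dataset would be streamed from files or a repository.
example_records = [{"record_id": f"sample-{n}", "value": n} for n in range(1000)]
example_partitions = partition_records(example_records, num_partitions=8)
```

Hashing on a stable identifier, rather than on row order, keeps partition membership reproducible when the dataset is re-exported or extended.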

Furthermore, there are increasing expectations that AI/ML-ready data be accompanied by documentation that includes information about data provenance and bias, to help researchers make more informed and ethical decisions about the selection of data and application of AI/ML models. For example, imbalanced datasets can result in AI/ML algorithms that lead to biased clinical decisions and, potentially, a misalignment with NIH goals to improve minority health and reduce health disparities for marginalized populations.

AI/ML-readiness should be guided by a concern for human and clinical impact and therefore requires attention to ethical, legal, and social implications of AI/ML including but not limited to (1) biases in datasets, algorithms, and applications; (2) issues related to identifiability and privacy; (3) impacts on disadvantaged or marginalized groups; (4) health disparities; and (5) unintended, adverse social, individual, and community consequences of research and development.
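Dataset imbalance of the kind mentioned above can be given a first, rough characterization by tabulating how often each label occurs. The following is a minimal sketch, assuming labels are available as a simple list; the label values, function names, and the choice of a max-to-min ratio as the summary statistic are illustrative assumptions rather than anything specified in this Notice.

```python
from collections import Counter


def label_distribution(labels):
    """Return the fraction of examples carrying each label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels):
    """Ratio of the most common label count to the least common label count."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical labels for illustration only.
example_labels = ["control"] * 90 + ["case"] * 10
example_distribution = label_distribution(example_labels)
example_ratio = imbalance_ratio(example_labels)
```

A summary like this only counts the labels that are present; it does not detect groups that are missing from the data altogether, which would need to be documented separately.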

It is the NIH vision to establish a modernized and integrated biomedical data ecosystem that adopts the latest data science technologies, including AI and ML, and best practice guidelines arising from community consensus, such as the FAIR principles, and open-source development. This effort is described in the NIH Strategic Plan for Data Science and is led by the NIH Office of Data Science Strategy (ODSS).

Research Objective

This opportunity is intended to support collaborations that bring together expertise in biomedicine, data management, and AI/ML to improve the AI/ML-readiness of data generated from NIH-funded research and shared through repositories, knowledgebases or other data sharing resources.

Applications submitted in response to this NOSI are strongly encouraged to include the following information:

- Reference(s) to the data under consideration and reasons for this choice.
- Description of the potential impact of scientific advances that could be made from AI/ML applications developed with the data.
- Description of the challenges to be addressed and why the data are not currently AI/ML-ready.
- Description of the proposed method for improving the data AI/ML-readiness.
- Description of how the data will be made available to AI/ML applications and researchers, for example, through NIH repositories, NIH knowledgebases, or other data sharing resources including those appropriate for controlled access data.
- Proposal to demonstrate the use of the transformed data in an AI/ML application.
- Proposed timeline of activities and milestones for the 12-month supplementary funding period.
- Description of the relevant expertise of the supported collaboration.
- Description of how the ethical implications of data will be identified and addressed.

These supplements may be used to support a variety of activities including, but not limited to, the following:

- Identifying existing shortfalls in AI/ML-readiness and informing the preparation of data for AI/ML through, for example, AI/ML hackathons, mini AI/ML applications, citizen science challenges, or other engagements with the AI/ML community to better understand current gaps in AI/ML-readiness.
- Activities for making data AI/ML-ready that are responsive to the gaps identified. These may include, for example, cleaning or filtering data; imputing missing metadata; data pre-processing; finding data representations to improve the computational efficiency of machine learning; removing spurious artifacts, for example from heterogeneous data sources, that affect learning or inference; data cleaning, wrangling, or filtering to provide a benchmark version of the data; adoption of ontologies or other standards to improve interoperability with other data; removing or characterizing biases and structures that may affect any AI/ML model trained on the data.
- Discovering and identifying imbalances in the data, biases in data labels or metadata, or other attributes of the dataset that would help researchers make better, more ethical decisions when using the data for AI/ML.
- Addressing specific challenges related to harmonizing distributed/federated data for distributed/federated learning.
- Developing and sharing documentation, e.g. datasheets, that document the provenance, motivation, composition, collection process, recommended uses, and other relevant information for AI/ML re-users of the data, including feedback from AI/ML applications already using the data.
- Preparation of social determinants of health (SDOH) information for use in AI/ML applications.
- Preparing data for multi-modal multi-scale AI/ML applications.
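The datasheet documentation mentioned in the list above could, as one possibility, be kept in a machine-readable form alongside the data. The sketch below is an assumption-laden illustration: the field names mirror the items named in this Notice (provenance, motivation, composition, collection process, recommended uses), while the `Datasheet` class, the `known_biases` field, and all example values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Datasheet:
    """Minimal datasheet record; field names follow the items listed in the Notice."""

    dataset_name: str
    provenance: str
    motivation: str
    composition: str
    collection_process: str
    recommended_uses: list = field(default_factory=list)
    known_biases: list = field(default_factory=list)  # assumption: extra, optional field

    def to_json(self) -> str:
        """Serialize the datasheet to JSON with stable key order."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# All values below are hypothetical placeholders.
example_sheet = Datasheet(
    dataset_name="example-cohort-v1",
    provenance="Derived from a hypothetical NIH-funded study",
    motivation="Enable benchmarking of classification models",
    composition="1,000 de-identified records with 12 variables each",
    collection_process="Collected at enrollment via structured forms",
    recommended_uses=["model benchmarking"],
    known_biases=["under-representation of some age groups"],
)
example_json = example_sheet.to_json()
```

Storing the datasheet as structured text lets it be versioned and shared through the same repository as the data it describes.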

These efforts are expected to be informed by best practices in data management and engagement with the AI/ML community.

Significant skills in data management and AI/ML are expected to be needed to identify and address gaps in AI-readiness. Thus, supplements are primarily intended to provide support for data management and AI/ML collaborators, engagement events such as hackathons, and computing and storage costs required to improve the AI-readiness of data.

The scope of each proposed project is defined by and limited to the aims of the funded project for which the supplement is being sought.

Applicants partnering with industry to test novel methods or infrastructures may be considered. The integration of causal models and causal inference in AI/ML is within scope.

A broad range of projects involving the management of data repositories, or other shared data resources are eligible regardless of the scientific area of emphasis. Both open and controlled access data, including clinical data, are within scope.

Awardees should be willing to participate in virtual meetings organized by NIH.

Applications that are not appropriate and out of scope for this NOSI include:

- Projects with no engagement with the AI/ML community, or no AI/ML expertise in the proposed collaboration.
- Projects that do not intend to make data generated through NIH-supported research AI/ML-ready.
- Proposals to provide supplemental funding to an award that received supplemental funding under NOT-OD-21-094 (Administrative Supplements to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data).
- Proposals that do not explicitly meet all the requirements stated elsewhere in this NOSI.
- Proposals that are out of scope of the parent award.
- Proposals that do not intend to broadly share AI/ML-ready data by the end of the supplemental award period. Both open and controlled access data should be broadly shared, for example, through an NIH-supported repository, NIH-supported knowledgebase, or other data sharing resource.
- Proposals focused on the development and application of AI/ML algorithms that do not intend to make data AI/ML ready.

Application and Submission Information

Budget

To be eligible, the parent award must be able to receive funds in FY2022 (Oct. 1, 2021 - Sept. 30, 2022) and not be in the final year or in a no-cost extension period at the time of the award. The parent award must end on or after Sept. 30, 2023.

One-time supplement budget requests cannot exceed $200,000 in direct costs. The number of awards will be contingent on availability of funds and receipt of meritorious applications. It is currently anticipated that 30 awards will be made.

Eligible Activity Codes:

Additional funds may be awarded as supplements to parent awards using any Activity Code that is listed in PA-20-272 with the following exceptions.

Small business activity codes (such as R41, R42, R43, R44, U44, and Fast Track) are excluded, as well as G20, PS1, P60, R13, U13, U42, and UG1 awards.

Note that not all participating NIH Institutes and Centers (ICs) support all the activity codes that may otherwise be allowed. Applicants are therefore strongly encouraged to consult the program officer of the parent grant to confirm eligibility.

Centers and multi-project grant mechanisms are eligible but must provide a strong justification for why existing funds cannot be reallocated toward the proposed project.

For awards that are already primarily funded to deliver reusable data to the community, applicants should provide strong justification for why additional funds are needed to support AI/ML-readiness given that these activities could have been supported through the parent award.
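Some of the conditions stated in this section (the excluded activity codes, the $200,000 direct-cost cap, the Sept. 30, 2023 parent end date, and the no-cost extension restriction) can be pre-screened informally before contacting a program officer. The sketch below is a hypothetical helper, not an NIH tool. It does not check whether a code is listed in PA-20-272, whether the award is a Fast Track award or in its final year, or whether the funding IC supports the activity code, so a clean result does not establish eligibility.

```python
from datetime import date

# Activity codes named as excluded in this Notice (Fast Track is not a code and is not checked).
EXCLUDED_ACTIVITY_CODES = {
    "R41", "R42", "R43", "R44", "U44",
    "G20", "PS1", "P60", "R13", "U13", "U42", "UG1",
}
MAX_DIRECT_COSTS = 200_000
EARLIEST_PARENT_END = date(2023, 9, 30)


def precheck(activity_code, parent_end_date, requested_direct_costs,
             in_no_cost_extension=False):
    """Return a list of problems found; an empty list means none of these checks failed."""
    problems = []
    if activity_code.strip().upper() in EXCLUDED_ACTIVITY_CODES:
        problems.append("activity code is excluded by the Notice")
    if parent_end_date < EARLIEST_PARENT_END:
        problems.append("parent award ends before Sept. 30, 2023")
    if in_no_cost_extension:
        problems.append("parent award is in a no-cost extension period")
    if requested_direct_costs > MAX_DIRECT_COSTS:
        problems.append("requested direct costs exceed $200,000")
    return problems


# Hypothetical inputs for illustration only.
example_ok = precheck("R01", date(2024, 6, 30), 150_000)
example_bad = precheck("R43", date(2023, 6, 30), 250_000, in_no_cost_extension=True)
```

Returning a list of problems, rather than a single yes or no, makes it easier to see which condition to raise with the program officer of the parent grant.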

Additional Information

Applications for this initiative must be submitted using PA-20-272 - Administrative Supplements to Existing NIH Grants and Cooperative Agreements (Parent Admin Supp Clinical Trial Optional) or its subsequent reissued equivalent.

All instructions in the SF424 (R&R) Application Guide and PA-20-272 must be followed, with the following additions:

- Application Due Date(s) – March 17, 2022, by 5:00 PM local time of the applicant organization.
- For funding consideration, applicants must include "NOT-OD-22-067" (without quotation marks) in the Agency Routing Identifier field (box 4B) of the SF424 R&R form. Applications without this information in box 4B will not be considered for this initiative.
- Requests may be for one year of support only.
- The Project Summary should briefly summarize the parent grant and describe the goals of the supplement project.
- The Research Strategy section of the application should be limited to 5 pages.

Administrative Evaluation Process

Submitted applications must follow the guidelines of the IC that funds the parent grant. Administrative Supplements do not receive peer review. Each IC will conduct administrative reviews of applications submitted to their IC separately. The most meritorious applications will be evaluated by a trans-NIH panel of NIH staff and supported based upon availability of funds. The criteria described below will be considered in the administrative evaluation process:

- What is the potential impact of the proposed work on the NIH mission? Does the potential of the AI/ML-ready data justify this additional support?
- Is the proposed project technically feasible within the supplement's funding period? Why or why not?
- Is the collective expertise of the proposed team adequate to achieve the proposed goals? Why or why not?
- Are the proposed timelines and milestones adequate and realistic?
- To what extent is the project likely to improve the utility of the data for AI/ML applications?
- Are the proposed activities appropriately guided by a concern for human and clinical impact as reflected by attention paid to the ethical, legal, and social implications of data collection, use, sharing, or application of AI/ML and NIH goals to improve minority health, and reduce health disparities for marginalized populations?
- Are the proposed budget and staffing levels adequate to carry out the proposed work? Is the budget reasonable and appropriate for the proposed scope of work?

Other Information:

It is strongly recommended that the applicants contact their respective program officers at the Institute supporting the parent award in advance to:

- Confirm that the supplement falls within scope of the parent award;
- Request the requirements of the IC for submitting applications for administrative supplements.

Investigators planning to submit an application in response to this NOSI are also strongly encouraged to contact and discuss their proposed research/aims with the scientific contact listed on this NOSI in advance of the application receipt date.

Following submission, applicants are strongly encouraged to notify the program contact at the IC supporting the parent award that a request has been submitted in response to this NOSI in order to facilitate efficient processing of the request.

Inquiries

For further information, please consult our Frequently Asked Questions page.

Laura Biven, PhD
Office of Data Science Strategy
Division of Program Coordination, Planning, and Strategic Initiatives
Office of the Director
Email: [email protected]