NOT-OD-25-081: Protecting Human Genomic Data when Developing Generative Artificial Intelligence Tools and Applications

Notice Number: NOT-OD-25-081

Key Dates

Release Date: March 28, 2025

Related Announcements

August 27, 2014 - NIH Genomic Data Sharing Policy. See notice NOT-OD-14-124.

Issued by

Office of the Director, National Institutes of Health (OD)

Purpose

Artificial intelligence (AI) tools and applications are proving transformative in driving new biomedical research advances. As the development and use of generative AI become increasingly prevalent, NIH urges the research community to remain vigilant about the potential risks of inadvertent data disclosure when sharing AI tools and applications. Specifically, NIH reminds researchers that:

The Genomic Data Sharing (GDS) Policy and the subsequent Data Use Certification (DUC) Agreement prohibit users from distributing controlled-access data (including genomic or associated data) or their Data Derivatives to any entity or individual not identified in their Data Access Request without appropriate written approvals from the NIH. Sharing, retaining, or training generative AI models using controlled-access human genomic data may risk disclosing controlled-access data and, thus, violates the Non-Transferability provision of the DUC.
The GDS Policy and the Genomic Data User Code of Conduct state that sharing controlled-access data with public generative AI tools (e.g., third party tools) via prompts or other user interfaces is in violation of the provision on Non-Transferability, and by extension, the DUC. Similarly, Developers requesting access to controlled-access data for developer work, as defined in NOT-OD-24-157, are bound by the Non-Transferability provision in the Developer Terms of access.

Additionally, NIH considers generative AI models, including model parameters, developed by Approved Users of controlled-access data to constitute Data Derivatives as defined in provision 14 (Definitions) of the DUC. NIH intends to provide future guidance on the responsible use and sharing of generative AI models and controlled-access data. Until that guidance is issued, as described in the DUC, Approved Users of controlled-access data may continue to develop generative AI models using the controlled-access data so long as the use is approved by NIH, but (1) may not share the model, including model parameters, except with collaborators who are also Approved Users and (2) may not retain the generative AI model, including model parameters, upon closeout of the project, as instructed in provision 13 (Termination and Data Destruction) of the DUC. Approved Users may request to renew any expiring projects in order to continue using generative AI models until further guidance is issued.

For additional information on using controlled-access data responsibly, see the principles described in "Using Genomic Data Responsibly Under the NIH Genomic Data Sharing Policy" and "AI in Research: Policy Considerations and Guidance."

Inquiries

Please direct all inquiries to:

NIH Office of Science Policy

GDS@mail.nih.gov