Enhancing Data Usage and Utility to Advance Biomedical Research

When beginning your next investigator-initiated application, consider the following NIH highlighted topic. The area of science described below is of interest to the listed NIH Institutes, Centers, and Offices (ICOs). This is not a notice of funding opportunity (NOFO).

Apply through an appropriate NIH Parent Funding Announcement or another broad NIH opportunity available on Grants.gov. Learn how to interpret and use Highlighted Topics.

Topic Description

Post Date: June 24, 2026

Expiration Date: June 24, 2028

Background

NIH aims to maximize the return on research and clinical care through investments in data generation, data infrastructure, cloud resources, FAIR Principles-guided data management and sharing, and data science. Additionally, there is a growing need for novel statistical and computational methods to address unique designs and complex data types. Multiple NIH ICOs have independently invested in efforts that demonstrate the value of data sharing and use. However, more strategic alignment and coordination across NIH ICOs are needed to address data underutilization and empower the research community to generate data-driven insights, validate findings, and accelerate translation to improved health outcomes.

Purpose

This Highlighted Topic encourages rigorous, innovative research that uses publicly accessible, high-quality datasets. It aims to enhance reproducibility, enable data-driven hypothesis generation, accelerate scientific discovery, and ultimately improve disease prevention, diagnosis, treatment, and patient outcomes.

Exemplary areas of interest include:

  • Developing and applying advanced computational methods (including AI/ML) to enable data discovery, reuse, and access across distributed ecosystems, standardized data extraction, interoperability, and integration across diverse biomedical, clinical, environmental, behavioral, and population-level datasets.
  • Developing novel biostatistical methods to address complex study designs and analytic challenges and applying them to relevant health data.
  • Enhancing rigor, reproducibility, and data quality through replication, validation, cross-cohort analyses, and development of metrics to assess completeness, bias, and utility.
  • Designing, evaluating, and implementing privacy-preserving strategies (e.g., federated learning, differential privacy) to enable responsible data sharing and use.
  • Advancing predictive, translational, and intervention research by modeling health and disease trajectories across the lifespan, identifying and validating biomarkers or other measurable indicators, and enabling innovative, adaptive, and pragmatic study designs.
  • Enabling secondary data analysis to improve population health and health systems by informing prevention, clinical care, and implementation strategies.

Participating ICOs

National Cancer Institute (NCI)

Additional NCI interests include but are not limited to:

  • Elucidating mechanisms of tumor initiation, progression, relapse, and therapeutic resistance.
  • Enabling novel adaptive clinical trial designs via digital twins, synthetic control arms, or other approaches, or AI/data-driven accrual and stratification.
  • Integrating multi-omic, multi-modal data to improve predictions of risks, toxicity and response of therapies, and interventions to address population and health system characteristics.

Applicants are encouraged to discuss potential research projects (e.g., R01, R03, R21) with program staff before submission. IC may dedicate funds available to support applications in this Topic area depending upon the availability of funds, the number of meritorious applications, and competing ICO priorities.

ICO Scientific Contact:
Emily Boja
[email protected]

Jiayin (Jerry) Li
[email protected]

Danielle Daee, Ph.D.
[email protected]

Miguel R. Ossandon, Ph.D.
[email protected]

Wendy Wang, Ph.D.
[email protected]

National Center for Complementary and Integrative Health (NCCIH)

NCCIH encourages applications leveraging data science, AI/ML, and advanced computational approaches to improve rigor, reproducibility, and translation of complementary and integrative health (CIH) research. NCCIH supports tools and methods that enable integration, secure use, and analysis of data to accelerate discovery and improve whole person health outcomes. Research areas include but are not limited to:

  • Development of AI/ML, causal AI, and computational tools to harmonize, integrate, and visualize multiscale CIH data
  • Development of methods to assess data quality, reproducibility, and rigor in CIH and whole person research
  • Development of privacy-preserving and secure data-sharing approaches (e.g., federated learning) to enable multisite CIH analyses
  • Development of AI-enabled models to predict symptom trajectories, treatment response, and patient-reported outcomes
  • Data-driven approaches to improve access, recruitment, personalization, and implementation of CIH interventions
IC may give special consideration to support meritorious applications in this topic area.
ICO Scientific Contact:
Emrin Horgusluoglu
[email protected]

National Eye Institute (NEI)

NEI has supported vision research that generated rich multimodal datasets spanning imaging, visual function testing, multi-omics, and clinical phenotypes across diverse eye diseases. Prioritized areas for this Highlighted Topic include, but are not limited to:

  • Harmonizing Optical Coherence Tomography +/- Angiography (OCT/OCTA), fundus photography, other imaging modalities, and visual fields across devices, vendors, and institutions.
  • Creating open-access real-world datasets aligned with FAIR principles to enable knowledge discovery and clinical applications.
  • Integrating imaging with structured clinical data and patient-reported outcomes to improve understanding of disease onset and progression.
  • Developing tools and metrics to assess data quality, reliability, interoperability, and model transportability across settings.
  • Integrating multi-omics and multimodal data across basic, preclinical, and clinical research to enable mechanistic insights, biomarker discovery, and therapeutic development.
ICO Scientific Contact:
James Gao, Ph.D.
[email protected]

National Human Genome Research Institute (NHGRI)
  • NHGRI is interested in secondary analysis research that leverages existing genomic and multi-omic datasets to advance understanding of gene-disease relationships, improve genomic variant interpretation, enhance diagnostic yield in rare and undiagnosed diseases, develop and validate polygenic risk models, and accelerate the translation of genomic discoveries into clinical care.
  • NHGRI also encourages innovative research through secondary analyses of data accessible via the NHGRI AnVIL Cloud platform, utilizing the platform’s comprehensive analysis tools and services.
ICO Scientific Contact:
NHGRI Research Funding
[email protected]

National Institute on Aging (NIA)

Examples of research areas relevant to NIA include, but are not limited to:

  • Secondary analyses of existing data to elucidate the etiology, disease trajectories, and risk factors influencing the development and progression of Alzheimer’s disease and related dementias (AD/ADRD), aging-related diseases, and comorbidities
  • Secondary analyses of longitudinal cohorts and linked biomedical, administrative, and social data to identify drivers of variation in aging-related chronic conditions, including AD/ADRD
  • Use of existing data to uncover molecular, genetic, cellular, and physiological mechanisms underlying aging and age-related changes across the lifespan in humans and other organisms
  • Development of analytic methods and tools to improve the use and interpretability of large datasets on AD/ADRD, aging-related diseases, and comorbidities
  • Efforts to enhance the accessibility and utilization of NIA-supported repositories to accelerate discoveries in age-related conditions and comorbidities
ICO Scientific Contact:
Rebekah Feng, Ph.D.
[email protected]

Damali Martin, Ph.D., MPH
[email protected]

Rosaly Correa-de-Araujo, MD, M.Sc., Ph.D.
[email protected]

Yi-Ping Fu, Ph.D.
[email protected]

Frank Bandiera, Ph.D., MPH
[email protected]

National Institute on Alcohol Abuse and Alcoholism (NIAAA)

NIAAA’s areas of interest include, but are not limited to:

  • Integrate clinical (including laboratory and imaging measures), behavioral, environmental, real-world, and omics data to study alcohol use patterns, alcohol use disorder (AUD) progression, and related health outcomes, including comorbid mental and physical health conditions, across the lifespan, to inform prevention and treatment.
  • Develop and apply computational and machine learning methods to harmonize alcohol measures and standardize alcohol-related phenotypes to improve reproducibility and cross-study comparability.
  • Identify and validate biomarkers and predictive models for alcohol misuse and AUD risk, treatment response, relapse, and recovery.
  • Use established NIH cohorts, such as All of Us, Adolescent Brain Cognitive Development (ABCD), HEALthy Brain and Child Development (HBCD), and Add Health, to advance data-driven, individualized prevention and treatment strategies.
ICO Scientific Contact:
Wenxing Zha, Ph.D.
[email protected]

Elizabeth Powell, Ph.D.
[email protected]

Chamindi Seneviratne, M.D.
[email protected]

National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)

NIAMS seeks to maximize the scientific value of existing datasets by supporting secondary analyses and the development of innovative analytical methodologies that advance research in arthritis, musculoskeletal, and skin diseases. 

Two research approaches may be considered: 

  • secondary analyses of existing data from databases (e.g., electronic health records, registries, population-based cohorts, surveys, imaging, labs, claims, environmental data, and multi-omics) relevant to NIAMS mission areas, including biomedical, clinical, or public health; and
  • development of statistical, computational, or data science methodologies that enhance existing approaches for analyzing complex health data relevant to NIAMS scientific priorities.

ICO Scientific Contact:
Kamil Barbour, PhD
[email protected]

National Institute on Drug Abuse (NIDA)

This topic invites projects that leverage and enhance existing social science, behavioral, administrative, clinical, and neuroimaging datasets through innovative secondary analyses. Proposed research should:

  • strengthen the rigor, reproducibility, and utility of data resources through methodological advances, such as improved data accessibility, integration, harmonization, interoperability, and reusability.
  • generate knowledge to advance understanding of the causes, patterns, and health impacts of substance use, HIV, and related conditions.
  • inform the development, evaluation, and regulatory advancement of safe and effective therapeutics and scalable, evidence-based strategies to improve prevention, treatment quality, and health outcomes for individuals and communities affected by substance use disorders (SUDs).
ICO Scientific Contact:
Marsha Lopez
[email protected]

Jana Drgonova
[email protected]

National Institute of Dental and Craniofacial Research (NIDCR)

NIDCR supports applications that maximize the value of existing datasets through secondary analyses and innovative analytic methods to advance understanding, prevention, and treatment of dental, oral, and craniofacial (DOC) conditions. This includes:

  • Secondary data analyses using existing data and databases relevant to DOC and oral-systemic health and/or practice.
  • Development of statistical or computational methodologies that are poised to improve or advance extant methods for analyzing DOC or oral-systemic health data.

Priority areas include behavioral sciences (behavior change, intervention mechanisms, nutrition, health education, adherence); clinical and population research (cohorts, trials, comparative effectiveness and safety, natural experiments, health economics, meta-analyses); and translational data science (multi-modal data integration, AI/ML/DL, federated learning frameworks, in silico validation, disease risk prediction).

IC may dedicate funds available to support applications in this Topic area depending upon the availability of funds, the number of meritorious applications, and competing ICO priorities.
IC may give special consideration to support meritorious applications in this topic area.
ICO Scientific Contact:
William Elwood, PhD
[email protected]

Lorena Baccaglini, DDS, MS, PhD
[email protected]

Noffisat Oki, PhD
[email protected]

National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)

NIDDK has made substantial investments in large-scale clinical consortia, longitudinal cohorts, and data and sample repositories that have generated rich, multidimensional resources. Many of these datasets have been newly released, expanded, or harmonized through NIDDK-supported repositories, creating discovery opportunities beyond their original scope.

Research priorities include:

  • Integration, harmonization, and reanalysis of existing datasets and specimens to achieve a comprehensive understanding of patient health and identify novel mechanisms, subgroups, and trajectories;
  • Generation of privacy-preserving synthetic clinical data and cohorts to support data sharing, method development, and benchmarking;
  • Validation of AI/ML models; and
  • Development and application of:
    • AI/ML tools to predict disease progression and outcomes; and
    • foundation models and methods that address analytical challenges.
IC may give special consideration to support meritorious applications in this topic area.
ICO Scientific Contact:
Daniel Gossett, Ph.D. (Kidney, Urology and Hematology)
[email protected]

Xujing Wang, Ph.D. (Diabetes, Endocrinology, and Metabolism)
[email protected]

Veerasamy Ravichandran, Ph.D. (Digestive Diseases and Nutrition)
[email protected]

Rebecca Rodriguez, Ph.D. (NIDDK Central Repository)
[email protected]

Emily Leary, Ph.D. (NIDDK Biostatistics Program)
[email protected]

National Institute of Environmental Health Sciences (NIEHS)

NIEHS is interested in:

  • Analyses of existing data to identify the role of environmental exposures in disease etiology and disease mechanisms, characterize the exposome, develop statistical methods, and integrate complex exposomics data
  • Development of analytical pipelines and novel statistical or AI methods for complex exposomics data analysis, including open-source code and clear instructions for implementation
  • Replication and validation of findings
  • Leveraging large administrative datasets such as electronic health records and geospatial data to characterize environmental exposures and health
  • Applying robust evidence from epidemiological, animal, and organoid studies to in silico trial designs such as digital twin studies
  • Training new researchers in data reuse and secondary data analysis
ICO Scientific Contact:
Bonnie Joubert, Ph.D.
[email protected]

National Institute of Mental Health (NIMH)

NIMH supports secondary analyses of existing human mental health datasets that advance interoperability, reproducibility, and clinical actionability in mental health research and intervention. Priority areas include:

  • Integration of clinical, cognitive, neuroimaging, neurophysiology, genomic, behavioral, and sensor and mobile/wearable device data to identify and validate markers of mental health risk and resilience, and to establish clinically meaningful definitions of sub-populations of individuals with, or at risk for, mental illnesses and related behaviors.
  • Development of privacy-preserving approaches that strengthen causal inference, predictive modeling, and external validation, moving beyond exploratory correlations while delivering reusable, open workflows and tools to the research community.
  • Explicit characterization of the heterogeneity within diagnostic categories to improve transdiagnostic utility and accelerate translation to real-world mental health clinical practice.
ICO Scientific Contact:
Christina Liu, PhD PE
[email protected]

Michele Ferrante, PhD
[email protected]

National Library of Medicine (NLM)

The National Library of Medicine (NLM) is committed to advancing rigorous, innovative research that leverages multi-modal, data-driven approaches and the secondary analysis of publicly accessible, high-quality datasets to accelerate scientific discovery and improve health outcomes. As the world’s largest biomedical library and a leader in biomedical informatics, NLM recognizes that rigorous secondary analysis of clinical, public health, and real-world data is fundamental to advancing discovery, strengthening reproducibility, and achieving optimal health outcomes for all. Leveraging diverse data sources, while ensuring privacy, security, and ethical stewardship, enables the development and validation of innovative analytic methods, including novel statistical, computational, and AI-driven approaches.

ICO Scientific Contact:
Goutham Reddy, MD MS
[email protected]

Office of Data Science Strategy (ODSS)

ODSS encourages investigator-initiated proposals, such as those addressing the development and enhancement of standards, data models, and data sharing, to improve scientific discovery, reproducibility, accessibility, impact, and efficacy.

IC may give special consideration to support meritorious applications in this topic area.
This office does not award grants. Applications must be relevant to the objectives of at least one of the participating Institutes or Centers listed in this topic.
ICO Scientific Contact:
Shu Hui Chen
[email protected]

NIH Office of Data Science Strategy (ODSS)
[email protected]

Office of Research on Women's Health (ORWH)

The Office of Research on Women’s Health (ORWH) is interested in projects that: 

  • Incorporate consideration of sex as a biological variable (SABV) in discovery, replication, and reproducibility study designs and analyses
  • Develop female-specific common data elements (CDEs), including for menopause
  • Harness computational models to investigate sex differences

The Office of Autoimmune Disease Research in ORWH (OADR-ORWH) is interested in: 

  • Developing CDEs to support data extraction, harmonization, and interoperability for autoimmune disease research
  • Utilizing federated data platforms to enhance pattern recognition in complex multiomic datasets, enabling insights into autoimmune disease pathogenesis, co-occurring autoimmune diseases, and shared pathogenic pathways
  • Harnessing computational models to optimize clinical trial design, including use of digital twins and synthetic control arms, and leveraging existing and new clinical trial and registry data to advance autoimmune disease research
This office does not award grants. Applications must be relevant to the objectives of at least one of the participating Institutes or Centers listed in this topic.
ICO Scientific Contact:
Elena Gorodetsky, M.D., Ph.D.
[email protected]

Victoria Shanmugam, MBBS, MRCP, FACR, CCD
[email protected]

