Rigor and Reproducibility

Scientific rigor and transparency in conducting biomedical research are key to the successful application of knowledge toward improving health outcomes. The information provided on this website is designed to assist the extramural community in addressing rigor and transparency in NIH grant applications and progress reports.

The NIH strives to exemplify and promote the highest level of scientific integrity, public accountability, and social responsibility in the conduct of science. Updates to grant application instructions and review language are intended to:

  • clarify long-standing expectations to ensure that NIH is funding the best and most rigorous science,
  • highlight the need for applicants to describe details that may have been previously overlooked,
  • highlight the need for reviewers to consider such details in their reviews through updated review language, and
  • minimize additional burden.


Guidance: Rigor and Reproducibility in Grant Applications

The NIH is committed to promoting rigorous and transparent research in all areas of science supported by a variety of grant programs. Updates to application instructions and review language intended to enhance reproducibility through rigor and transparency have been implemented for research grants and mentored career development awards. Updates to institutional training grants, institutional career development awards (K12/KL2) and individual fellowships will be forthcoming in 2017 or later. 

Research Grants and Mentored Career Development Awards

The updates to NIH research grant and career development award application instructions and review language focus on four key areas: 

  1. The scientific premise of the proposed research
    • The scientific premise for an application is the research that is used to form the basis for the proposed research question(s). NIH expects applicants to describe the general strengths and weaknesses of the prior research being cited by the applicant as crucial to support the application. It is expected that this consideration of general strengths and weaknesses could include attention to the rigor of the previous experimental designs, as well as the incorporation of relevant biological variables and authentication of key resources.
    • See related FAQs, blog post
  2. Rigorous experimental design for robust and unbiased results
    • Scientific rigor is the strict application of the scientific method to ensure robust and unbiased experimental design, methodology, analysis, interpretation and reporting of results. This includes full transparency in reporting experimental details so that others may reproduce and extend the findings.
    • See related FAQs, blog post
  3. Consideration of relevant biological variables
    • Biological variables, such as sex, age, weight, and underlying health conditions, are often critical factors affecting health or disease. In particular, sex is a biological variable that is frequently ignored in animal study designs and analyses, leading to an incomplete understanding of potential sex-based differences in basic biological function, disease processes and treatment response.
    • NIH expects that sex as a biological variable will be factored into research designs, analyses, and reporting in vertebrate animal and human studies. Strong justification from the scientific literature, preliminary data or other relevant considerations must be provided for applications proposing to study only one sex.
    • See related FAQs, blog posts, Article
  4. Authentication of key biological and/or chemical resources
    • Key biological and/or chemical resources include, but are not limited to, cell lines, specialty chemicals, antibodies and other biologics. Key biological and/or chemical resources may or may not be generated with NIH funds and:
      1. may differ from laboratory to laboratory or over time;
      2. may have qualities and/or qualifications that could influence the research data;
      3. are integral to the proposed research.
    • The quality of resources used to conduct research is critical to the ability to reproduce the results. Each investigator will have to determine which resources used in their research fit these criteria and are therefore key to the proposed research.
    • See related FAQs, blog post

Overview of New Guidelines for Rigor in Your Application

Infographic courtesy of Ms. Nichole Swan, Dr. Shana Spindler, and Dr. Yvette Pittman of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).

Institutional Training Grants, Institutional Career Development, and Individual Fellowships

The NIH plans to require formal instruction in rigorous experimental design and transparency to enhance reproducibility for institutional training, institutional career development, and individual fellowship applications no sooner than 2017. See NOT-OD-16-034.

When implemented, applications will be expected to provide the following:

Institutional training grant applications will be required to include within the training program plan a summary of the instruction planned for all predoctoral and postdoctoral trainees to ensure the knowledge and skills required to design and conduct rigorous, well-controlled experiments that consider all relevant biological variables, use authenticated biological and chemical resources, and apply appropriate statistical tests for data analyses.  In addition, a separate attachment will be required to describe in more detail the instructional content and curricular content. 

Institutional career development applications (K12/KL2) will be required to include within the career development program plan a summary of the instruction planned for all scholars to ensure the knowledge and skills required to design and conduct rigorous, well-controlled experiments that consider all relevant biological variables, use authenticated biological and chemical resources, and apply appropriate statistical tests for data analyses.  In addition, a separate attachment will be required to describe in more detail the instructional content and curricular content. 

Individual fellowship applications will be required to summarize in the research strategy section plans to ensure rigorous, well-controlled experiments that consider all relevant biological variables, use authenticated biological and chemical resources, and apply appropriate statistical tests for data analyses. In addition, a more detailed description of instruction in rigorous experimental design to ensure reproducibility will be required in the section on Institutional Environment and Commitment to Training.



  • Staff Training Module: General Policy Overview (compiled by NIH OER, 10/30/2015)
  • Reviewer Guidance on Rigor and Transparency (compiled by NIH OER, 03/21/2016)
  • Frequently Asked Questions
  • Examples of Rigor in Applications

    These brief excerpts are taken from awarded applications reviewed under a pilot FOA for rigorous experimental design, which is only one part of the updated instruction and review language for January 25, 2016 and beyond. Note that these examples were selected based on high overall impact scores and positive reviewer comments specific to rigor. These examples are provided to show how elements of rigor and transparency have been succinctly provided in applications; they may not represent all of the aspects and may still have room for improvement. These examples may be updated as applications are reviewed and awarded under the revised rigor and transparency review language.

    Example #1

    Aim 3: Male and female mice will be randomly allocated to experimental groups at age 3 months. At this age the accumulation of CUG repeat RNA, sequestration of MBNL1, splicing defects, and myotonia are fully developed. The compound will be administered at 3 doses (25%, 50%, and 100% of the MTD) for 4 weeks, compared to vehicle-treated controls. IP administration will be used unless biodistribution studies indicate a clear preference for the IV route. A group size of n = 10 (5 males, 5 females) will provide 90% power to detect a 22% reduction of the CUG repeat RNA in quadriceps muscle by qRT-PCR (ANOVA, α set at 0.05). The treatment assignment will be blinded to investigators who participate in drug administration and endpoint analyses. This laboratory has previous experience with randomized allocation and blinded analysis using this mouse model [refs]. Their results showed good reproducibility when replicated by investigators in the pharmaceutical industry [ref].
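    The allocation described in this excerpt (random assignment, equal numbers of each sex per group, treatment identity hidden from the investigators) can be sketched in a few lines. This is an illustrative sketch only, not the applicants' procedure: the animal IDs, the seeds, the letter codes, and the helper names `allocate` and `blind` are all hypothetical, and stratifying by sex is an assumption inferred from the "5 males, 5 females" group size.

```python
import random
from collections import Counter

# Four arms taken from the excerpt: vehicle control plus three dose levels.
GROUPS = ["vehicle", "25% MTD", "50% MTD", "100% MTD"]


def allocate(animal_ids_by_sex, groups, per_sex_per_group, seed):
    """Randomly assign animals to groups, stratified by sex."""
    rng = random.Random(seed)  # fixed seed so the allocation can be audited later
    allocation = {}
    for sex, ids in animal_ids_by_sex.items():
        needed = per_sex_per_group * len(groups)
        if len(ids) != needed:
            raise ValueError(f"need {needed} animals of sex {sex}, got {len(ids)}")
        shuffled = list(ids)
        rng.shuffle(shuffled)
        # Consecutive blocks of the shuffled list go to consecutive groups.
        for i, animal in enumerate(shuffled):
            allocation[animal] = groups[i // per_sex_per_group]
    return allocation


def blind(groups, seed):
    """Map each group name to an uninformative letter code."""
    rng = random.Random(seed)
    codes = [chr(ord("A") + i) for i in range(len(groups))]
    rng.shuffle(codes)
    return dict(zip(groups, codes))


# Hypothetical animal IDs: 20 males and 20 females for 4 groups of 5 + 5.
animals = {
    "M": [f"M{i:02d}" for i in range(1, 21)],
    "F": [f"F{i:02d}" for i in range(1, 21)],
}
allocation = allocate(animals, GROUPS, per_sex_per_group=5, seed=2016)
key = blind(GROUPS, seed=7)  # held by someone not involved in dosing or analysis
blinded = {animal: key[group] for animal, group in allocation.items()}
counts = Counter((group, animal[0]) for animal, group in allocation.items())
```

    Investigators who dose the animals and score endpoints would see only `blinded`; the `key` mapping stays with a third party until the analysis is locked.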

    Example #2

    Aim 1: Primary screen: In this high throughput screening assay, we combined the SMN promoter with exons 1-6 and an exon 7 splicing cassette in a single construct that should respond to compounds that increase SMN transcription, exon 7 inclusion, or potentially stabilize the SMN RNA or protein [refs]. The details of the assay and the SMN2-luciferase reporter HEK293 cell line have been extensively validated [refs]. Each point is run in triplicate, the compounds are tested on three separate occasions, and the results are averaged to give an EC50 with standard deviation. Secondary screen: …We analyze SMN protein levels by dose response in quantitative immunoblots with statistical analysis by one-way ANOVA with post-hoc analysis using Dunnett or Bonferroni, as appropriate.

    Aim 2: Each set of compounds will include a blinded negative control compound that has been determined to be inactive and that is solubilized in the same manner as test compounds. Mice will be randomly assigned within a litter, and data will be collected and submitted to the PI. For compounds that demonstrate extended survival, the PI will be sure to have these tested in {the collaborators’} labs, and data will be merged and evaluated. To calculate the number of the experimental mice, we will perform an SSD sample size power analysis to ensure that the appropriately minimal number of mice is used in each experimental context. Typically for each compound in life span studies, we will need ~20 SMA animals in the treated group; ~20 SMA animals in the vehicle treated group; ~20 SMA animals in the untreated group. If we can administer the compound in aqueous solution without excipient, the vehicle and untreated groups might be combined, as these should have identical survival. Therefore, no more than 60 SMA animals will be needed per compound.
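    The animal count implied by the group sizes in this excerpt is simple arithmetic, sketched below. The function name `animals_per_compound` is hypothetical, and the group size of 20 treats the excerpt's "~20" as exact for illustration.

```python
def animals_per_compound(group_size, combine_controls):
    """Total animals for one compound.

    Groups are treated, vehicle-treated, and untreated; the two control
    groups collapse into one when the compound needs no vehicle beyond water.
    """
    n_groups = 2 if combine_controls else 3
    return group_size * n_groups


separate = animals_per_compound(20, combine_controls=False)  # three groups
combined = animals_per_compound(20, combine_controls=True)   # controls merged
print(separate, combined)
```

    With three separate groups the total is about 60 animals per compound, and merging the vehicle and untreated controls brings it to about 40.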

    Example #3

    Aim 2: Intensity signal data will be transformed into log values and then modeled by longitudinal methods (reference cited). Specifically, the composite difference in mean intensity signals over time between the bi-specific T cells vs. control groups is assumed to be 2.8 logs with a composite standard deviation of 2.2 logs. Furthermore, we will assume at least five repeated measurements per mouse after T cell infusion and a within-mouse intra-correlation coefficient equal to 0.50. Thus, a sample size of 10 mice per group will provide at least 80% power to detect the above difference between treated versus control group with a 5% significance level. Log-rank test will be used to compare the survival distribution between groups.
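    The power figure quoted in this excerpt can be checked roughly with a normal approximation. The sketch below is not the applicants' calculation, which the excerpt does not show. It assumes that the "composite difference" is the difference between groups in each mouse's time-averaged log signal, that repeated measurements on a mouse are equally correlated (compound symmetry), and that a two-sided z-test is adequate; the function name `repeated_measures_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def repeated_measures_power(delta, sd, n_per_group, n_measures, icc, alpha=0.05):
    """Approximate power to detect a difference in time-averaged means.

    delta       difference in mean log signal between groups
    sd          standard deviation of a single measurement
    n_measures  repeated measurements per mouse
    icc         within-mouse intraclass correlation
    """
    # Variance of one mouse's mean over n_measures equally correlated readings.
    var_mouse_mean = sd ** 2 * (1 + (n_measures - 1) * icc) / n_measures
    # Standard error of the difference between two group means.
    se_diff = sqrt(2 * var_mouse_mean / n_per_group)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Upper-tail power only; the lower tail is negligible for a positive delta.
    return NormalDist().cdf(delta / se_diff - z_crit)


# Values taken from the excerpt: 2.8 logs, SD 2.2 logs, 10 mice per group,
# five measurements per mouse, intra-correlation 0.50, 5% significance.
power = repeated_measures_power(
    delta=2.8, sd=2.2, n_per_group=10, n_measures=5, icc=0.50
)
print(f"approximate power: {power:.2f}")
```

    Under these assumptions the approximation comes out at about 0.96, which is consistent with the "at least 80% power" stated in the excerpt. A t-based or mixed-model calculation would give a somewhat lower value.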

    VAS: Animal numbers are based on the requirement to perform each experiment (power and sample size calculations are described in the Research Strategy), which includes an independent experimental repeat.

    Example #4

    Aim 1: Statistical considerations: In our preliminary studies consisting of this same cohort of DFUs (n=100) and utilizing 16S rRNA sequencing, we were able to detect dimensions of DFU microbiome, including microbial diversity, that were significantly associated with DFU outcomes. We therefore anticipate that the sample size will provide sufficient power to detect significant differences using metagenomic sequencing, as this is a more sensitive and less-biased assay of microbial identification and diversity.

    Aim 3: Random Forests, a machine learning approach for classification, will be used to determine which metagenome features differentiate groups (e.g., antibiotics vs. no antibiotics; pre- vs. post-debridement). Random Forest uses a bootstrap method to assess test error, ideal in our situation of small sample size (n=18). For diversity and load measures, significance between groups will be assessed using non-parametric Wilcoxon rank-sum tests.

News, Notices, and Blog Posts

On January 29, 2016, NIH Deputy Director of Extramural Research Dr. Mike Lauer published a series of Open Mike blog posts on each of the four focus areas of the rigor and transparency policy for research grant and career development award applications: Scientific Premise in NIH Grant Applications, Scientific Rigor in NIH Grant Applications, Consideration of Relevant Biological Variables in NIH Grant Applications, and Authentication of Key Biological and/or Chemical Resources in NIH Grant Applications.

On December 17, 2015, the NIH published guide notice NOT-OD-16-034 to notify the community of upcoming requirements for formal instruction in rigorous experimental design and transparency to enhance reproducibility. This notice applies to institutional training grants (D43, T15, T32/TL1, T34, T35, T36, T37, T90/R90, and U2R), institutional career development awards (K12/KL2), and individual fellowships (F05, F30, F31, F32, F37, F38, and FI2).

On December 15, 2015, the NIH published guide notice NOT-OD-16-031 to notify the community of updates to the PHS Research Performance Progress Report (RPPR) instructions to address rigor. New questions about rigor can be found under Accomplishments, Sections B.2 and B.6.

On October 13, 2015, the NIH published guide notices outlining updates to form instructions for applications due in 2016, including an overview (NOT-OD-16-004), as well as details on Implementing Rigor and Transparency in NIH & AHRQ Research Grant Applications (NOT-OD-16-011) and Implementing Rigor and Transparency in NIH & AHRQ Career Development Award Applications (NOT-OD-16-012).

On October 30, 2015, NIH Deputy Director of Extramural Research Dr. Mike Lauer published an Open Mike blog post on Bolstering Trust in Science through Rigorous Standards. NIH OER has also released a staff training module that provides a General Policy Overview on enhancing reproducibility through rigor and transparency.

Blog Entry on "Listening to Our Stakeholders On Considering Sex as a Biological Variable," Rock Talk and ORWH, Rockey & Clayton, 05/20/2015

Analysis of Public Comments: "NIH Request for Information: Consideration of Sex as a Biological Variable in Biomedical Research," ORWH, 05/20/2015

On June 9, 2015, the NIH published guide notices Enhancing Reproducibility through Rigor and Transparency (NOT-OD-15-103), as well as Considering Sex as a Biological Variable in NIH-funded Research (NOT-OD-15-102). See related blog posts by Dr. Larry Tabak and Dr. Sally Rockey on Rock Talk, and Dr. Janine Clayton on the ORWH Director's Page.