An existing dataset may consist of different types of data, including but not limited to survey data, demographic information, health information, and genomic information. It also includes data to be derived from existing samples of cells, tissues, or other materials that were previously collected for a different purpose or research question but will now be used to answer a new research question. In general, these are studies that meet the NIH definition of clinical research, have a prospective plan to analyze existing data and/or derive data from an existing resource, and anticipate no ongoing or future contact with participants.
Yes. You can propose a study or analysis of an existing dataset in which the cohort is limited in sex/gender, racial, and/or ethnic participation. However, you should justify why this dataset is useful in the proposed scientific context, particularly if the dataset does not reflect the population affected by the disease or condition under study. Factors that can be considered as part of the justification include the nature of the scientific question, a requirement for data provided by the cohort, or the need to address a gap in knowledge.
The NIH provides forms in each application package for reporting sex/gender, race, and ethnicity information. We are transitioning to a modified layout of the forms starting with competing applications. For additional details, see this Guide Notice. As noted above, if you are conducting research with an existing cohort or dataset, use the Cumulative Inclusion Enrollment Report rather than the Planned Enrollment Report.
You should provide the sex/gender, race, and ethnicity information only for the individuals whose data you will use from the existing dataset or resource. If you are analyzing data from all individuals in that dataset or resource, provide information for the entire dataset or resource. If your project is limited to a subset of subjects, complete the inclusion table using data only from the subjects included in your analysis. For example, if you want to analyze data from 2,000 individuals in a large population-based survey that includes 10 million individuals, you would provide the sex/gender, race, and ethnicity information for the 2,000 individuals you plan to analyze. If you were analyzing information about all 10 million participants, you would provide the sex/gender, race, and ethnicity information for all 10 million.
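The subset rule described above can be illustrated with a minimal sketch. It assumes a hypothetical list of records with made-up field names (`id`, `sex_gender`, `race`, `ethnicity`) and placeholder category labels; none of these names come from the NIH forms, and the official report defines its own categories and layout.

```python
from collections import Counter

# Hypothetical records: field names and category labels are illustrative only.
DATASET = [
    {"id": 1, "sex_gender": "Female", "race": "Asian", "ethnicity": "Not Hispanic or Latino"},
    {"id": 2, "sex_gender": "Male", "race": "White", "ethnicity": "Hispanic or Latino"},
    {"id": 3, "sex_gender": "Female", "race": "White", "ethnicity": "Not Hispanic or Latino"},
    {"id": 4, "sex_gender": "Male", "race": "Black or African American", "ethnicity": "Not Hispanic or Latino"},
    {"id": 5, "sex_gender": "Female", "race": "Asian", "ethnicity": "Not Hispanic or Latino"},
]


def tabulate_inclusion(records, analyzed_ids=None):
    """Count records by (ethnicity, race, sex_gender).

    If analyzed_ids is None, every record is counted (whole-dataset analysis).
    Otherwise only records whose id is in analyzed_ids are counted.
    """
    if analyzed_ids is not None:
        wanted = set(analyzed_ids)
        records = [r for r in records if r["id"] in wanted]
    return Counter(
        (r["ethnicity"], r["race"], r["sex_gender"]) for r in records
    )


# Analysis limited to a subset: report only those individuals.
subset_counts = tabulate_inclusion(DATASET, analyzed_ids=[1, 3, 5])

# Analysis of the whole dataset: report everyone.
full_counts = tabulate_inclusion(DATASET)

for cell, count in sorted(subset_counts.items()):
    print(cell, count)
```

The counts that go into the report come from the same population that enters the analysis, so the filter is applied before tabulating rather than reporting the full dataset and noting the subset separately.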
If you are proposing a study that will include both an existing dataset and recruitment of new participants, you should provide separate inclusion forms for the existing dataset and for the participants to be prospectively recruited. The existing dataset sample can be provided on the Cumulative Inclusion Enrollment Report as described above. The participants to be prospectively recruited should be accounted for on a Planned Enrollment Report submitted with the competing application/proposal. Actual recruitment numbers will then be reported to the NIH at least annually on the Cumulative Inclusion Enrollment Report.
If you are proposing a study that will include multiple existing datasets/resources, you may report them on separate Cumulative Inclusion Enrollment Reports or consolidate them onto one Cumulative Inclusion Enrollment Report. The decision to use separate forms or one consolidated form should be made in the context of the scientific goals of the study and whether there is value in providing separate forms to illustrate the breakdown of sex/gender, race, and ethnicity information for each dataset/resource. Also, please be sure to check the Funding Opportunity Announcement (FOA) you are applying to in case it provides additional guidance on this issue.