Notice of Intent to Publish Two RFAs: 1000 Genomes Project Data Processing and 1000 Genomes Project Dataset Analysis

Notice Number: NOT-HG-08-014

Key Dates
Release Date: September 25, 2008

Issued by
National Human Genome Research Institute (NHGRI)

The National Human Genome Research Institute (NHGRI) plans to issue two Requests for Applications (RFAs) in the Fall of 2008 to support (a) the processing of the 1000 Genomes Project data and (b) certain subsequent analyses of the completed dataset. The 1000 Genomes Project will provide a resource to support genome-wide association studies and other human studies by finding most DNA sequence variants in about 1200-1500 anonymized individuals. The full-scale study should produce about 20 Terabases of sequence data. Data-processing pipelines are being set up to use these sequence data to provide several data types, such as genotype calls for SNPs and structural variants and their linkage disequilibrium patterns, and to release these data regularly. These pipelines will need to be improved and monitored, and new processing steps will need to be added. When the complete dataset is produced, it will need to be characterized in several ways and to have several types of global analyses done, and tools will need to be developed to allow the research community to use the data.

(a) The RFA for data processing would fund up to about six awards in fiscal year 2009. This RFA will solicit applications to continue to develop, evaluate, and implement the methods needed to produce the data types, monitor the quality of the data, integrate the data types, develop the tools needed to work with the data, and develop new processes as needed to produce the final project dataset. This work should be performed as part of the highly collaborative international consortium for the 1000 Genomes Project. This Notice is being provided to allow potential applicants sufficient time to develop meaningful collaborations and responsive projects. This RFA is expected to be published in the fall of 2008 with an expected receipt date in late fall 2008. This RFA utilizes the U01 mechanism.

This Notice encourages investigators with expertise and insights in the area of data processing of large sequence datasets to begin considering applying for this new RFA.

(b) The RFA for the analysis of the complete dataset would fund up to about ten awards in fiscal year 2010. This RFA will solicit applications to characterize and analyze the full dataset, such as for allele frequency distribution and signals of natural selection, and to develop the tools needed to work with the data and apply them to other studies such as genome-wide association studies. This Notice is being provided to allow potential applicants sufficient time to develop meaningful collaborations and responsive projects. This RFA is expected to be published in the fall of 2008 with an expected receipt date in the late spring of 2009. This RFA utilizes the U01 mechanism.

This Notice encourages investigators with expertise and insights in the area of data analysis of large sequence datasets to begin considering applying for this new RFA.

Researchers at U.S. institutions are eligible to apply. Neither RFA will support the use of 1000 Genomes data for the analysis of the genetics of specific human diseases or other phenotypes.


Interested parties may contact:

Lisa D. Brooks, Ph.D.
5635 Fishers Lane, Suite 4076
Bethesda, MD 20892-9305
(301) 435-5544