Request for Information (RFI): Input on Direction of the Second Phase of the "Illuminating the Druggable Genome" Program

Notice Number: NOT-RM-14-018


Key Dates

Release Date: November 18, 2014

Response Date: December 9, 2014

Related Announcements




Issued by

National Institutes of Health (NIH)

Office of Strategic Coordination (Common Fund)


This Request for Information (RFI) is to solicit comments and ideas for a possible expansion of the “Illuminating the Druggable Genome” program from a pilot phase to a full-fledged program.


The human genome has revealed a great deal about the human proteome, though significant portions of the latter remain unannotated. The “Druggable Genome” has been described as “…the subset of the ~30,000 genes in the human genome that express proteins able to bind drug-like molecules.” (Hopkins, AL, and Groom, CR. The druggable genome. Nature Reviews Drug Discovery 1:727, 2002). About 1,000-6,000 proteins are considered “druggable” due to their sequence similarity to established or putative drug targets (Overington, JP, Al-Lazikani, B, and Hopkins, AL. Nature Reviews Drug Discovery 5:993, 2006). However, a subset of ~400 proteins within this category is still largely uncharacterized. Many proteins that comprise this ‘dark matter’ of the Druggable Genome are members of the G-protein coupled receptor (GPCR), nuclear receptor (NR), ion channel and protein kinase families. Further investigation is warranted to demonstrate the biological importance of understudied members of these protein families and the potential to control their functions, such as through pharmacological interventions.

In FY14 the NIH funded a pilot program, “Illuminating the Druggable Genome” (IDG), to increase the understanding of the properties and functions of unannotated proteins within four of the most commonly drug-targeted protein families: the GPCRs, NRs, ion channels, and protein kinases. To evaluate the potential for this program, the NIH funded a Knowledge Management Center (KMC) for 2 years under RFA-RM-13-011 and a consortium of 7 projects to focus on the Adaptation of Scalable Technologies to Illuminate the Druggable Genome (TechDev) for 3 years under RFA-RM-13-010.

During this pilot phase, the KMC is expected to develop an integrated informatics solution that enables data accrual, storage, cataloging, analysis, and dissemination of standardized/annotated information related to unannotated proteins in the four gene families listed above. The aim is to create a community informatics resource that enables investigations into the functions and potential role(s) in physiology and disease of proteins of previously unknown function by identifying gaps in known data and allowing the prioritization of candidates within different technology platforms. The KMC aims to bridge clinical, biological, chemical and genomic data to prioritize targets from within these privileged target families for further experimental evaluation and analyses by the broader scientific community. To accomplish this, the KMC will integrate disease, pathway, protein, gene, chemical, bioactivity, drug discovery and clinical status databases and documents, supported by innovative algorithmic platforms, knowledge management tools and user interfaces.

During the pilot phase of the TechDev consortium, the investigators will assess whether existing medium- and high-throughput technologies can be scaled for use across large numbers of the unannotated proteins in the four gene families in order to generate functional data and to illuminate physiological and/or pathophysiological relevance. Approaches currently being explored by the TechDev consortium include cellular and organismal phenotyping of gene and gene X gene deletion, cheminformatics-guided small molecule screening and medicinal chemistry, as well as mass spectrometric and functional elucidation of protein interacting partners and signaling networks.

The expansion of the IDG into a full-fledged program could entail an expanded informatics endeavor, systematic medium- and high-throughput experimental efforts to generate reliable and reproducible data around entire unannotated classes of proteins, the development and dissemination of new technologies and tools to accomplish these goals, and in-depth projects focusing on demonstrating pathophysiological relevance in mammalian systems of prioritized proteins. Key to the success of this endeavor would be determining the most effective research and informatics techniques and technologies that should be utilized to determine which proteins have pivotal roles in disease pathology and may serve as potential therapeutic targets. The NIH is interested in the community’s input on the potential direction of expanded activities in an IDG program that expanded beyond the four classes of genes or beyond the human genome, included deeper studies of recalcitrant members of the four classes, or demonstrated disease-modifying roles of previously unannotated proteins in mammalian systems.

Information Requested

In order to maximize the impact of this planned program and community resource, we seek input on what may be the main areas of focus for the full-fledged "Illuminating the Druggable Genome" program following the pilot stage. Responses may include but are not limited to the following categories:

  • Defining experimental data needs:
    • Experimental data types most needed for elucidating the function of unannotated proteins.
    • Classes of data that would be required to advance a newly illuminated protein forward for further, more detailed study.
    • Feasibility and opportunity of scalable technologies that could be used to generate IDG-relevant data.
    • The extent to which the consortium could focus on generating knowledge of protein function in animal models vs. human-derived sample or cell line models.
    • Ways to identify multiple protein targets that may act together and whose simultaneous targeting could lead to therapeutic benefit or adverse effects through pleiotropic compounds.
    • Strategies for data characterization to ensure that existing and newly generated datasets are trustworthy and reliable.
  • Knowledge management prioritization:
    • Informatics efforts, approaches, methods or tools that can help to prioritize which potentially druggable targets warrant further investment.
    • Data around new protein function that can be leveraged to identify potential therapeutic opportunities or understand drug toxicity.
  • Broadened scope versus four protein family focus:
    • Potential expansion beyond the present focus on four families of human-genome encoded GPCRs, NRs, ion channels, and protein kinases.
    • Feasibility of and gaps in the current technologies necessary for such expansion.
  • Defining successes:
    • Prioritization and relative amounts of deliverables for the different IDG projects (e.g., broad data sets providing evidence for additional studies vs. demonstration of disease relevance in mammalian disease models).
    • Audacious goals with common themes that can jumpstart translational studies.
  • Identifying expertise of IDG investigators and users:
    • Needed types of expertise for a complex endeavor such as IDG and how best to incorporate them across its lifecycle.
    • Backgrounds of the various potential users of IDG-generated and curated data and web portal features to accommodate them.
    • Training opportunities the IDG may undertake to enhance its impact on the scientific and medical communities.
  • Other comments, suggestions, or considerations relevant to this RFI.

Submitting a Response

All responses must be submitted by December 9, 2014. Please include the Notice number NOT-RM-14-018 in the subject line. Response to this RFI is voluntary. Responders are free to address any or all of the categories listed above. The submitted information will be reviewed by the NIH staff. Submitted information will be considered confidential.

This request is for information and planning purposes only and should not be construed as a solicitation or as an obligation on the part of the Federal Government or the NIH. The NIH does not intend to make any awards based on responses to this RFI or to otherwise pay for the preparation of any information submitted or for the Government's use of such information.

The NIH will use the information submitted in response to this RFI at its discretion and will not provide comments to any responder’s submission. However, responses to the RFI may be reflected in future solicitation(s). The information provided will be analyzed and may appear in reports. Respondents are advised that the Government is under no obligation to acknowledge receipt of the information received or provide feedback to respondents with respect to any information submitted. No proprietary, classified, confidential, or sensitive information should be included in your response. The Government reserves the right to use any non-proprietary technical information in any resultant solicitation(s).


Please direct all inquiries to:

Aaron C. Pawlyk, Ph.D.
National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Telephone: 301-451-7299