Announcer: From the National Institutes of Health in Bethesda, Maryland, this is All About Grants.
Megan Columbus: Here we are with another edition of All About Grants. I’m your host Megan Columbus from the Division of Communications in NIH’s Office of Extramural Research. Today we’ll be talking to Dr. Sally Amero about the NIH review criteria and scoring.
Megan: Welcome to the show Sally.
Sally Amero: Well thank you Megan. It’s a pleasure to be here.
Megan: Can you tell us a little bit about your background and your role here at NIH?
Sally: Well let’s see, so I am the NIH Review Policy Officer, which means that I write the official policy for the peer review program.
Megan: I thought we should start off by talking about the ultimate goal of peer review, which is really to evaluate the scientific merit of an application. I find when I talk to people in the field that there is some confusion because they think that review groups make recommendations to NIH about funding, and they don’t. They evaluate the scientific merit of the application, isn’t that right?
Sally: Correct, so the peer review system is a two-tiered system, and today we are going to be talking mainly about the initial peer review—the first level of review. That’s what most people refer to as study sections. The second level of review is done by Advisory Councils, and they have different roles. So the first level of review assesses the scientific and technical merit, which we have interpreted more recently as overall impact of the proposed projects, and the Council level of review makes recommendations on funding.
Megan: And when we talk about peer review, we really do mean peer review. It’s bringing in scientists from the field, many of whom are NIH grantees, to come and evaluate these applications. So just to be clear, this is not talking about NIH staff reviewing those applications.
Sally: Correct. No member of the extramural NIH staff may serve as a reviewer in our peer review system.
Megan: There are five scored criteria. Can you tell us what those are?
Sally: Well, it depends on the application that you have in mind. Most people are familiar with our standard criteria so to speak, and those apply to research projects. Those are significance, investigator, innovation, approach, and environment. But we have other cassettes of five criteria that apply to other types of mechanisms, like fellowships, career development awards, training mechanisms.
Megan: And as an applicant is reviewing a funding opportunity announcement for the various programs, the review criteria should be clear in those funding opportunity announcements.
Sally: Correct. So any funding opportunity announcement must specify the review criteria that will be used by the reviewers in evaluating applications submitted in response to that announcement.
Megan: So Sally, in addition to the five scored criteria, what else do reviewers assess?
Sally: Well, we have several categories of factors that reviewers are asked to consider. So one of them is usually a set of five criteria that receive individual criterion scores (that’s what we mean by the scored criteria). We’ll get to that in just a minute. We’ll talk about the scoring system. But in addition, there are additional review criteria for each mechanism that are not given individual criterion scores because they don’t apply to every proposed project. So an example there would be human subjects protection. Where it is present, reviewers are asked to assess it as part of the overall merit in impact, but if the project does not involve human subjects, we can’t require a score there. So both the scored review criteria and the criteria that do not get individual scores are considered when the reviewer is giving an overall score for the entire project. In addition, we ask the reviewers to give us advice on a number of policy and administrative issues that do not factor into their consideration of the overall score. A perfect example there would be a recommendation on the budget. So normally we do not consider the budget as being a scientific factor. It’s more of an administrative factor, so it does not factor into the overall score.
Megan: Recently NIH has aligned the grant applications with those scored criteria. Meaning that applicants as they develop their application and their research plan are writing directly to respond to the criteria: significance, investigators, innovation, approach, and environment. As reviewers then approach the application, can we move a little bit to the criterion scoring and the intent of the criterion scoring and the relationship between that criterion scoring and that overall impact score?
Sally: Well, about a year ago in May of 2009, we adopted a new scoring system that asked reviewers to offer integer scores starting from one through nine. And this one through nine scoring system applies both to the overall impact score and to the individual criterion scores. The criterion scores were introduced at that time to increase the transparency and the information content from the reviewer. Particularly for applications that are not discussed it was felt that this would add more information to the summary statement.
Megan: So the assigned reviewers will provide scores for each of those individual review criteria and then how do they consider those review criteria as they’re considering the overall impact score? I know I’ve gotten lots of questions in the past of, “Hey these don’t average up.” But the intent is not to average these scores together right? Can you talk to us a little bit about guidance to reviewers about weighting these scores?
Sally: Well, we tell reviewers that they should weigh the different criteria as they see fit. So they might view a particular strength as being the driving factor in their consideration of a score, but another reviewer might consider a weakness to be more important to them. So we do not prescribe formulae or ways to calculate to arrive at their score. Another really important consideration to keep in mind, and you mentioned this, is that the criterion scores are offered by the assigned reviewers. So each application has at least three reviewers assigned to give it a very thorough analysis and evaluation, but there are more people on the panel than just those three. So everyone who does not have a conflict of interest is asked to offer a final impact score, but only the assigned reviewers give criterion scores. So it could happen that the other people on the panel are listening to the discussion and the points from the assigned reviewers but may not agree. So the numbers that are reported out on the summary statement for criterion scores are not necessarily the reflection of the entire panel, and they are not necessarily the final consensus of the entire panel.
Megan: Well, and I think one of the things that NIH is doing to provide a little bit more information on how people arrive at that overall impact score is we are now going to be asking those assigned reviewers to include a paragraph talking about that overall impact score, isn’t that right?
Sally: That’s right. Reviewers had been asked to frame their comments on the overall impact section in the form of bullets, but we’ve decided that the reviewers might need a format to synthesize their thoughts, so we are providing them space for a paragraph to describe the factors that led to their overall impact score.
Megan: So when reviewers are scoring the application, the assigned reviewers give those criterion scores on a one to nine scale and everybody is scoring the overall impact in that review group who don’t have a conflict of interest on a one to nine scale. When applicants receive a summary statement, however, the summary statement is a ten to ninety scale, why is that?
Sally: For each application there is only one final score that is reported to Council and is recorded as the official outcome of review. So the way that it is calculated is by averaging all of the final impact scores from all of the eligible reviewers and multiplying by ten. So the final score can start from 10 and go through 90.
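The arithmetic Sally describes can be sketched in a few lines of Python. This is only an illustration of "average the eligible reviewers' impact scores and multiply by ten"; the function name and the sample scores are hypothetical, and because no rounding rule is stated in the conversation, the sketch leaves the result unrounded.

```python
def final_impact_score(reviewer_scores):
    """Average the 1-9 impact scores from eligible reviewers, then multiply by ten.

    Hypothetical helper for illustration only. No rounding is applied,
    because the conversation does not state a rounding rule.
    """
    if not reviewer_scores:
        raise ValueError("at least one eligible reviewer score is required")
    for score in reviewer_scores:
        # Reviewers give integer scores from one through nine.
        if not isinstance(score, int) or not 1 <= score <= 9:
            raise ValueError(f"impact scores must be integers from 1 to 9, got {score!r}")
    # Multiply before dividing so the result stays exact for typical inputs.
    return 10 * sum(reviewer_scores) / len(reviewer_scores)


# Made-up scores from five eligible panel members.
example_scores = [2, 3, 3, 2, 4]
print(final_impact_score(example_scores))
```

With every reviewer giving a one, the result is 10; with every reviewer giving a nine, it is 90, which matches the range Sally mentions.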
Megan: So Council and NIH staff and the applicant will see that final score. For some types of applications they may also get a percentile. Why do we give applications percentiles, and how is that calculated?
Sally: We noticed that certain study sections adopt certain scoring behaviors over time, and the percentiling mechanism is one way to normalize across different study sections with different behaviors.
Megan: So that’s to compensate for the easy graders versus the harder graders?
Sally: Right, right, right.
Megan: One thing that people have asked me about is the distinction between impact and significance. Can you talk about that a little bit to provide some clarity?
Sally: I would be glad to because it’s a question that I hear a lot also. So impact is the entire package. That’s an umbrella consideration that everything else should fold up into. Significance is only one component that reviewers are asked to think about for research type grants when they are offering an assessment of impact. So the way I like to think about this is as the project is presented with these investigators and with the approach that they proposed, what is this project going to do for their research field? And you’ll notice that we focused the reviewer on the research field involved, we really do not want reviewers to start comparing the importance of different diseases or conditions or fields, so we focused them right in on their field.
Megan: For impact.
Sally: For impact, that’s right. Now let’s talk about significance. Significance is a little bit different. Significance is one of the five review criteria that get an individual criterion score, but significance is let’s just assume the approach is going to work, the investigators are great, they are in a great environment and they are doing something really innovative, let’s just grant them all of that. For significance, is it worth doing? So if in a perfect world does this question, hypothesis, endeavor, whatever have intrinsic value to do? So that’s how I like to think of them. One is sort of a if everything goes according to plan, and then the other one is a reality check for impact.
Megan: When somebody receives a summary statement and they see their scores, what’s the best thing they should do if they need help interpreting what those scores mean or how the discussion went at that meeting?
Sally: Program officers in the funding ICs attend or listen to or observe the study sections where their applications are evaluated, so the best thing to do is to contact your program officer in the IC where your application was assigned for funding consideration.
Megan: And IC being the Institute or Center.
Sally: Right, right.
Megan: And you can find the name of that program officer in the eRA Commons account. So Sally your explanation of impact versus significance makes me think about fellowship and training types of applications. What does impact look like for that kind of application?
Sally: One may look at the definition of impact and significance and think that is working well for research grants, but how do we define these for say a fellowship application or a career training award? And if you look in our criteria we have a nice matrix of review criteria up on our website. You’ll see that we have redefined impact for each type of mechanism that we have. The potential for a candidate to make an important contribution to a field we feel is impact for a training type of mechanism.
Megan: Thank you for joining us today.
Sally: Well thank you.
Megan: For NIH and OER this is Megan Columbus.
Announcer: To view the review criteria at a glance chart, visit the Office of Extramural Research’s website at grants.nih.gov and search for “review criteria.” That is G-R-A-N-T-S dot N-I-H dot G-O-V.