GRADE Series - Guest Editors, Sharon Straus and Sasha Shepperd
GRADE guidelines: 1. Introduction—GRADE evidence profiles and summary of findings tables

https://doi.org/10.1016/j.jclinepi.2010.04.026

Abstract

This article is the first of a series providing guidance for use of the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system of rating quality of evidence and grading strength of recommendations in systematic reviews, health technology assessments (HTAs), and clinical practice guidelines addressing alternative management options. The GRADE process begins with asking an explicit question, including specification of all important outcomes. After the evidence is collected and summarized, GRADE provides explicit criteria for rating the quality of evidence that include study design, risk of bias, imprecision, inconsistency, indirectness, and magnitude of effect.

Recommendations are characterized as strong or weak (alternative terms conditional or discretionary) according to the quality of the supporting evidence and the balance between desirable and undesirable consequences of the alternative management options. GRADE suggests summarizing evidence in succinct, transparent, and informative summary of findings tables that show the quality of evidence and the magnitude of relative and absolute effects for each important outcome and/or as evidence profiles that provide, in addition, detailed information about the reason for the quality of evidence rating.

Subsequent articles in this series will address GRADE’s approach to formulating questions, assessing quality of evidence, and developing recommendations.

Introduction

Key Points

  • Grading of Recommendations Assessment, Development, and Evaluation (GRADE) offers a transparent and structured process for developing and presenting summaries of evidence, including its quality, for systematic reviews and recommendations in health care.

  • GRADE provides guideline developers with a comprehensive and transparent framework for carrying out the steps involved in developing recommendations.

  • GRADE’s use is appropriate and helpful irrespective of the quality of the evidence: whether high or very low.

  • Although the GRADE system makes judgments about quality of evidence and strength of recommendations in a systematic and transparent manner, it does not eliminate the inevitable need for judgments.

In this, the first of a series of articles describing the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach to rating quality of evidence and grading strength of recommendations, we will briefly summarize what GRADE is, provide an overview of the GRADE process of developing recommendations, and present the endpoint of the GRADE evidence summary: the evidence profile (EP) and the summary of findings (SoFs) table. We will provide our perspective on GRADE’s limitations and present our plan for this series.

What is GRADE?

GRADE offers a system for rating quality of evidence in systematic reviews and guidelines and grading strength of recommendations in guidelines. The system is designed for reviews and guidelines that examine alternative management strategies or interventions, which may include no intervention or current best management. In developing GRADE, we have considered a wide range of clinical questions, including diagnosis, screening, prevention, and therapy. Most of the examples in this series are

Purpose of this series

This series of articles about GRADE is most useful for three groups: authors of systematic reviews, groups conducting HTAs, and guideline developers. GRADE suggests somewhat different approaches for rating the quality of evidence for systematic reviews and for guidelines. HTA practitioners, depending on their mandate, can decide which approach is more suitable for their goals.

The GRADE approach is applicable irrespective of whether the quality of the relevant evidence is high or very low. Thus,

The GRADE process—defining the question and collecting evidence

Figure 1 presents a schematic view of GRADE’s process for developing recommendations, in which unshaded boxes describe steps common to systematic reviews and guidelines and shaded boxes describe steps specific to guidelines. One begins by defining the question in terms of the populations, the alternative management strategies (an intervention, sometimes experimental, and a comparator, sometimes standard care), and all patient-important outcomes (in this case four) [12].

The GRADE process—rating evidence quality

In the GRADE approach, randomized controlled trials (RCTs) start as high-quality evidence and observational studies as low-quality evidence supporting estimates of intervention effects. Five factors may lead to rating down the quality of evidence and three factors may lead to rating up (Fig. 2). Ultimately, the quality of evidence for each outcome falls into one of four categories from high to very low.
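The starting points and adjustments described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a GRADE tool: the function and factor names are hypothetical, the intermediate levels (moderate, low) and the factors not named in this excerpt (publication bias, dose-response gradient, plausible confounding) are taken from the wider GRADE literature, and the simple arithmetic stands in for what is in practice a judgment made for each outcome.

```python
from enum import IntEnum


class Quality(IntEnum):
    """The four GRADE quality-of-evidence categories."""
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


# Hypothetical identifiers for the five rate-down and three rate-up factors.
RATE_DOWN_FACTORS = {
    "risk_of_bias", "inconsistency", "indirectness",
    "imprecision", "publication_bias",
}
RATE_UP_FACTORS = {"large_effect", "dose_response", "plausible_confounding"}


def rate_quality(study_design, rate_down=None, rate_up=None):
    """Return the quality rating for one outcome.

    study_design: "rct" (starts high) or "observational" (starts low).
    rate_down / rate_up: dicts mapping a factor name to the number of
    levels (assumed here to be 1 or 2) by which the rating moves.
    """
    rate_down = rate_down or {}
    rate_up = rate_up or {}
    unknown = (set(rate_down) - RATE_DOWN_FACTORS) | (set(rate_up) - RATE_UP_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    start = {"rct": Quality.HIGH, "observational": Quality.LOW}[study_design]
    score = int(start) - sum(rate_down.values()) + sum(rate_up.values())
    # Clamp to the four available categories.
    return Quality(max(int(Quality.VERY_LOW), min(int(Quality.HIGH), score)))


if __name__ == "__main__":
    # RCT evidence rated down one level each for risk of bias and imprecision.
    print(rate_quality("rct", rate_down={"risk_of_bias": 1, "imprecision": 1}).name)
```

The sketch makes one point concrete: study design only sets the starting level, and the final rating for an outcome depends on the judgments recorded for each factor.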

Systematic review and guideline authors use this approach to rate the quality of evidence for

The GRADE process—grading recommendations

Guideline developers (but not systematic reviewers) then review all the information to make a final decision about which outcomes are critical and which are important and come to a final decision regarding the rating of overall quality of evidence.

Guideline (but not systematic review) authors then consider the direction and strength of recommendation. The balance between desirable and undesirable outcomes and the application of patients’ values and preferences determine the direction of the

The endpoint of the GRADE process

The endpoint for systematic reviews and for HTA restricted to evidence reports is a summary of the evidence—the quality rating for each outcome and the estimate of effect. For guideline developers and HTA that provide advice to policymakers, a summary of the evidence represents a key milestone on the path to a recommendation.

The GRADE working group has developed specific approaches to presenting the quality of the available evidence, the judgments that bear on the quality rating, and the

What is the difference between an EP and a SoFs table?

An EP (Table 1) includes a detailed quality assessment in addition to a SoFs. That is, the EP includes an explicit judgment of each factor that determines the quality of evidence for each outcome (Fig. 2), in addition to a SoFs for each outcome. The SoF table (Table 2) includes an assessment of the quality of evidence for each outcome but not the detailed judgments on which that assessment is based.
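One way to picture the relationship between the two formats is as a pair of record types, where an EP row carries everything a SoF row does plus the per-factor judgments. The sketch below is a hypothetical data model, not a format defined by the GRADE working group; the class names, field names, and placeholder values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SoFRow:
    """One outcome in a summary of findings table (hypothetical fields)."""
    outcome: str
    relative_effect: str
    absolute_effect: str
    quality: str          # overall rating only, without the reasons
    comments: str = ""


@dataclass
class EPRow:
    """One outcome in an evidence profile (hypothetical fields)."""
    outcome: str
    relative_effect: str
    absolute_effect: str
    quality: str
    # Explicit judgment for each factor that determines the quality rating.
    judgments: dict[str, str] = field(default_factory=dict)

    def to_sof_row(self, comments=""):
        """Drop the detailed judgments and keep the summary."""
        return SoFRow(self.outcome, self.relative_effect,
                      self.absolute_effect, self.quality, comments)


if __name__ == "__main__":
    # Placeholder values only; these are not results from any review.
    ep = EPRow(
        outcome="Outcome A",
        relative_effect="<relative effect>",
        absolute_effect="<absolute effect>",
        quality="moderate",
        judgments={"risk_of_bias": "no serious limitations",
                   "imprecision": "serious"},
    )
    print(ep.to_sof_row(comments="placeholder comment"))
```

In this model the SoF row is a projection of the EP row: the quality rating survives, and the reasons behind it do not.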

The EP and the SoF table serve different purposes and are intended for different audiences. The

More than one systematic review may be needed for a single recommendation

Figure 1 illustrates that evidence must be summarized for each patient-important outcome, with the summaries ideally coming from optimally conducted systematic reviews. For each comparison of alternative management strategies, all outcomes should be presented together in one EP or SoFs table. It is likely that not all studies relevant to a health care question will provide evidence regarding every outcome. Figure 1, for example, shows the first study providing evidence for the first and second

A single systematic review may need more than one SoFs table

Systematic reviews often address more than one comparison. They may evaluate an intervention in two disparate populations or examine the effects of a number of interventions. Such reviews are likely to require more than one SoFs table. For example, a review of influenza vaccines may evaluate the effectiveness of vaccination for different populations, such as community dwelling and institutionalized elderly patients or for different types of vaccines.

An example of an EP

Table 1 presents an example of a GRADE EP addressing the desirable and undesirable consequences of use of antibiotics for children with otitis media living in high- and middle-income countries. The most difficult judgment in this table relates to the quality of evidence regarding adverse effects of antibiotics. In relative terms, the increases in adverse effects were reasonably consistent across trials. The trials, however, had very different rates of adverse effects (from 1% to 56%).
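The tension between a consistent relative effect and widely varying event rates can be made concrete with a small calculation. The sketch below is illustrative only: the relative risk of 1.5 is a hypothetical value, not an estimate from Table 1, and treating 1% and 56% as baseline risks is an assumption, because the text does not say which trial arm those rates refer to. The function name is likewise hypothetical.

```python
def absolute_effect_per_1000(baseline_risk, relative_risk):
    """Additional (or fewer) events per 1000 patients.

    baseline_risk: event risk without the intervention, as a proportion (0 to 1).
    relative_risk: risk with the intervention divided by baseline risk.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be between 0 and 1")
    if relative_risk < 0:
        raise ValueError("relative_risk must be non-negative")
    # Risk with the intervention cannot exceed 100%.
    risk_with_intervention = min(1.0, baseline_risk * relative_risk)
    return round((risk_with_intervention - baseline_risk) * 1000)


if __name__ == "__main__":
    hypothetical_rr = 1.5  # assumed for illustration, not taken from the trials
    for baseline in (0.01, 0.56):
        extra = absolute_effect_per_1000(baseline, hypothetical_rr)
        print(f"baseline {baseline:.0%}: {extra} more events per 1000")
```

Under these assumptions, the same relative increase corresponds to 5 more events per 1000 at a 1% baseline and 280 more per 1000 at a 56% baseline, which is why an evidence summary reports absolute effects alongside relative ones.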

An example of a SoFs table

Table 2 presents a SoF table in the format we recommend on the basis of pilot testing, user testing, and evaluations [10], [12], [13]. The Appendix presents an explanation of the terms found in the SoF table and the EP.

A SoF table presents the same information as the full EP, omitting the details of the quality assessment and adding a column for comments. The columns are ordered by importance, with the more important information in the first columns and the less important in the later ones. Aside from a

Modifications of GRADE

Some organizations have used modified versions of the GRADE approach. We recommend against such modifications because the elements of the GRADE process are interlinked, because modifications may confuse some users of evidence summaries and guidelines, and because such changes compromise the goal of a single system with which clinicians, policy makers, and patients can become familiar.

GRADE’s Limitations

Those who want to use GRADE should consider five important limitations of the GRADE system. First, as noted previously, GRADE has been developed to address questions about alternative management strategies, interventions, or policies. It has not been developed for questions about risk or prognosis, although evidence regarding risk or prognosis may be relevant to estimating the magnitude of intervention effects or providing indirect evidence linking surrogate to patient-important outcomes.

Where from here

The next article in this series will describe GRADE’s approach to framing the question that a systematic review or guideline is addressing and deciding on the importance of outcomes. The next set of articles in the series will address in detail the decisions required to generate EPs and SoF tables, such as those presented in Tables 1 and 2. The series will then address special challenges related to diagnostic tests and resource use and the process of going from evidence to recommendations.

The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system has been developed by the GRADE Working Group. The named authors drafted and revised this article. A complete list of contributors to this series can be found on the Journal of Clinical Epidemiology website.