Objective Guidelines for screening and diagnosis of gestational diabetes mellitus (GDM) have been updated in the past several years, and various inconsistencies exist across these guidelines. Moreover, the quality of these updated guidelines has not been clarified. We thus conducted this systematic review to evaluate the relationship between the quality and detailed recommendations of these guidelines.
Data sources The Guidelines International Network Library, the National Institute for Health and Clinical Excellence (NICE) database, the Medline database, the Embase and the National Guidelines Clearinghouse were searched for guidelines containing recommendations on screening and diagnosis strategies for GDM between 2009 and November 2018.
Methods Guidelines that included a target group of women with GDM and contained recommendations for screening and diagnostic strategies for GDM were included in the present systematic review. Reviewers summarised recommendations on screening and diagnosis strategies from each guideline and rated the quality of guidelines by using the Appraisal of Guidelines Research and Evaluation (AGREE) criteria.
Results A total of 459 citations were collected by the preliminary literature selection, and 16 guidelines that met the inclusion criteria were assessed. The inconsistencies of the guidelines mainly focus on the screening process (one step vs two step) and criteria of oral glucose tolerance test (OGTT) (International Association of Diabetes and Pregnancy Study Groups [IADPSG] vs Carpenter and Coustan). Guidelines with higher AGREE scores usually recommend a one-step OGTT strategy with IADPSG criteria between 24 and 28 gestational weeks, and the majority of these guidelines are likely to select evidence by Grading of Recommendations Assessment, Development and Evaluation criteria.
Conclusions The guidelines of WHO-2013, NICE-2015, American Diabetes Association-2018, Endocrine Society-2013, Society of Obstetricians and Gynaecologists of Canada-2016, International Federation of Gynecology and Obstetrics-2015, American College of Obstetricians and Gynecologists-2018, United States Preventive Services Task Force-2014 and IADPSG-2015 are strongly recommended in the present evaluation, according to the AGREE II criteria. Guidelines with higher quality tend to recommend a one-step 75 g OGTT strategy with IADPSG criteria between 24 and 28 gestational weeks.
- gestational diabetes mellitus
- screening strategies
- diagnostic criteria
- appraisal of guidelines research and evaluation
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This systematic review presents an overview of current clinical guidelines on screening and diagnosis of gestational diabetes mellitus (GDM), and provides an evaluation of the quality of these guidelines.
We used Appraisal of Guidelines Research and Evaluation (AGREE II), an international, rigorously developed and validated instrument, to evaluate the guidelines.
The present review mainly focuses on guidelines that provided recommendations on the screening and diagnosis of GDM, and did not evaluate other fields, such as therapy, monitoring and obstetric consideration of GDM.
The AGREE II instrument treats all six domains of a guideline equally and does not weight them by their importance in guideline development.
Only guidelines and recommendations published in the English language were included in the present review.
Gestational diabetes mellitus (GDM) is a common disorder during pregnancy that affects an increasing number of pregnant women in the global population,1 2 likely linked to the growing obesity epidemic. GDM increases the risk of short-term and long-term complications in pregnant women, including pre-eclampsia, caesarean section, miscarriage and diabetes later in life. Moreover, offspring of mothers with GDM are more likely to have respiratory distress syndrome and hypoglycaemia during the neonatal period,3 and to develop diabetes,4 obesity4–6 and metabolic disorders7 later in life.
In 1964, O’Sullivan and Mahan first developed the two-step method and oral glucose tolerance test (OGTT) for the diagnosis of GDM during pregnancy,8 based on the risk of maternal type 2 diabetes mellitus (DM) later in life.9 The two-step method for GDM screening consists of a first-step glucose challenge test (GCT) and a second-step OGTT. The GCT is based on oral intake of a 50 g glucose solution followed by a venous glucose measurement 1 hour later. Women whose glucose levels meet or exceed the screening threshold then undergo a 100 g, 3-hour or a 75 g, 2-hour diagnostic OGTT. In this approach, GDM is diagnosed in women who have two or more abnormal values (5.3–10.0–8.6 mmol/L at fasting, 1 hour and 2 hours postprandial, namely the Carpenter and Coustan [C-C] criteria) on the OGTT.
This method and these criteria have been adopted by many organisations, including the National Diabetes Data Group (NDDG)10 and American Diabetes Association (ADA),11 with some modifications,10–13 and were the standard for the diagnosis of GDM for more than two decades.
In 2008, the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) Study showed that adverse neonatal outcomes were associated with mild hyperglycaemia, even though the glucose levels did not meet the old criteria of GDM.14 To control for the potential influence of hyperglycaemia on the fetus/neonate, the International Association of Diabetes and Pregnancy Study Groups (IADPSG) recommended a one-step 75 g OGTT testing (without a 50 g GCT before) and reduced the diagnostic cut-off of the OGTT (5.1–10.0–8.5 mmol/L at fasting, 1 hour and 2 hours postprandial) in 2010.15 In this approach, GDM is diagnosed in women who have one abnormal value.
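The practical difference between the two sets of criteria described above can be illustrated with a minimal Python sketch. It uses only the three thresholds quoted in the text (fasting, 1 hour and 2 hours, in mmol/L) and ignores any additional time point of a 3-hour test. The names `OgttResult`, `meets_cc` and `meets_iadpsg` are hypothetical, and treating a value equal to the threshold as abnormal under the C-C criteria is an assumption.

```python
from typing import NamedTuple


class OgttResult(NamedTuple):
    """Venous glucose values in mmol/L (hypothetical container for illustration)."""
    fasting: float
    one_hour: float
    two_hour: float


# Thresholds (mmol/L) as quoted in the text: fasting, 1 hour, 2 hours.
CC_THRESHOLDS = OgttResult(5.3, 10.0, 8.6)
IADPSG_THRESHOLDS = OgttResult(5.1, 10.0, 8.5)


def count_abnormal(result: OgttResult, thresholds: OgttResult) -> int:
    """Count the values that meet or exceed their threshold (assumed rule)."""
    return sum(value >= limit for value, limit in zip(result, thresholds))


def meets_cc(result: OgttResult) -> bool:
    """C-C criteria: two or more abnormal values."""
    return count_abnormal(result, CC_THRESHOLDS) >= 2


def meets_iadpsg(result: OgttResult) -> bool:
    """IADPSG criteria: one abnormal value is sufficient."""
    return count_abnormal(result, IADPSG_THRESHOLDS) >= 1


if __name__ == "__main__":
    mild = OgttResult(fasting=5.2, one_hour=9.0, two_hour=8.0)
    print(meets_cc(mild), meets_iadpsg(mild))
```

In this sketch, a woman with an isolated fasting value of 5.2 mmol/L is classified as having GDM under the IADPSG criteria but not under the C-C criteria, which is the kind of mild hyperglycaemia at the centre of the debate.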
Later on, many organisation guidelines including the ADA,16 WHO2 and the International Federation of Gynecology and Obstetrics (FIGO)17 adopted the IADPSG criteria in the screening and diagnosis of GDM. However, several guidelines including the American College of Obstetricians and Gynecologists (ACOG) Practice Bulletin,18 the National Institutes of Health (NIH) consensus statement19 and Society of Obstetricians and Gynaecologists of Canada (SOGC)20 did not support the IADPSG criteria.
Inconsistencies in the GDM diagnostic strategy between different guidelines have led to challenges in making clinical diagnosis. Therefore, the purpose of this systematic review is to present an overview of current clinical guidelines on screening and diagnosis of GDM, and to provide an integrated insight into their quality using the Appraisal of Guidelines Research and Evaluation (AGREE) instrument.
A systematic search (last updated in November 2018) was performed to retrieve relevant guidelines regarding the management of GDM. The guidelines were identified using computer searches of the Guidelines International Network Library, the National Institute for Health and Clinical Excellence (NICE) database, the Medline database, the Embase and the National Guidelines Clearinghouse. Searches were limited to guidelines from the USA, Canada, UK, Australia and New Zealand, and international guidelines in the English language. A search for websites of guideline development organisations was also performed (see online supplementary table S1).
Patient and public involvement
Patients and/or the public were not involved.
Articles were considered if they met the Institute of Medicine definition of clinical practice guidelines (CPGs). The Institute of Medicine defines CPGs as ‘systematically developed statements to assist practitioner and patient decisions about appropriate healthcare for specific clinical circumstances.’ We included guidelines if they: (1) included a target group of patients with GDM; (2) contained recommendations for screening and diagnostic strategies for GDM; (3) were produced on behalf of a national or international medical specialty society and (4) were written in English. Guidelines that were developed before 2009 and had not been updated by November 2018 were excluded, since they may be out of date. We excluded hits derived entirely from another guideline and those for which we could not identify detailed information on development. If different versions of a guideline were available, only the latest edition was selected. Review of titles and abstracts was performed independently by two reviewers (LB and X-DZ). For a paper to be excluded, both reviewers had to agree that the article was ineligible. For abstracts, disagreements between the reviewers were discussed and resolved by consensus. The final selection based on the full text was performed by the first author.
Data extracted on the guideline level included the reported methodology for evidence synthesis and for the formulation of recommendations. On the recommendation level, we extracted data on the consideration of cost effectiveness, the target population and the strategy for delivery of the test (see online supplementary table S2).
AGREE II instrument
Each guideline was independently evaluated by reviewers according to the AGREE II instrument (table 1). The AGREE II is an international, rigorously developed and validated instrument consisting of 23 key items organised within six domains. Each item in a domain is scored from 1 (strongly disagree) to 7 (strongly agree). The score for each domain is obtained by the sum of all scores of the individual items in a domain and then standardised as follows: (obtained score − minimum possible score)/(maximum possible score − minimum possible score).
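The standardisation formula above can be written as a short Python sketch. The function name `scaled_domain_score` is hypothetical. The input is assumed to be the list of item scores for one domain; when several appraisers rate the same domain, pooling all of their item scores into one list applies the same formula.

```python
def scaled_domain_score(item_scores, min_item=1, max_item=7):
    """Return the standardised domain score as a fraction between 0 and 1.

    item_scores: scores (1 to 7) for every item in one domain, pooled over
    appraisers if more than one appraiser rated the guideline (assumption).
    """
    if not item_scores:
        raise ValueError("at least one item score is required")
    for score in item_scores:
        if not min_item <= score <= max_item:
            raise ValueError(f"item score {score} is outside {min_item}-{max_item}")

    obtained = sum(item_scores)
    min_possible = min_item * len(item_scores)
    max_possible = max_item * len(item_scores)
    # (obtained score - minimum possible) / (maximum possible - minimum possible)
    return (obtained - min_possible) / (max_possible - min_possible)


if __name__ == "__main__":
    # Three items all scored 4: (12 - 3) / (21 - 3) = 0.5, that is 50%.
    print(f"{scaled_domain_score([4, 4, 4]):.0%}")
```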
The maximum possible score for each item is 7, which indicates that the quality of reporting is exceptional and the guideline meets the full criteria and considerations articulated in the instrument. The minimum possible score for each item is 1 when there is no information about this item reported in the guideline. We initially conducted two rounds of pilot tests before assessing all of the included guidelines. After the above steps were performed, we provided an overall assessment of each set of guidelines. A guideline was ‘strongly recommended for use in practice’ if most domains (four or more) scored above 60%. A guideline was ‘recommended for use with some modification’ if most domains scored between 30% and 60%. ‘Not recommended for use in practice’ implied that most of the domains of the guideline were scored below 30%.
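The overall assessment rule can be sketched in Python as follows. The function name `overall_assessment` is hypothetical, and two points are assumptions rather than part of the stated rule: ‘most domains’ is read as four or more of the six domains in every category, and a guideline for which no category reaches four domains is returned as unclassified, since the text does not cover that case.

```python
def overall_assessment(domain_scores):
    """Classify a guideline from its six standardised domain scores (0 to 1)."""
    if len(domain_scores) != 6:
        raise ValueError("AGREE II has six domains")

    above_60 = sum(score > 0.60 for score in domain_scores)
    below_30 = sum(score < 0.30 for score in domain_scores)
    between = len(domain_scores) - above_60 - below_30  # 30% to 60% inclusive

    majority = 4  # 'most domains (four or more)'
    if above_60 >= majority:
        return "strongly recommended for use in practice"
    if between >= majority:
        return "recommended for use with some modification"
    if below_30 >= majority:
        return "not recommended for use in practice"
    return "unclassified by the stated rule"  # assumption: case not covered in the text


if __name__ == "__main__":
    print(overall_assessment([0.90, 0.80, 0.70, 0.65, 0.50, 0.20]))
```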
A table comparing the recommendations from the selected guidelines was constructed, and AGREE II domain scores were calculated as means and categorical variables with the number of cases and corresponding percentages. Agreement between reviewers on AGREE II scores was assessed using the intraclass correlation coefficient. Given the limited number of guidelines, only explorative quantitative analyses were possible.
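The text does not state which form of the intraclass correlation coefficient was used. As an illustration only, the Python sketch below computes one common form, the two-way random-effects, absolute-agreement, single-rater coefficient, often written ICC(2,1). The function name `icc_2_1` is hypothetical; each row of the input holds the scores that the reviewers gave to one guideline.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per guideline and one column per reviewer."""
    n = len(ratings)       # number of guidelines (subjects)
    k = len(ratings[0])    # number of reviewers (raters)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way analysis of variance sums of squares.
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Invented scores for three guidelines rated by two reviewers.
    print(round(icc_2_1([[60, 62], [80, 78], [95, 97]]), 3))
```

Other forms, such as the consistency coefficient or the average-rater coefficient, use the same mean squares with a different final ratio, so the choice of form should match the study design.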
A total of 459 citations were collected during the preliminary literature selection process, though most were excluded after applying the inclusion and exclusion criteria. Ultimately, 16 guidelines were identified for further evaluation (figure 1). The 16 guidelines included in the present systematic review are from the following organisations:
WHO-2013,2 IADPSG-2015,21 FIGO-2015,17 ADA-2018,22 Endocrine Society (ES-2013),23 NIH-2013,19 ACOG-2018,24 The United States Preventive Services Task Force (USPSTF-2014),25 NICE-2015,26 German Diabetes Association (DDG-2014),27 European Board and College of Obstetrics and Gynecology (EBCOG-2015),28 Indian-2014,29 Queensland-2015, SOGC-2016,20 Hong Kong College of Obstetricians and Gynaecologists (HKCOG-2016) and Australasian Diabetes in Pregnancy Society (ADIPS-2014). Half of the guidelines (8 of 16) were developed in the USA and Europe.
Reproducibility of the two reviewers’ average AGREE scores was good, with an intraclass correlation coefficient of 0.84. Results of the evaluation of the screening and diagnostic strategies for GDM performed using the AGREE II instrument are illustrated in table 2. The average AGREE II scores varied from 46% to 97%. Scores for each domain ranged as follows: scope and purpose from 61% to 100%; stakeholder involvement from 33% to 90%; rigour of development from 33% to 96%; clarity of presentation from 64% to 100%; applicability from 58% to 100%; and editorial independence from 8% to 100%.
Calculated domain scores are shown in figure 2. These radar maps illustrate the final scores for every guideline in each of the six domains, expressed as a percentage. Higher domain scores are mapped towards the periphery, and lower domain scores are plotted towards the centre. The graph allows one to visually gauge the relative strengths or weaknesses of each guideline by domain, in comparison to the other plotted guidelines (worldwide guidelines, North American guidelines, European guidelines and guidelines from Asia and Oceania).
Considering the CPGs overall, 9 of the 16 guidelines were described as ‘strongly recommended’, 7 as ‘recommended with modifications’ and 1 as ‘not recommended’. The highest ranked guideline was produced by the WHO, with the guideline produced by NICE ranking second. Lower rankings were mainly due to limited confidence in development methods, lack of evidence summaries or concerns about readability.
For women at high risk of hyperglycaemia in pregnancy, there is a consensus among most of the guidelines that screening for GDM should be conducted during the first trimester, or at the first prenatal visit. Risk factors for GDM listed in the guidelines are shown in table 3; among them, overweight/obesity, previous history of GDM or macrosomia, and family history of DM were the most common. Most guidelines recommend the OGTT for early screening, although with different criteria. The WHO, NIH, ES and Indian guidelines did not provide a list of risk factors; one potential reason is that they recommend universal screening, rather than risk-based screening.
For low-risk pregnant women, the screening and diagnostic strategies for GDM are shown in table 4. There are mainly two kinds of methods for diagnosis: the first is the one-step strategy with 75 g glucose OGTT using the IADPSG criteria. This method is recommended in worldwide guidelines (WHO, IADPSG, FIGO), some American (ADA, ES), some European guidelines (DDG), some Asian guidelines (HKCOG) and the Oceanian guidelines. NICE and the guideline from India also use a one-step strategy but using different criteria.
The second method is the two-step strategy with 50 g GCT followed by a 75 g OGTT using C-C or NDDG criteria. This method is mainly recommended in some of the North American guidelines (ACOG, NIH and SOGC). USPSTF and EBCOG guidelines recommend both one-step and two-step methods with equal strength.
Regarding the timing of the OGTT, most guidelines recommend conducting the test between 24 and 28 gestational weeks. However, the WHO guideline recommends conducting the OGTT at any time, and the FIGO guideline recommends 24–28 gestational weeks or any other time. The USPSTF recommends conducting the OGTT after 24 gestational weeks. The Indian guideline did not specify the timing of the OGTT.
As shown in table 4, guidelines that were strongly recommended share several advantages. Most of these guidelines recommend a one-step screening strategy with IADPSG criteria between 24 and 28 gestational weeks in low-risk pregnant women. Among seven strongly recommended guidelines, five list risk factors for GDM and four use Grading of Recommendations Assessment, Development and Evaluation (GRADE) criteria to select and evaluate evidence.
The present systematic review of clinical guidelines for the screening strategy of GDM included 16 guidelines published between 2013 and 2018.
To our knowledge, the present article is the first review of GDM guidelines that focuses on the screening and diagnosis of GDM. In 2012, Greuter et al conducted a general evaluation of eight GDM guidelines but provided limited analysis on the screening and diagnosis of GDM.30 Moreover, many guidelines were updated after 2012, with the consideration of the HAPO study.
Overall, the quality of the guidelines is improving. The average AGREE II scores of the guidelines in the present study ranged from 46% to 97%, which is much higher than reported by Greuter et al.30 The four guidelines evaluated in both reviews, namely the ACOG (77 vs 41), ADA (94 vs 37), NICE (96 vs 87) and ADIPS (55 vs 49), all received higher scores in the present study. Improvement of guideline quality is beneficial for clinical practice and comparison of different options.
The majority of the guidelines provided a clear description of ‘scope and purpose’ as screening, diagnosis or classification of hyperglycaemia during pregnancy, and some guidelines also covered the management of GDM. Recommendations from most of the guidelines were clear and user friendly, with a variety of options for different populations and resources. For these reasons, most of the guidelines received high scores in the domains of ‘scope and purpose’, ‘clarity of presentation’ and ‘applicability’.
The difference in AGREE scores among guidelines mainly arises from the domain of ‘rigour of development’. The WHO, FIGO, NICE and ES guideline development groups all used the GRADE methodology to assess the quality of evidence and formulate recommendations. Other guidelines built a consensus using differently designed procedures. For example, the ADA developed its own evidence-grading system to update the guideline, and the NIH relied on the Agency for Healthcare Research and Quality to evaluate the literature. The guidelines of EBCOG, ADIPS, HKCOG and Queensland received a low score for rigour of development, since they did not include a clear description of the process of guideline development or the evidence assessment method.
In addition to rigour of development, the quality of guidelines also varied in the domains of ‘stakeholder involvement’ and ‘editorial independence’. The guidelines of IADPSG, WHO, ADA and NICE all provided clear and detailed records of the members and the independence of the guideline development group. Other guidelines did not provide a full list of the guideline board or describe the independence of the board. A lack of information on rigour of development, stakeholder involvement and editorial independence may reduce the reliability of a guideline, even if its recommendations are similar to or the same as those of guidelines that received high scores in these domains.
More than half of the guidelines evaluated in the present review, including the WHO, IADPSG, FIGO, ADA, ES, DDG, HKCOG, ADIPS and Queensland guidelines, adopt the IADPSG criteria (table 4) for GDM screening. They recommend a one-step 75 g glucose OGTT, most commonly between 24 and 28 gestational weeks, and a diagnosis of GDM is made if any one of the OGTT values equals or exceeds 5.1–10.0–8.5 mmol/L.
This strategy differs from the methods applied in the older versions of these guidelines. The change was prompted by the results of the HAPO study, which linked mild hyperglycaemia to adverse clinical outcomes, including large for gestational age (LGA), primary caesarean, clinical neonatal hypoglycaemia, and elevated C peptide in umbilical cord blood.14 Therefore, a stricter strategy may help reduce the frequency of these potential complications.
However, the guidelines of NIH, ACOG and SOGC still recommend the two-step strategy and the C-C or NDDG criteria for the OGTT. The reasons given by the guideline development groups are: (1) the benefit of treating mild GDM in women is not well established; (2) the increased prevalence will generate additional healthcare costs; (3) caesarean delivery and intensive newborn assessment will increase; and (4) patients diagnosed with GDM will experience life disruptions and psychosocial burdens. On these grounds, they retain the two-step approach with NDDG or C-C criteria.
The guidelines using IADPSG criteria are certain to have more pregnant women diagnosed with GDM than those using C-C or NDDG criteria.31 Debates between these guidelines concern cost and effectiveness, and the benefit versus harm of diagnosing mild hyperglycaemia. Therefore, further studies may focus on comparing multiple clinical outcomes and the healthcare costs of different strategies. In addition, it would be of great importance to study the screening strategy for specific populations and various resource settings.
The strength of the present review includes an integrated list of guidelines, and detailed insight into the screening and diagnosis strategies updated after 2012. The AGREE II criteria have been widely used to evaluate the quality of guidelines. The strength of AGREE is the constitution of six domains and 23 key items that represent all important aspects in guideline development and application.
Our study also has some limitations besides those imposed by AGREE II. First, the present review mainly focuses on guidelines that provided recommendations on the screening and diagnosis of GDM, and did not evaluate other fields, such as therapy, monitoring and obstetric consideration of GDM. This is because updates to the guidelines mainly focus on GDM screening and diagnosis following the HAPO study. Second, only guidelines and recommendations published in the English language were included in the present review. The majority of the worldwide guidelines were published in English, and thus multiple guidelines could be effectively compared in the review. The limitation of AGREE II itself is that the domains are not weighted by their importance in guideline development.32
The present study provides information on how to use guidelines in three respects. First, this systematic review will help clinicians to understand the content of different guidelines quickly, and to recognise the quality of guidelines easily. Second, for pregnant women, this review helps them to choose guidelines that are reliable and reader friendly. Since most clinicians and pregnant women are familiar only with the guideline of their own country or region, this review will help them to choose an adequate screening approach for the individual. Third, for policy-makers and researchers, this review will provide an insight into the strengths and limitations of each guideline, which will help to improve the quality of guidelines.
In summary, the quality of guidelines for screening and diagnosis of GDM has improved considerably since 2012, likely owing to high-grade scientific evidence on this topic and the application of evidence assessment systems. The guidelines of WHO-2013, NICE-2015, ADA-2018, SOGC-2016, ES-2013, FIGO-2015, USPSTF-2014, IADPSG-2015 and ACOG-2018 are strongly recommended in the present evaluation by AGREE criteria. Debates focus on the utility of the one-step or two-step diagnosis method, and IADPSG or C-C/NDDG criteria in low-risk pregnant women. However, high-quality guidelines tend to recommend universal screening by a one-step 75 g OGTT strategy with IADPSG criteria between 24 and 28 gestational weeks, and the majority of these guidelines are likely to contain a list of high-risk factors and to select evidence by GRADE criteria. Further research that compares benefits and limitations, cost and effectiveness will help to resolve the current debates.
The authors thank Courtney Voss from Western University, Canada, for critical review of the draft.
LL-z and XY contributed equally.
Contributors LL-z and XY: design the experiment; ZX-D: analyse the data; HS-b and WZ-l: write the article; DAS: collect literature; LB: design and revise the experiment.
Funding The study is funded by the National Natural Science Foundation of China (Nos. 81771602, 81300493 and 81701378) and the Sun Yat-Sen University Clinical Research 5010 Program (No. 2016014).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All research results were uploaded.
Patient consent for publication Obtained.