Original research
Accuracy of continuous glucose monitoring in preterm infants: a systematic review and meta-analysis
  1. Chiara Nava1,
  2. Astrid Modiano Hedenmalm2,
  3. Franciszek Borys3,
  4. Lotty Hooft4,
  5. Matteo Bruschettini5,
  6. Kevin Jenniskens4,6
  1. 1Neonatal Intensive Care Unit, Ospedale Alessandro Manzoni, Lecco, Lecco, Italy
  2. 2Skåne University Hospital Lund, Lund, Skåne, Sweden
  3. 3Poznan University of Medical Sciences, Poznan, Wielkopolskie, Poland
  4. 4Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  5. 5Department of Clinical Sciences Lund, Paediatrics; Cochrane Sweden, Research and Development, Lund University, Skane University Hospital, Lund, Sweden, Lund, Sweden
  6. 6Julius Center for Health Sciences and Primary Care, Utrecht, The Netherlands
  1. Correspondence to Dr Matteo Bruschettini; matteo.bruschettini{at}


Background and objectives Continuous glucose monitoring (CGM) could be a valuable instrument for measuring glucose concentration in preterm neonates. We undertook a systematic review and meta-analysis to compare the diagnostic accuracy of CGM devices with that of intermittent blood glucose evaluation methods for the detection of hypoglycaemic or hyperglycaemic events in preterm infants.

Data sources A structured electronic database search was performed for studies that assessed the accuracy of CGM against any intermittent blood glucose testing methods in detecting episodes of altered glycaemia in preterm infants. No restrictions were used. Three review authors screened records and included studies.

Data extraction Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. From individual patient data (IPD), sensitivity and specificity were determined using predefined thresholds. The mean absolute relative difference (MARD) of the studied CGM devices was calculated, and the devices were assessed against the accuracy requirements of EN ISO 15197. IPD datasets were meta-analysed using a logistic mixed-effects model. A bivariate model was used to estimate the summary receiver operating characteristic (ROC) curve and extract the area under the curve (AUC). The overall level of certainty of the evidence was assessed using Grading of Recommendations Assessment, Development and Evaluation.

Results Among 4481 records, 11 were included. IPD datasets were obtained for five studies. Only two of the studies showed an MARD lower than 10%, with none of the five CGM devices studied satisfying the EN ISO 15197 requirements. Pooled sensitivity and specificity of CGM devices for hypoglycaemia were 0.39 and 0.99, whereas for hyperglycaemia they were 0.87 and 0.99, respectively. The AUC was 0.70 and 0.86, respectively. The certainty of the evidence was considered low to moderate. Limitations primarily related to the lack of a representative population, reference standard and CGM device.

Conclusions CGM devices demonstrated low sensitivity for detecting hypoglycaemia in preterm infants but high accuracy for detecting hyperglycaemia.

PROSPERO registration number CRD42020152248.

  • neonatology
  • paediatrics
  • biotechnology & bioinformatics
  • neonatal intensive & critical care
  • paediatric endocrinology

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • A wide and systematic literature search was conducted, with no restrictions based on language, year of publication or publication status.

  • Authors of the studies eligible for inclusion were contacted and full stratified data sets including a total of 1706 and 1339 paired values were collected.

  • Unbiased estimates and CIs were obtained by using logistic mixed-effects models.

  • Accuracy of the instrument was analysed by applying different methods of analysis (sensitivity and specificity, mean absolute relative difference, ISO).

  • Limitations primarily relate to the overall lack of representative study population, reference standard and continuous glucose monitoring device itself.


Preterm infants are at risk of impaired glucose control, especially during the first weeks of life. They are prone to hypoglycaemic episodes,1 2 but they also tend to become hyperglycaemic due to insulin resistance.3–5 Impaired glucose control in neonates is associated with poor neurodevelopmental outcomes, persistent brain injury,6–8 retinopathy, sepsis, intraventricular haemorrhage and death.3 9–11

Standard of care glucose monitoring guidelines for infants typically include testing the glucose level12 either through laboratory tests or by using point of care (PoC) glucometers. Intermittent glucose testing modalities underestimate the number of events and require a higher number of painful heel pricks, with potentially detrimental consequences,1 13 and more blood draws, with loss of important blood growth factors14 and potentially anaemia.15

Continuous glucose monitoring (CGM) devices are an alternative to PoC glucometers. They measure the concentration of glucose in the interstitial fluid every 10 s (online supplemental eFigure 1), providing a mean value every 5 min. They are already well established for children and adults with diabetes,16 and they have also become more suitable for newborns.17 They can minimise the number of blood draws, blood loss and number of undetected events.

This systematic review aims to assess all available evidence on the accuracy of CGM devices in detecting abnormalities of glycaemic control in preterm infants, compared with the intermittent glucose testing modalities that are currently used for this population.


A systematic review and meta-analysis were performed using Cochrane methodology.18


A search strategy was defined and conducted for all relevant published or unpublished material (see Search strategy) on the Cochrane Central Register of Controlled Trials in the Cochrane Library, on PubMed (1996 to September 2019), Embase (1980 to September 2019) and CINAHL (1982 to September 2019). There were no restrictions based on language, year of publication or publication status.

Titles and abstracts were screened for eligibility by two review authors, and full texts of the potentially eligible articles were then assessed for eligibility. Disagreements were resolved through discussion and, if necessary, two other reviewers were consulted. The Preferred Reporting Items for Systematic Reviews and Meta-analyses statement was used to present the results.19

Criteria for inclusion/exclusion

Prospective and retrospective cross-sectional cohort studies were eligible for inclusion in the review, excluding randomised controlled trials (RCTs), case–control studies and case reports. Studies were included if they compared the diagnostic performance of CGM devices with intermittent blood glucose testing by PoC devices. Studies were included if 80% or more of the patients were prematurely born infants (gestational age at birth <37 weeks) admitted to neonatal intensive care units (NICUs) or nurseries. If the percentage was lower or unknown, the authors of these studies were contacted with a request for stratified data, and the studies were only included if stratified data were provided. No restrictions were placed on birth weight, postnatal age or whether the infants had received prior treatment for hypoglycaemia or hyperglycaemia. Studies were excluded if it was not possible to extract accuracy estimates, either directly from the article or from information provided by the authors.

Data extraction

Data on study information, study participants, thresholds used for hypoglycaemia and hyperglycaemia, and the reference standard were extracted from the included studies. The characteristics of the index test and any adverse events were also extracted. Lastly, if possible, we extracted information on the number of true positive (TP), false positive (FP), true negative (TN) and false negative (FN) test results, or these were derived from sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) estimates. Moreover, requests were sent to the authors for either their complete anonymised data sets or aggregate measures when needed.
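The relationship between the four cell counts and the accuracy measures named above can be sketched as follows. This is a minimal illustration of the standard definitions, not the extraction code used in the review; the class name `TwoByTwo` and the example counts are hypothetical, and no guard is included for empty rows or columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2x2 table: index test (CGM) against reference standard."""

    tp: int  # CGM positive, reference positive
    fp: int  # CGM positive, reference negative
    tn: int  # CGM negative, reference negative
    fn: int  # CGM negative, reference positive

    def sensitivity(self) -> float:
        # Proportion of reference-positive values that the CGM also flags.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative values that the CGM also clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of CGM-positive values confirmed by the reference.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of CGM-negative values confirmed by the reference.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only (not data from any included study).
example = TwoByTwo(tp=8, fp=2, tn=180, fn=10)
print(round(example.sensitivity(), 3), round(example.specificity(), 3))
```

Because PPV and NPV depend on how often the condition occurs in the sample, recovering the four counts from published estimates also requires the total number of paired values.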

Study quality assessment

Two review authors independently assessed the risk of bias for each study using the Quality Assessment of Diagnostic Accuracy Studies-2 tool.20

The overall level of certainty of the evidence was assessed using the Grading of Recommendations Assessment, Development and Evaluation approach (GRADE)21 for each outcome (ie, TP/FN (sensitivity, Se) and TN/FP (specificity, Sp) for both hypoglycaemia and hyperglycaemia).

We used 3×3 contingency tables with the reference standard against the index test, each side with hypoglycaemia, normoglycaemia and hyperglycaemia (online supplemental eTable 1).
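A 3×3 table of this kind, and its reduction to one 2×2 table per outcome, can be sketched as below. The thresholds are those used in the meta-analysis of this review (<2.6 mmol/L and >10 mmol/L); the function names, the tuple layout and the example pairs are hypothetical and serve only as illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

HYPO_BELOW = 2.6    # mmol/L, hypoglycaemia threshold used in the meta-analysis
HYPER_ABOVE = 10.0  # mmol/L, hyperglycaemia threshold used in the meta-analysis
CATEGORIES = ("hypo", "normo", "hyper")


def classify(glucose_mmol_l: float) -> str:
    """Assign a glucose value to one of the three glycaemic categories."""
    if glucose_mmol_l < HYPO_BELOW:
        return "hypo"
    if glucose_mmol_l > HYPER_ABOVE:
        return "hyper"
    return "normo"


def three_by_three(pairs: Iterable[Tuple[float, float]]) -> Counter:
    """Count paired values, keyed by (reference category, CGM category).

    Each pair is (cgm_value, reference_value) in mmol/L.
    """
    return Counter((classify(ref), classify(cgm)) for cgm, ref in pairs)


def collapse(table: Counter, target: str) -> Tuple[int, int, int, int]:
    """Reduce the 3x3 table to (TP, FP, TN, FN) for one target category."""
    others = [c for c in CATEGORIES if c != target]
    tp = table[(target, target)]
    fn = sum(table[(target, c)] for c in others)
    fp = sum(table[(r, target)] for r in others)
    tn = sum(table[(r, c)] for r in others for c in others)
    return tp, fp, tn, fn


# Hypothetical (CGM, reference) pairs in mmol/L, for illustration only.
example_pairs = [(2.4, 2.3), (3.1, 2.5), (5.0, 5.2),
                 (11.2, 10.8), (9.5, 10.4), (2.5, 3.0)]
table = three_by_three(example_pairs)
print(collapse(table, "hypo"), collapse(table, "hyper"))
```

Collapsing the same table twice, once per outcome, keeps the hypoglycaemia and hyperglycaemia analyses based on identical paired values.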

Statistical analyses

To obtain unbiased estimates and confidence intervals (CIs) for diagnostic test accuracy parameters, individual patient data (IPD) were requested from included studies, and the most common thresholds in studies were used to calculate sensitivity and specificity estimates in the meta-analysis. Logistic mixed-effects models using random intercepts for patient and study level covariates were used to estimate the association between the CGM test result and presence or absence of hyperglycaemia or hypoglycaemia. Profile CIs were calculated for individual studies and pooled estimates. A bivariate model was used to estimate the summary receiver operating characteristic (sROC) curve and extract the partial area under the curve (AUC) statistic. An AUC of >0.9 was regarded as high diagnostic accuracy, 0.7–0.9 as moderate and 0.5–0.7 as low.22
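The AUC bands above can be written as a small lookup, sketched here for illustration. The text does not say to which band a value of exactly 0.7 or 0.9 belongs; assigning both boundary values to "moderate" is an assumption of this sketch, as are the function name and the label for values below 0.5.

```python
def auc_band(auc: float) -> str:
    """Map an AUC value to the accuracy bands used in this review.

    Assumption: values of exactly 0.7 and 0.9 are placed in "moderate".
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    if auc >= 0.5:
        return "low"
    # Not covered by the bands in the text; label chosen for this sketch.
    return "below chance"


print(auc_band(0.86))
```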

Studies that did not provide IPD were excluded from meta-analysis.

IPD was also used for calculation of the mean absolute relative difference (MARD) of the CGM compared with the reference standard. We also applied the minimum accuracy requirements (EN ISO 15197:2015)23 commonly used for assessing performance of intermittent PoC devices. Difference plots were made to show the range of differences between CGM and reference standard measurements.
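The two accuracy measures can be sketched as follows. MARD is the mean of |CGM − reference| / reference, expressed as a percentage. The ISO check uses the commonly quoted ISO 15197 system-accuracy limits (at least 95% of values within ±15 mg/dL of the reference below 100 mg/dL, and within ±15% at or above 100 mg/dL); these limits are not spelled out in this article and are an assumption of the sketch, as are the function names and the example pairs.

```python
from typing import Sequence, Tuple

Pair = Tuple[float, float]  # (cgm_mmol_l, reference_mmol_l)

MGDL_PER_MMOLL = 18.016  # glucose conversion factor (molar mass about 180.16 g/mol)


def mard_percent(pairs: Sequence[Pair]) -> float:
    """Mean absolute relative difference of CGM against reference, in percent."""
    if not pairs:
        raise ValueError("at least one paired value is required")
    return 100.0 * sum(abs(cgm - ref) / ref for cgm, ref in pairs) / len(pairs)


def fraction_within_limits(pairs: Sequence[Pair],
                           cutoff_mgdl: float = 100.0,
                           abs_limit_mgdl: float = 15.0,
                           rel_limit: float = 0.15) -> float:
    """Fraction of pairs inside the assumed ISO 15197 accuracy limits."""
    if not pairs:
        raise ValueError("at least one paired value is required")
    within = 0
    for cgm, ref in pairs:
        ref_mgdl = ref * MGDL_PER_MMOLL
        diff_mgdl = abs(cgm - ref) * MGDL_PER_MMOLL
        if ref_mgdl < cutoff_mgdl:
            ok = diff_mgdl <= abs_limit_mgdl        # absolute limit at low glucose
        else:
            ok = diff_mgdl <= rel_limit * ref_mgdl  # relative limit at higher glucose
        within += ok
    return within / len(pairs)


def meets_iso_requirement(pairs: Sequence[Pair], required: float = 0.95) -> bool:
    """True if the assumed proportion of values falls inside the limits."""
    return fraction_within_limits(pairs) >= required


# Hypothetical (CGM, reference) pairs in mmol/L, for illustration only.
example_pairs = [(4.0, 5.0), (5.5, 5.0), (10.0, 10.0), (9.0, 12.0)]
print(mard_percent(example_pairs), fraction_within_limits(example_pairs))
```

The two measures answer different questions: MARD averages the relative error over all pairs, whereas the ISO check counts how many individual pairs fall outside fixed limits, so a device can have a low MARD and still fail the requirement.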

Patient and public involvement

Patients or the public were not involved in the design, or conduct, or reporting, or dissemination plans of our research.



The search generated 4481 records and, after removal of duplicates and irrelevant studies, 40 studies were eligible for full-text assessment (online supplemental eFigure 2).

Twenty-five of these studies were excluded because they did not meet the eligibility criteria (online supplemental eTable 2).

Fifteen articles were included24–38 as 14 studies (two records reported on the same population34 36). Two of these studies37 38 had a mixed population consisting of both term and preterm infants. Three of the included studies reported on microdialysis for detection of glucose concentration in the interstitial fluid,25 26 34 36 a technique that precedes the CGM devices of today. These studies were reviewed qualitatively but were not included in the quantitative analysis and risk of bias assessment (see online supplemental file).

Characteristics of included studies

Eleven studies were included24 27–33 35 37 38 (table 1 and online supplemental eTable 3). There was a total of 415 study subjects in these studies, and 8947 paired (ie, CGM and reference standard) glucose values. The majority of studies recruited patients within their first few days of life and studied them for a few consecutive days. One looked at patients with a mean age of 65 days29 and one at patients between 5 and 140 days of age.33 Thresholds most commonly used in included studies were <2.5 and <2.6 mmol/L (range 2.2–2.8 mmol/L) for hypoglycaemia and >10 mmol/L for hyperglycaemia (range 6.7–10 mmol/L). All studies used CGM devices as their index test but calibrated them with different frequencies. The minimum number of calibrations ranged from 2 to 4 per day. One study35 did not state how often the CGM device was calibrated. Almost all studies used PoC glucometers as reference standard, except one30 which used an arterial line to take blood samples for laboratory glucose analysis. The use of CGM devices in preterm infants appeared to be safe, with only a few reports of adverse events such as infection, oedema or problems with the skin at the site of insertion.27 35 39

Table 1

Characteristics of included studies

None of the 11 included studies clearly reported TP, FP, TN and FN; therefore, the authors of all studies were contacted with requests for additional information. Five authors27 29 33 37 38 replied providing IPD, and these studies were included in the meta-analysis. All of them included patients who experienced hypoglycaemic events; however, only two studies had patients with hyperglycaemic events.27 33 Data from three studies33 37 38 were stratified to include only preterm patients.

Beardsall et al24 reported estimates from which it was possible to back-calculate TP, TN, FP and FN, whereas Perri et al28 provided insufficient information for these to be determined.

Meta-analysis and investigations of heterogeneity

Based on common threshold used by included studies, hypoglycaemia and hyperglycaemia were defined in our meta-analysis as blood glucose values <2.6 mmol/L and >10 mmol/L.

MARD was calculated for the five studies for which we received IPD.27 29 33 37 38 Only two studies27 29 showed an MARD lower than 10%; those were the studies with the highest number of paired values. None of the five CGM devices studied satisfied the EN ISO 1519723 requirements when applied (figure 1).

Figure 1

Mean absolute relative difference (MARD) for included studies that provided individual patient data. CGM, continuous glucose monitoring; PoC, point of care.

Meta-analysis results are presented in forest plots (figure 2) and sROC curves (figure 3). The pooled sensitivity of CGM for detecting hypoglycaemia was 0.39 (95% CI 0.12 to 0.74), with a pooled specificity of 0.99 (0.99 to 1.00). Pooled sensitivity for detection of hyperglycaemia by CGM was markedly higher at 0.87 (95% CI 0.81 to 0.92), with a pooled specificity of 0.99 (0.99 to 0.99).

Figure 2

(A, B) Forest plots of sensitivity and specificity of CGM for diagnosis of hypoglycaemia and hyperglycaemia in preterm infants. The figure shows the estimated sensitivity and specificity of each study (blue square) and its 95% CI (black horizontal line). Studies are listed in alphabetical order. (A) Hypoglycaemia. (B) Hyperglycaemia. CGM, continuous glucose monitoring; FN, false negative; FP, false positive; TN, true negative; TP, true positive.

Figure 3

(A, B) Summary ROC curves and AUC of CGM for diagnosis of hypoglycaemia and hyperglycaemia in preterm infants using a threshold <2.6 mmol/L for hypoglycaemia (n=5 studies) and >10 mmol/L for hyperglycaemia (n=2 studies). AUC, area under the curve; CGM, continuous glucose monitoring; sROC, summary ROC.

More heterogeneity between studies was observed for hypoglycaemic than for hyperglycaemic events, owing to the limited number of events within studies. For the same reason, the confidence region (ellipse) of the sROC curve for hypoglycaemia was wider than that for hyperglycaemia. The AUC was 0.70 and 0.86 for hypoglycaemia and hyperglycaemia, respectively.

Certainty of the evidence

GRADE was performed separately for the Se (TP/FN) and Sp (FP/TN) accuracy measures for the two outcomes (table 2). Three of the five studies24 27 29 on hypoglycaemia and one of the two studies on hyperglycaemia27 had a serious risk of bias (figure 4 and online supplement) due to a questionable reference standard, serious enough to downgrade the certainty of the evidence for both Se and Sp measures for the two outcomes. In addition, there was concern about imprecision (figure 2): the CI for the sensitivity estimate for hypoglycaemia was wide, which was considered serious enough to downgrade the level of evidence for the Se estimate of this outcome by one point. Overall, the certainty of the evidence on whether CGM devices can be used to diagnose hypoglycaemia in preterm infants was considered low for the sensitivity estimate and moderate for the specificity estimate. For hyperglycaemia, the overall certainty of the evidence was graded as moderate for both the sensitivity and specificity estimates.

Table 2

Summary of findings tables for each outcome

Figure 4

(A, B) Risk of bias. (A) Risk of bias summary: review authors’ judgements about the risk of bias and applicability concerns for each included study. (B) Methodological quality graph: review authors’ judgements about the risk of bias and applicability concerns in all included studies.


Summary of main findings

The present review is the first to systematically appraise the accuracy of CGM devices in preterm infants. In defining the inclusion criteria, we excluded case–control studies because they tend to overestimate test accuracy owing to the lack of a representative patient population (diseased group vs healthy group). We also excluded RCTs because this design is more appropriate for evaluating the effect on outcomes than for estimating accuracy. We included 11 studies with a total of 415 infants enrolled in NICUs or nurseries in different countries. The overall quality of the included studies was moderate, with few applicability concerns. In the studies included in the meta-analysis, the sensitivity of CGM was low for hypoglycaemia, implying that this device is not sufficiently reliable to detect cases of hypoglycaemia. In contrast, both sensitivity and specificity of CGM devices to detect hyperglycaemia were high in the included studies.

An ongoing Cochrane systematic review is assessing the benefits and harms of the use of CGM devices in preterm infants; however, it does not focus on diagnostic accuracy.40 Furthermore, McKinlay et al41 concluded that several technological issues need to be addressed before CGM devices can be recommended for glucose monitoring in NICUs.

The two measurement systems we analysed in this review (CGM and PoC), although sharing the same purpose, function in completely different ways and are therefore difficult to compare.42 This may explain why the accuracy of two of the devices studied27 29 appears good by MARD (ie, MARD <10%) even though they do not meet the ISO requirements. This demonstrates that MARD should not be trusted as a stand-alone accuracy index.

Overall, our meta-analysis showed that CGM is a diagnostic tool of low to moderate accuracy in detecting low glucose values and of moderate to high accuracy in detecting hyperglycaemic episodes.

Strengths and limitations of the review

A main strength of this review is that we used rigorous Cochrane methods to conduct a wide and systematic literature search and to reduce reviewer error and bias. We contacted authors of the studies eligible for inclusion and collected full stratified data sets including a total of 1706 and 1339 paired values: this allowed us to calculate TP, FP, TN and FN values at a single self-defined threshold, thereby eliminating different thresholds as a potential source of heterogeneity. Moreover, we were able to include in the quantitative analysis those studies with a population of both preterm and term infants, after selecting only paired values coming from the preterm subpopulation.33 37 38 We were able to obtain unbiased estimates and confidence intervals by using logistic mixed-effects models for hypoglycaemia and hyperglycaemia, taking into account repeated measures for each study participant. Lastly, we analysed the accuracy of the instrument by applying different methods of analysis (sensitivity and specificity, MARD, ISO).

The findings of this systematic review should be interpreted in light of some limitations. First, only the five studies whose authors provided IPD could be included in the meta-analysis, and these cover only a small fraction of the total body of evidence (28% of the patients and 20% of the paired values). Moreover, the majority of the paired values included in this way come from two studies with high risk of bias for patient selection.27 29 Second, it is important to consider that the large sample size in this review consists of a large number of paired values taken from a small number of patients. In studies with a low prevalence of hypoglycaemia or hyperglycaemia, all of the altered glucose values might come from a small portion of the study subjects. For these reasons, the pooled estimates have a considerable risk of being biased because of selection bias and because of the lower quality of the studies.

Moreover, we advise against overinterpretation of these results due to the overall lack of a representative study population, reference standard (different brand, different accuracy) and, most of all, CGM device itself. In fact, CGM is an instrument that is still under development, and for this reason there is still high heterogeneity across studies in CGM brands, calibration protocols and systems for analysing data.

For future studies on this subject, a reference standard with a high level of accuracy should be used. Additionally, blinding of the CGM device should be implemented where possible. Results should be reported more thoroughly to facilitate inclusion in future reviews on this subject. MARD cannot be used as a stand-alone measure to evaluate the accuracy of a CGM device; it would be more useful to combine it with other types of analysis, such as the ISO accuracy limits, bias or Error Grid analysis. Lastly, more attention should be given to calibration procedures in research protocols, because the level of glycaemia of a premature infant is usually not suitable for the calibration procedure (<70 mg/dL).

Applicability of findings

The GRADE approach was used to evaluate the certainty of the evidence for both accuracy outcomes. For hypoglycaemia, confidence was low in the sensitivity estimate and moderate in the specificity estimate. This means that further research is very likely to have an important impact on our confidence in these estimates and that it is likely to change the estimates.21 For hyperglycaemia, both the sensitivity and specificity estimates were graded as having a moderate level of certainty.


In conclusion, this review shows that the sensitivity of CGM devices to diagnose hypoglycaemia in preterm infants is poor, but the specificity is high, whereas both the sensitivity and specificity of CGM devices to diagnose hyperglycaemia are high. This suggests that CGM devices should not be trusted to correctly identify all hypoglycaemic episodes, unless used as a supplement to intermittent methods of glucose monitoring. The evidence was graded as low to moderate, meaning that future research might have an impact on the level of confidence in the results. The use of CGM devices could allow the trend of blood glucose in infants to be studied. This is of great importance, considering that there is still no consensus regarding the thresholds for defining hypoglycaemia and hyperglycaemia in newborns,43 nor for the screening and management of those infants.35 44 45


We thank Matthias Bank (library and ICT services, Lund University) for defining and running the search strategy for articles to include in the screening. We thank Dr Iglesias-Platas, Dr Pertierra-Cortada, Dr Nally, Dr Tabery, Dr Wackernagel and Dr Tomotaki for providing additional information about their studies. We thank Miss B. Brambilla for drafting the CGM device figure.



  • Twitter @MBruschettini

  • Contributors CN, MB and LH conceived the study and study design. MB drafted the initial search strategy and executed the search. CN, AMH and FB screened the studies. CN, AMH and FB extracted data and performed the quality assessment. CN, AMH and KJ analysed the data. AMH and CN wrote the manuscript. MB is the guarantor. All authors interpreted the data and wrote and critically reviewed the manuscript and all revisions. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  • Funding This study was supported by governmental ALF research grants to Lund University and Lund University Hospital (grant 2019-YF0035).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Data availability statement Data are available on reasonable request. All data relevant to the study are included in the article or uploaded as online supplemental information. Data can be requested from the corresponding author.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
