Objective To assess (1) how well validated existing paediatric track and trigger tools (PTTT) are for predicting adverse outcomes in hospitalised children, and (2) how effective broader paediatric early warning systems are at reducing adverse outcomes in hospitalised children.
Design Systematic review.
Data sources British Nursing Index, Cumulative Index of Nursing and Allied Health Literature, Cochrane Central Register of Controlled Trials, Database of Abstracts of Reviews of Effectiveness, EMBASE, Health Management Information Centre, Medline, Medline in Process, Scopus and Web of Knowledge searched through May 2018.
Eligibility criteria We included (1) papers reporting on the development or validation of a PTTT or (2) the implementation of a broader early warning system in paediatric units (age 0–18 years), where adverse outcome metrics were reported. Several study designs were considered.
Data extraction and synthesis Data extraction was conducted by two independent reviewers using template forms. Studies were quality assessed using a modified Downs and Black rating scale.
Results 36 validation studies and 30 effectiveness studies were included, with 27 unique PTTT identified. Validation studies were largely retrospective case-control studies or chart reviews, while effectiveness studies were predominantly uncontrolled before-after studies. Metrics of adverse outcomes varied considerably. Some PTTT demonstrated good diagnostic accuracy in retrospective case-control studies (primarily for predicting paediatric intensive care unit transfers), but positive predictive value was consistently low, suggesting potential for alarm fatigue. A small number of effectiveness studies reported significant decreases in mortality, arrests or code calls, but were limited by methodological concerns. Overall, there was limited evidence of paediatric early warning system interventions leading to reductions in deterioration.
Conclusion There are several fundamental methodological limitations in the PTTT literature, and the predominance of single-site studies carried out in specialist centres greatly limits generalisability. With limited evidence of effectiveness, calls to make PTTT mandatory across all paediatric units are not supported by the evidence base.
PROSPERO registration number CRD42015015326
- track and trigger scores
- early warning scores
- clinical deterioration
- systematic review
This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
Strengths and limitations of this study
Paediatric early warning systems and paediatric track and trigger tools (PTTT) are increasingly used by paediatric units across Europe, North America, Australia and elsewhere—this study is a timely review of the evidence for their validity and effectiveness.
A comprehensive search was carried out across multiple databases and included published as well as grey literature.
The review highlights methodological weaknesses and gaps in the current evidence base and makes suggestions for future research.
Heterogeneity in study populations, study designs and outcome measures makes it difficult to compare and synthesise findings across the wide range of early warning systems and PTTT being used in practice.
The review is limited in scope to quantitative validation and effectiveness studies, so must be considered alongside wider literature reflecting on potential secondary benefits of early warning systems and PTTT for communication, teamwork and empowerment.
Failure to recognise and respond to clinical deterioration in hospitalised children is a major safety concern in healthcare. The underlying causes of this problem are clearly multifactorial,1–3 but paediatric ‘early warning systems’ have been strongly advocated as one approach to improving recognition of deterioration in paediatric units.1 2 4
A paediatric ‘early warning system’ can be considered any patient safety initiative or programme which aims to monitor, detect and respond to signs of deterioration in hospitalised children in order to avert adverse outcomes and premature death. Such systems are often multifaceted and may include the use of rapid response teams (RRT) or medical emergency teams (MET), education or training to improve clinical staff’s ability to identify deterioration or strategies aimed at improving staff communication and situational awareness.
An increasingly commonplace paediatric ‘early warning system’ initiative is the use of a ‘track and trigger tool’: these tools, also commonly used in adult care, provide a formal framework for evaluating routine physiological, clinical and observational data for early indicators of patient deterioration. They are typically integrated into routine observation charts or electronic health records and compare patient observations with predefined ‘normal’ thresholds. When one or more observation is considered abnormal, staff are directed to various clinical actions, including but not limited to altered frequency of observations, review by senior staff or more appropriate treatment or management. Tools may be paper based or electronic and monitoring may be automated or manually undertaken by staff.
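The basic mechanism described above (compare each observation with a predefined ‘normal’ range, then direct staff to an action) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the parameter names, the ranges and the escalation ladder are placeholders and are not taken from any published PTTT, nor are they clinical thresholds.

```python
# Hypothetical "normal" ranges for a single illustrative age band.
# These numbers are placeholders for the sketch, NOT clinical thresholds.
NORMAL_RANGES = {
    "heart_rate": (80, 140),         # beats per minute
    "respiratory_rate": (20, 40),    # breaths per minute
    "oxygen_saturation": (94, 100),  # percent
}

# Hypothetical escalation ladder, highest trigger first: (minimum score, action).
ACTIONS = [
    (2, "request review by senior staff"),
    (1, "increase frequency of observations"),
    (0, "continue routine observations"),
]


def score_observations(observations):
    """Count the recorded observations that fall outside their normal range."""
    score = 0
    for name, (low, high) in NORMAL_RANGES.items():
        value = observations.get(name)
        if value is None:
            continue  # a missing observation adds nothing to the score
        if value < low or value > high:
            score += 1
    return score


def recommended_action(score):
    """Return the action attached to the highest trigger level the score reaches."""
    for minimum, action in ACTIONS:
        if score >= minimum:
            return action
    return ACTIONS[-1][1]


observation_set = {"heart_rate": 150, "respiratory_rate": 30, "oxygen_saturation": 90}
total = score_observations(observation_set)
print(total, recommended_action(total))
```

Real tools differ in whether they aggregate weighted item scores or trigger on any single abnormal observation; this sketch uses a simple count only to keep the example short.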
These tools have been referred to in the literature using a number of different terms: paediatric early warning scores (PEWS), paediatric early warning tools (PEWT), track and trigger tools (TTT) and many others. Here, we refer to the tools themselves using the term ‘paediatric track and trigger tools’ (PTTT). A variety of PTTT have been developed, typically by teams based in specialist paediatric centres and often used as a means of triggering a dedicated response team. Advocacy for these tools has recently led to widespread uptake across a variety of different paediatric units, including many non-specialist centres where patient populations and resources may differ. In the UK, a recent cross-sectional survey found that 85% of paediatric units were using some form of PTTT, most of which were non-specialist centres without a dedicated response team.5 Despite their widespread use, recent reviews have questioned the evidence base for their effectiveness in improving patient outcomes.6 7 The current review aimed to build on this work, assessing in depth the evidence base for both the validity of PTTT for predicting inpatient deterioration and the effectiveness of broader ‘early warning systems’ at reducing instances of mortality and morbidity in paediatric settings:
Question 1: how well validated are existing PTTT and their component parts for predicting inpatient deterioration?
Question 2: how effective are paediatric early warning systems (with or without a PTTT) at reducing mortality and critical events?
This systematic review is reported in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines.8 Our review protocol is registered with the PROSPERO database (CRD42015015326).
A comprehensive search was conducted across a range of databases to identify relevant studies in the English language. Published and unpublished literature was considered where publicly available, as were studies in press. The following databases were searched through May 2018: British Nursing Index, Cumulative Index of Nursing and Allied Health Literature, Cochrane Central Register of Controlled Trials, Database of Abstracts of Reviews of Effectiveness, EMBASE, Health Management Information Centre, Medline, Medline in Process, Scopus and Web of Knowledge (Science Citation Indexes). To identify additional published, unpublished or grey literature, a range of relevant websites and trial registers were searched, including ClinicalTrials.gov. To identify published papers that had not yet been catalogued in the electronic databases, recent editions of key journals were hand-searched. The search terms included ‘early warning scores’, ‘alert criteria’, ‘rapid response’, ‘track and trigger’ and ‘early medical intervention’ (see online supplementary table 1).
Eligibility screening and study selection
PICOS parameters guided inclusion criteria for the validation and effectiveness studies (see online supplementary table 2). Papers reporting development or validation of a PTTT were included for question 1, whereas papers reporting the implementation of any broader ‘paediatric early warning system’ (with or without a PTTT) were eligible for question 2. Both research questions were limited to studies that involved inpatients aged 0–18 years. Outcome measures considered were mortality and critical events, including: unplanned admission to a higher level of care, cardiac arrest, respiratory arrest, medical emergencies requiring immediate assistance, children reviewed by paediatric intensive care unit (PICU) staff on the ward (in specialist centres) or reviewed by external PICU staff (for non-specialist centres), acuity at PICU admission and PICU outcomes. A range of study designs were considered for both questions.
Two of the review authors independently screened the titles and abstracts yielded in the search. Full texts were reviewed independently by six reviewers against the above eligibility criteria and were assigned to the relevant review question if included. Reasons for exclusion were recorded. Separate data extraction forms were developed for validation and effectiveness studies. The forms had common elements (study design, country, setting, study population, description of the PTTT or early warning system, statistical techniques used, outcomes assessed). Additional data items for validation studies included the items in the PTTT, modifications to the PTTT from previous versions, predictive ability of individual items and the overall tool, sensitivity and specificity and inter-rater and intra-rater reliability. Effectiveness studies included an assessment of outcomes in terms of mortality and various morbidity variables. Data extraction was carried out by two reviewers and discrepancies were resolved by discussion. For effectiveness studies, effect sizes and 95% CIs were calculated or reported as risk ratios (RR) or ORs as appropriate, with p values reported to assess statistical significance. Data analysis was conducted using an online medical statistics tool.
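For readers unfamiliar with how a risk ratio and its 95% CI are derived from event counts, the following is a minimal sketch using the standard log-transformation method. It is not necessarily the calculation performed by the online statistics tool used in this review, and the function name and the counts in the example are hypothetical.

```python
import math


def risk_ratio(events_post, total_post, events_pre, total_pre, z=1.96):
    """Risk ratio (post versus pre) with a log-method confidence interval.

    z=1.96 gives an approximate 95% interval.
    """
    risk_post = events_post / total_post
    risk_pre = events_pre / total_pre
    rr = risk_post / risk_pre
    # Standard error of log(RR) for two independent proportions.
    se_log_rr = math.sqrt(
        1 / events_post - 1 / total_post + 1 / events_pre - 1 / total_pre
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 40 events in 10 000 admissions before the intervention,
# 20 events in 10 000 admissions afterwards.
rr, lower, upper = risk_ratio(20, 10_000, 40, 10_000)
print(f"RR={rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant change at the 5% level; because the interval is symmetric on the log scale, the point estimate sits closer to the lower bound than to the upper bound on the natural scale.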
Patient and public involvement
This review was conducted as part of a larger mixed-methods study (ISRCTN94228292), which used a formal, facilitated parental advisory group. The group comprised parents of children who had experienced an unexpected adverse event in a paediatric unit and provided input which helped to shape the broader research questions and outcome measures. The results of the review will be disseminated to parents through this group.
Figure 1 shows the PRISMA flow diagram for both research questions.
Table 1 summarises the study characteristics of validation and effectiveness papers in the review.
Types of PTTT and components
Across 66 studies, we identified 27 unique PTTT (table 2). Twenty PTTTs were based on one of four different tools: Monaghan’s Brighton PEWS,10 the Bedside PEWS,11 the Bristol PEWT12 and the Melbourne Activation Criteria (MAC).13 Other PTTT described in the literature included the National Health Service Institute for Innovation and Improvement (NHS III) PEWS14 (the second most commonly used PTTT in UK paediatric settings5), RRT and MET activation criteria15–18 and one prediction algorithm developed from a large dataset of electronic health data.19
Table 2 illustrates the range of physiological and behavioural parameters underpinning PTTT. Common parameters included heart rate (present in 26 out of 27 PTTT), respiratory rate (24), respiratory effort (24) and level of consciousness or behavioural state (24). All PTTT required at least six different parameters to be collected.
Question 1: how well validated are PTTT and component parts for predicting inpatient deterioration?
Nine validation papers meeting inclusion criteria were excluded from analysis: eight did not report any performance characteristics of the PTTT for predicting deterioration20–27 and one study calculated incorrect sensitivity/specificity outcomes12 (see online supplementary table 4). The remaining 27 validation studies, evaluating the performance of 18 unique PTTT, are described in table 3. Four studies evaluated multiple PTTTs3 19 28 29 and one paper described three separate studies of the same PTTT.30
Five cohort studies were included,14 31–34 three based on the same dataset. All other studies were either case-control or chart reviews. Thirteen papers implemented the PTTT in practice,23 30 31 34–43 while the remaining studies ‘bench tested’ the PTTT—researchers retrospectively calculated the score based on data abstracted from medical charts and records. All studies were conducted in specialist centres with only one multicentre study reported.44
PTTT were evaluated for their ability to predict a wide range of clinical outcomes. Composite measures were used in 8 studies,14 23 29 32 33 37 45 46 cardiac/respiratory arrest or a ‘code call’ was used (singularly or as part of a composite outcome) in 6 studies,23 28 29 37 45 47 while 22 studies used transfer to a PICU or paediatric high-dependency unit as the main outcome.3 11 19 23 28–34 36 37 39 41–44 46 48 49
Predictive ability of individual PTTT components
Three validation papers reported on the performance characteristics of individual components of the tool for predicting adverse outcomes.11 33 42 Parshuram et al, for instance, reported area under the receiver operating characteristic curve (AUROC) values for individual PTTT items of a pilot version of the Bedside PEWS: ranging from 0.54 (bolus fluid) to 0.81 (heart rate), compared with 0.91 for the overall PTTT.11 All other studies reported outcomes for the PTTT as a whole.
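The AUROC values quoted throughout this section can be read as the probability that a randomly chosen child who deteriorated had a higher score than a randomly chosen child who did not (ties counting as half). A minimal sketch of that interpretation follows; the function name and the scores are hypothetical and do not come from any included study.

```python
def auroc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical tool scores for children who did (cases) and did not (controls) deteriorate.
cases = [5, 7, 8]
controls = [1, 2, 5, 3]
print(round(auroc(cases, controls), 3))
```

A value of 0.5 indicates discrimination no better than chance, and 1.0 indicates that every case scored higher than every control.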
The predictive ability of the 16-item PEWS score was assessed by one internal47 (AUROC=0.90) and two external case-control studies28 29 (AUROC range=0.82–0.88) with a range of outcome measures and scoring thresholds. One case-control study used an observed prevalence rate to calculate a positive predictive value (PPV) of 4.2% for the tool in predicting code calls47 (for every 1000 patients triggering the PTTT, 42 would be expected to deteriorate).
Bedside PEWS and derivatives
The Bedside PEWS was evaluated in one internal11 (AUROC=0.91) and five external case-control studies19 28 29 44 46 (AUROC range=0.73–0.90) for a range of different outcome measures and at different scoring thresholds. One case-control study calculated a PPV of 2.1% for identifying children requiring urgent PICU transfer within 24 hours of admission, based on locally observed prevalence rates.19 A modified version of the Bedside PEWS (with temperature added) demonstrated an AUROC of 0.86 in an external case-control study with a composite outcome of death, arrest or unplanned PICU transfer.29
Brighton PEWS and derivatives
Six different PTTT based on the original Brighton PEWS were evaluated across 11 studies.19 29 31 37 39–42 45 48 50 The Modified Brighton PEWS (a) was evaluated for its ability to predict PICU transfers in one large prospective cohort study (AUROC=0.92, PPV=5.8%),31 and an external case-control study tested the same score for predicting urgent PICU transfers within 24 hours of admission (AUROC=0.74, PPV=2.1%).19
An external case-control study used a composite measure of death, arrest or PICU transfer to evaluate the Modified Brighton PEWS (b) (AUROC=0.79) and the Modified Brighton PEWS (d) (AUROC=0.74).29 The latter tool was evaluated in a further internal case-control study for predicting PICU transfer (AUROC=0.82).48
The Children’s Hospital Early Warning Score (CHEWS) had a reported AUROC of 0.90 for predicting PICU transfers or arrests in a large internal case-control study.50 A modification for cardiac patients, the Cardiac CHEWS (C-CHEWS), was evaluated by one internal study on a cardiac unit37 (AUROC=0.90) looking at arrests or unplanned PICU transfers, and two external studies of oncology/haematology units41 42 for the same outcome (AUROC=0.95). Finally, the Children’s Hospital Los Angeles PEWS was evaluated in a small internal case-control study for prediction of re-admission to PICU after initial PICU discharge40 (AUROC=0.71).
MAC and derivatives
The MAC was assessed by one external case-control study with an outcome of death, arrest or unplanned PICU transfer29 (AUROC=0.71) and a large external cohort study with an outcome of death or unplanned PICU or HDU transfer33 (AUROC=0.79, PPV=3.6%). A derivative of the MAC using an aggregate score, the Cardiff and Vale PEWS (C&VPEWS), was tested using the same cohort and outcome measures in an earlier external study (AUROC=0.86, PPV=5.9%)32 and was the best performing PTTT in an external case-control study evaluating multiple PTTT29 (AUROC=0.89).
The Bristol PEWT was evaluated by five external validation studies: two chart review studies3 35 (no AUROC), one small cohort study of PICU transfers34 (AUROC=0.91, PPV=11%), and two case-control studies looking at code calls28 (AUROC=0.75) and a composite of death, arrests and PICU transfers29 (AUROC=0.62).
The NHS III PEWS was tested by one external cohort study looking at a composite of death or unplanned transfers to PICU or HDU14 (AUROC=0.88, PPV=4.3%) and one external case-control study looking at a composite of death, arrests and PICU transfers29 (AUROC=0.82). Zhai et al developed and retrospectively evaluated a logistic regression algorithm in an internal case-control study looking at urgent PICU transfers in the first 24 hours of admission19 (AUROC=0.91, PPV=4.8%).
Across PTTT, studies reporting performance characteristics of a tool at a range of different scoring thresholds demonstrate the expected interaction and trade-off between sensitivity and specificity—at lower triggering thresholds, sensitivity is high but specificity is low; at higher thresholds, the opposite is true.
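The trade-off described above can be reproduced with a few lines of code. This is a sketch on hypothetical scores only; the function name, the scores and the two thresholds are assumptions chosen to make the pattern visible.

```python
def sensitivity_specificity(case_scores, control_scores, threshold):
    """Treat a score at or above `threshold` as a trigger."""
    true_positives = sum(score >= threshold for score in case_scores)
    true_negatives = sum(score < threshold for score in control_scores)
    return (
        true_positives / len(case_scores),
        true_negatives / len(control_scores),
    )


# Hypothetical scores for children who did (cases) and did not (controls) deteriorate.
cases = [3, 5, 6, 7, 8]
controls = [0, 1, 1, 2, 3, 4, 5, 2]

for threshold in (2, 6):
    sensitivity, specificity = sensitivity_specificity(cases, controls, threshold)
    print(threshold, sensitivity, specificity)
```

With these illustrative data, the lower threshold catches every case but triggers on most controls, while the higher threshold triggers on no controls but misses two of the five cases.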
Inter-rater reliability and completeness of data
Accurate assessment of the ability of a PTTT to predict clinical deterioration is contingent on accuracy and reliability of tool scoring (whether by bedside nurses in practice or by researchers abstracting data) and the availability of underpinning observations. Only five papers made reference to accuracy or reliability of scoring,28 31 37 42 45 with mixed results: for example, two nurses separately scoring a subset of patients on the Modified Brighton PEWS (a) achieved an intra-class coefficient of 0.92,31 but a study nurse and bedside nurse achieved only 67% agreement in scoring the C-CHEWS tool.37 Completeness of data was reported in 11 studies.11 14 19 29 30 32 33 42 44 45 47 An evaluation of the Modified Bedside PEWS (a) reported that ‘the PEWS was correctly performed and could be used for inclusion in the study’ in 59% of cases,30 a prospective study bench-testing the C&VPEWS found an average completeness rate of 44% for the seven different parameters in daily practice,32 while a multicentre study of the Bedside PEWS reported that ‘only 5.1% (of observation sets) had measurements on all seven items'.44
Question 2: how effective are early warning systems at reducing mortality and critical events in hospitalised children?
Eleven papers meeting inclusion criteria were excluded from analysis for providing insufficient statistical information (eg, denominator data, absolute numbers of events) to calculate effect sizes.39 51–59 Further details on papers excluded from analysis are provided in online supplementary table 5. Findings from the 19 studies included in the analysis are summarised in table 4.
Type of early warning system interventions
Seventeen interventions involved the introduction of a new PTTT,13 15–18 60–72 one intervention introduced a mandatory triggering element to an existing PTTT71 and one study reported a large, multicentre analysis of MET introduction with no details on PTTT use.73 Twelve interventions included the introduction of a new MET or RRT,13 15–18 60–65 69 while four further interventions introduced a new PTTT in a hospital with an existing MET or RRT. Only three studies therefore evaluated a PTTT in the absence of a dedicated response team.67 68 70 A staff education programme was explicitly described in 10 interventions.13 15 17 61 62 64 67 68 70 72
Of the 18 studies that used a PTTT, only 7 used a tool that had been formally evaluated for validity: 3 used the Bedside PEWS,64 65 70 2 used the MAC,13 62 1 used the Modified Brighton PEWS (b)72 and 1 used the C-CHEWS.67 One study did not report the PTTT used,61 while 10 studies used a variety of calling criteria and local modifications to validated tools that had not been evaluated for validity.15–18 60 63 66 68 69 71
Mortality (ward or hospital wide)
Two uncontrolled before-after studies (both with MET/RRT) reported significant mortality rate reductions postintervention: one in hospital wide deaths per 100 discharges17 (RR=0.82, 95% CI 0.70 to 0.95) and one in total hospital deaths per 1000 admissions (RR=0.65, 95% CI 0.57 to 0.75) and deaths on the ward (‘unexpected deaths’) per 1000 admissions62 (RR=0.35, 95% CI 0.13 to 0.92). Seven studies found no reductions in mortality, including two high-quality multicentre studies.13 15 60 63–65 73 Parshuram et al conducted a cluster randomised trial and found no difference in all-cause hospital mortality rates between 10 hospitals randomly selected to receive an intervention centred around use of the Bedside PEWS and 11 usual care hospitals, 1-year postintervention (OR=1.01, 95% CI 0.61 to 1.69).64 Kutty et al 73 assessed the impact of MET implementation in 38 US paediatric hospitals with an interrupted time series study, and reported no difference in the slope of hospital mortality rates 5 years postintervention and the expected slope based on preimplementation trends (OR=0.94, 95% CI 0.93 to 0.95).
Two uncontrolled before-after studies (both with MET/RRT) reported a significant postintervention reduction in rates of PICU mortality among ward transfers (RR=0.31, 95% CI 0.13 to 0.72),18 and PICU mortality rates among patients readmitted within 48 hours (RR=0.43, 95% CI 0.17 to 0.99).63 Six studies (including a high-quality cluster randomised trial and interrupted time series study) reported no postintervention change in PICU mortality using a variety of metrics.64–69
Cardiac and respiratory arrests
Two uncontrolled before-after studies (both with RRT/MET) reported significant postintervention rate reductions in subcategories of cardiac arrests: one in ‘near cardiopulmonary arrests’63 (RR=0.54, 95% CI 0.52 to 0.57) but not ‘actual cardiopulmonary arrests’ and one in ‘preventable cardiac arrests’62 (RR=0.45, 95% CI 0.20 to 0.97) but not ‘unexpected cardiac arrests’. One uncontrolled before-after study (with RRT/MET) reported a significant postintervention reduction in rates of ward respiratory arrests per 1000 patient-days16 (RR=0.27, 95% CI 0.07 to 0.95). Seven studies (including one high-quality cluster randomised trial and one high-quality interrupted time series study) found no change in cardiac arrest rates using a variety of metrics13 15 16 61 64 65 or cardiac and respiratory arrests combined.60
Calls for urgent review/assistance
Two uncontrolled before-after studies (both with RRT/MET) reported significant postintervention reductions in rates of code calls17 63 (RR=0.29, 95% CI 0.10 to 0.65; RR=0.71, 95% CI 0.61 to 0.83) while three studies found no change in rates of code calls.15 18 72 One uncontrolled before-after study in a community hospital (without RRT/MET) found significant postintervention reductions in rates of urgent calls to the in-house paediatrician (RR=0.23, 95% CI 0.11 to 0.46) and respiratory therapist70 (RR=0.36, 95% CI 0.13 to 0.95). Two uncontrolled before-after studies (with RRT/MET) found increases in rates of RRT calls72 (RR=1.59, 95% CI 1.33 to 1.90) and outreach team calls66 (RR=1.92, 95% CI 1.79 to 2.07). One study found no change in rates of RRT calls.71
One uncontrolled before-after study (without RRT/MET) found a significant postintervention decrease in the rate of unplanned PICU transfers per 1000 patient-days67 (RR=0.70, 95% CI 0.56 to 0.88). Four studies (including one high-quality cluster randomised trial and one high-quality interrupted time series study) found no change in rates of PICU admissions postintervention.64–66 70
Two studies, one interrupted time series and one multicentre cluster randomised trial (both with RRT/MET), found significant reductions in rates of ‘critical deterioration events’ (life-sustaining interventions administered within 12 hours of PICU admission) relative to preimplementation trends and relative to control hospitals, respectively (IRR=0.38, 95% CI 0.20 to 0.75; OR=0.77, 95% CI 0.61 to 0.97).64 65 One controlled before-after study (without RRT/MET) reported a significant reduction in rates of invasive ventilation given to emergency PICU admissions postintervention (RR=0.83, 95% CI 0.72 to 0.97), with no significant change observed in a control group of patients admitted to PICU from outside of the hospital.68 One uncontrolled before-after study reported a significant postintervention decrease in rates of PICU admissions receiving mechanical ventilation (RR=0.85, 95% CI 0.73 to 0.99), but an increase in rates of early intubation (RR=1.87, 95% CI 1.33 to 2.62).69
Only three studies reported outcomes relating to the quality of implementation of the intervention. One study reported that 99% of audited observation sets of the Bedside PEWS had at least five vital signs present postintervention, up from 76% preintervention (no change in control hospitals).64 A previous study of the same PTTT reported that 3% of audited cases had used the incorrect age chart, but reported an intraclass coefficient of 0.90 for agreement between the scores assigned by bedside nurses in practice and those assigned retrospectively by research nurses.70 Finally, error rates in C-CHEWS scoring were reported to have reduced from an initial 47% to below 10% by the end of the study.67
This paper reviewed the published PTTT and early warning system literature in order to assess the validity of PTTT for predicting inpatient deterioration (question 1) and the effectiveness of early warning system interventions (with or without PTTT) for reducing mortality and morbidity outcomes in hospitalised children (question 2). We believe that the consideration of broader ‘early warning systems’ differentiates this paper from previous reviews, as does the inclusion of two recently published high-quality effectiveness studies.64 73
How well validated are existing tools for predicting inpatient deterioration?
Given a growing understanding and emphasis on the importance of local context in healthcare interventions, it is perhaps not surprising that such a wide range of PTTT have been developed and evaluated internationally, and modifications to existing PTTT are common. The result, however, is that a large number of different PTTT have been narrowly validated, but none has been broadly validated across a variety of different settings and populations. With only one exception,44 all studies evaluating the validity of PTTT have been single-centre reports from specialist units, greatly limiting the generalisability of the findings.
PTTT such as the Bedside PEWS, C&VPEWS, NHS III PEWS and C-CHEWS have demonstrated very good (AUROC ≥0.80) or excellent (AUROC ≥0.90) diagnostic accuracy, typically for predicting PICU transfers, in internal and external validation studies.11 14 19 29 32 37 42 44 However, methodological issues common to the validation studies mean that such results need to be interpreted with a degree of caution. First, each of the studies was conducted in a clinical setting where paediatric inpatients are subject to various forms of routine clinical intervention throughout their admission. There are numerous statistical modelling techniques which can account for co-occurrence of clinical interventions and the longitudinal nature of the predictors,74 75 but none of these were used in the validation studies and so estimates of predictive ability are likely to be distorted. Indeed, the majority of outcomes used in the validation studies are clinical interventions themselves (eg, PICU transfer). Second, while it is understandable that a majority of studies ‘bench-tested’ the PTTT rather than implementing it in practice before evaluation, the process of abstracting PTTT scores retrospectively from patient charts and medical records introduces a number of sources of potential bias or inaccuracy. For instance, several studies reported either high levels of missing data (ie, some of the observations required to populate the PTTT score being evaluated were not routinely collected or recorded and so were scored as ‘normal’)11 19 32 44 45 or difficulty in abstracting certain descriptive or subjective PTTT components.19 28 41 49 Assuming that missing values are normal and excluding some PTTT items from analysis are both likely to result in underscoring of the PTTT and to skew the results.
Finally, studies which evaluated a PTTT that had been implemented in practice are at risk of overestimating the ability of PTTT to predict proxy outcomes such as PICU transfer, inasmuch as high PTTT scores or triggers automatically direct staff towards escalation of care, or clinical actions which make escalation of care more likely.
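The underscoring that results from treating missing observations as ‘normal’, noted among the limitations above, can be shown with a small sketch. The item names and item scores are hypothetical, and the scoring rule (missing items contribute zero) is an assumption standing in for the ‘scored as normal’ approach described in the bench-testing studies.

```python
# Hypothetical item scores for one patient (0 = normal, higher = more abnormal).
item_scores = {
    "heart_rate": 2,
    "respiratory_rate": 1,
    "respiratory_effort": 2,
    "behaviour": 0,
}


def total_score(scores, recorded):
    """Sum item scores, scoring any item not in `recorded` as 0 ('normal')."""
    return sum(
        score if name in recorded else 0
        for name, score in scores.items()
    )


# Score when every item was recorded in the chart.
full = total_score(item_scores, set(item_scores))
# Score when only two of the four items were recorded.
partial = total_score(item_scores, {"heart_rate", "behaviour"})
print(full, partial)
```

Because item scores cannot be negative, the score computed from incomplete records can never exceed the score from complete records, so any bias from this approach runs in the direction of underscoring.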
The findings reported in several PTTT studies point towards two potential challenges for some centres in implementing and sustaining a PTTT in clinical practice. First, as noted above, a number of studies that retrospectively ‘bench-tested’ a PTTT reported that the observations that were required to score the tool were not always routinely collected or recorded in their centre. It may be that the introduction of a PTTT into practice would help create a framework to ensure that core vital signs and observations were collected more routinely (as demonstrated by Parshuram et al 64), but this would obviously have resource implications that could be a potential barrier for some centres. Such considerations are important, as evidence from the adult literature points to the potential for tools to inadvertently mask deterioration when core observations are missing.76 Second, PPVs reported in cohort studies, and in case-control studies that adjusted for outcome prevalence, were uniformly low (between 2.1% and 5.9%).14 19 31–33 47 They demonstrate that even PTTT with good predictive performance are likely to generate a large number of ‘false alarms’ because adverse outcomes are so rare. For some centres, these issues may be mitigated to some extent by dedicated response teams or other available resources, but other hospitals may not be able to sustain the increased workload of responding to PTTT triggers.
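The dependence of PPV on outcome prevalence follows directly from Bayes' theorem and can be checked with a short sketch. The sensitivity, specificity and prevalence values below are hypothetical and chosen only to show the pattern; the 4.2% figure in the last line is the PPV reported for one tool in the validation results.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Proportion of triggers that are followed by the adverse outcome."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


def false_alarms_per_true_alarm(ppv):
    """Number of triggers without deterioration for each trigger with deterioration."""
    return (1 - ppv) / ppv


# Hypothetical tool with 80% sensitivity and 90% specificity.
rare = positive_predictive_value(0.80, 0.90, prevalence=0.005)
common = positive_predictive_value(0.80, 0.90, prevalence=0.20)
print(f"PPV at 0.5% prevalence: {rare:.1%}")
print(f"PPV at 20% prevalence: {common:.1%}")
print(f"False alarms per true alarm at PPV 4.2%: {false_alarms_per_true_alarm(0.042):.1f}")
```

The same hypothetical tool yields a PPV below 4% when the outcome affects 1 in 200 patients but around 67% when it affects 1 in 5, which is why high AUROC values and low PPVs can coexist in the validation literature.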
How effective are early warning systems for reducing mortality and morbidity?
We found limited evidence for early warning system interventions reducing mortality or arrest rates in hospitalised children. While some effectiveness papers did report significant reductions in rates of mortality (on the ward or in PICU) or cardiac arrests after implementation of different early warning system interventions,16–18 62 63 they were all uncontrolled before-after studies which have inherent limitations in terms of establishing causality. They do not preclude the possibility that outcome rates would have improved over time regardless of the intervention77 or changes were caused by other factors, and their inclusion is accordingly discouraged by some Cochrane review groups.78 Three high-quality multicentre studies—two interrupted time series studies and a recent cluster randomised trial—found no changes in rates or trends of mortality or arrests postintervention.64 65 73
There was also limited evidence for early warning systems reducing PICU transfers or calls for urgent review. Again, a small number of uncontrolled before-after studies reported significant reductions postintervention,15 17 63 but several other studies reported significant increases in transfers or calls for review66 72 or no postintervention changes. We did find moderate evidence across four studies—including a controlled before-after study, a multicentre interrupted time series study and a multicentre cluster randomised trial—for early warning system interventions reducing rates of early critical interventions in children transferred to PICU.64 65 68 69 Such results are promising, but corresponding reductions in hospital or PICU mortality rates have not yet been reported.
Implementing complex interventions in a healthcare setting is challenging, and evidence from the adult literature points to challenges and barriers to successfully implementing TTT in practice.79–81 However, given that so few effectiveness studies reported on implementation outcomes, it is difficult to know whether negative findings reflect poor effectiveness or poor implementation of early warning systems. Again, effectiveness studies were predominantly carried out in specialist centres and, in all but three cases,67 68 70 involved the use of a dedicated response team, which greatly limits the generalisability of findings outside of these contexts.
Limitations of the review
There are several limitations of the current review. First, despite purposely widening the scope of the effectiveness review question to include paediatric ‘early warning systems’ with or without a PTTT, we identified very few studies that did not employ a PTTT as part of the intervention. In part, this likely reflects the fact that PTTT have become almost synonymous with early warning systems, but it is also possible that our search strategy missed some broader early warning system initiatives that were not explicitly labelled as such. Second, our inclusion criteria for study selection were deliberately broad, and so several validation and effectiveness studies were included that were subsequently excluded from analysis due to insufficient statistical detail or methodological issues. Third, the scope of the current review was limited to consideration of quantitative validation and effectiveness studies. We are mindful of research suggesting that implementing PTTT in practice may confer secondary benefits including, but not limited to, improvements in communication, teamwork and empowerment of junior staff to call for assistance.82–84 Finally, we opted not to conduct a meta-analysis of effectiveness findings due to the heterogeneity of outcome metrics, interventions, study designs, populations and settings. Given the large sample sizes required to detect changes in rare adverse events, we believe further work is needed to harmonise outcome measures used to evaluate early warning system interventions internationally, in order to facilitate pooling of findings across studies.
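The scale of the sample size problem for rare events can be sketched with the standard normal-approximation formula for comparing two independent proportions. The event rates, significance level and power below are hypothetical, the function name is our own, and the calculation ignores clustering by hospital, which would increase the required numbers further.

```python
import math
from statistics import NormalDist


def patients_per_group(rate_control: float,
                       rate_intervention: float,
                       alpha: float = 0.05,
                       power: float = 0.80) -> int:
    """Approximate patients needed per group to detect a difference between
    two independent proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two groups.
    variance = (rate_control * (1.0 - rate_control)
                + rate_intervention * (1.0 - rate_intervention))
    effect = rate_control - rate_intervention
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenarios: each assumes the intervention halves the event rate.
for baseline in (0.10, 0.01, 0.001):
    n = patients_per_group(baseline, baseline / 2.0)
    print(f"baseline rate {baseline:.1%}: about {n} patients per group")
```

For the same relative reduction, the required sample grows sharply as the baseline event becomes rarer, which is why pooling across studies with harmonised outcome measures is likely to be needed.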
The PTTT literature is currently characterised by an ‘absence of evidence’ rather than an ‘evidence of absence’. PTTT seem like a logical tool for helping staff detect and respond to deteriorating patients, but the existing evidence base is too limited to form clear judgements of their utility. We would argue that there has been too much confidence placed in the statistical findings of validation studies of PTTT, given methodological limitations in the study designs. There is evidence of consistently high false-alarm rates, and bench-testing studies point to many PTTT parameters not being reliably recorded in practice; as such, there is reason for caution in considering the viability of PTTT for all hospitals. Almost all of the early warning systems and PTTT reported in the literature have been developed and evaluated in specialist centres, typically in units with access to dedicated response teams, yet PTTT appear to be commonly adopted by non-specialist units with little modification. There is currently limited evidence that ‘early warning systems’ incorporating a PTTT reduce deterioration or death in practice. As such, we would urge caution among policymakers in calling for their use to become mandatory across all hospitals. We acknowledge the potential for PTTT to confer a range of secondary benefits in areas such as communication, teamwork and empowerment of junior staff. More work is required to understand the wider impact of PTTT implementation in different clinical settings before it is possible to evaluate their overall contribution to the wider safety mechanisms and systems aimed at identifying and responding to deterioration in paediatric patients.
The authors would like to acknowledge the contribution of Dr James Bunn to the review.
Contributors RT: screening and review of papers, contribution to design of work, preparation of manuscript; CH: screening and review of papers, contribution to concept and design of work, review of manuscript; FVL-W: contribution to design of work, screening and review of papers, review of manuscript; KH: contribution to concept and design of work, screening and review of papers, review of manuscript; CP, DR, BM, AO, DE, RS, GS, DL, LNT, DA, AL, ET-J: contribution to concept and design of work, screening and review of papers, review of manuscript; MM: information specialist, review of manuscript.
Funding This study is funded by the National Institute for Health Research (NIHR) Health Services and Delivery Research (HS&DR) programme (12/178/17).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All data relevant to the study are included in the article or uploaded as supplementary information.
Patient consent for publication Not required.