Validity and effectiveness of paediatric early warning systems and track and trigger tools for identifying and reducing clinical deterioration in hospitalised children: a systematic review

Objective To assess (1) how well validated existing paediatric track and trigger tools (PTTT) are for predicting adverse outcomes in hospitalised children, and (2) how effective broader paediatric early warning systems are at reducing adverse outcomes in hospitalised children. Design Systematic review. Data sources British Nursing Index, Cumulative Index of Nursing and Allied Health Literature, Cochrane Central Register of Controlled Trials, Database of Abstracts of Reviews of Effectiveness, EMBASE, Health Management Information Centre, Medline, Medline in Process, Scopus and Web of Knowledge searched through May 2018. Eligibility criteria We included (1) papers reporting on the development or validation of a PTTT or (2) the implementation of a broader early warning system in paediatric units (age 0–18 years), where adverse outcome metrics were reported. Several study designs were considered. Data extraction and synthesis Data extraction was conducted by two independent reviewers using template forms. Studies were quality assessed using a modified Downs and Black rating scale. Results 36 validation studies and 30 effectiveness studies were included, with 27 unique PTTT identified. Validation studies were largely retrospective case-control studies or chart reviews, while effectiveness studies were predominantly uncontrolled before-after studies. Metrics of adverse outcomes varied considerably. Some PTTT demonstrated good diagnostic accuracy in retrospective case-control studies (primarily for predicting paediatric intensive care unit transfers), but positive predictive value was consistently low, suggesting potential for alarm fatigue. A small number of effectiveness studies reported significant decreases in mortality, arrests or code calls, but were limited by methodological concerns. Overall, there was limited evidence of paediatric early warning system interventions leading to reductions in deterioration. 
Conclusion There are several fundamental methodological limitations in the PTTT literature, and the predominance of single-site studies carried out in specialist centres greatly limits generalisability. With limited evidence of effectiveness, calls to make PTTT mandatory across all paediatric units are not supported by the evidence base. PROSPERO registration number CRD42015015326


Strengths and limitations of this study
► Paediatric early warning systems and paediatric track and trigger tools (PTTT) are increasingly used by paediatric units across Europe, North America, Australia and elsewhere; this study is a timely review of the evidence for their validity and effectiveness.
► A comprehensive search was carried out across multiple databases and included published as well as grey literature.
► The review highlights methodological weaknesses and gaps in the current evidence base and makes suggestions for future research.
► Heterogeneity in study populations, study designs and outcome measures makes it difficult to compare and synthesise findings across the wide range of early warning systems and PTTT being used in practice.
► The review is limited in scope to quantitative validation and effectiveness studies, so must be considered alongside wider literature reflecting on potential secondary benefits of early warning systems and PTTT for communication, teamwork and empowerment.

BACKGROUND
Failure to recognise and respond to clinical deterioration in hospitalised children is a major safety concern in healthcare. The underlying causes of this problem are clearly multifactorial, 1-3 but paediatric 'early warning systems' have been strongly advocated as one approach to improving recognition of deterioration in paediatric units. 1 2 4 A paediatric 'early warning system' can be considered any patient safety initiative or programme which aims to monitor, detect and respond to signs of deterioration in hospitalised children in order to avert adverse outcomes and premature death. Such systems are often multifaceted and may include the use of rapid response teams (RRT) or medical emergency teams (MET), education or training to improve clinical staff's ability to identify deterioration or strategies aimed at improving staff communication and situational awareness.
An increasingly commonplace paediatric 'early warning system' initiative is the use of a 'track and trigger tool': these tools, also commonly used in adult care, provide a formal framework for evaluating routine physiological, clinical and observational data for early indicators of patient deterioration. They are typically integrated into routine observation charts or electronic health records and compare patient observations with predefined 'normal' thresholds. When one or more observation is considered abnormal, staff are directed to various clinical actions, including but not limited to altered frequency of observations, review by senior staff or more appropriate treatment or management. Tools may be paper based or electronic and monitoring may be automated or manually undertaken by staff.
These tools have been referred to in the literature using a number of different terms, including paediatric early warning scores (PEWS), paediatric early warning tools (PEWT), track and trigger tools (TTT) and many others. Here, we refer to the tools themselves using the term 'paediatric track and trigger tools' (PTTT). A variety of PTTT have been developed, typically by teams based in specialist paediatric centres and often used as a means of triggering a dedicated response team. Their advocacy has recently led to widespread uptake across a variety of different paediatric units, including many non-specialist centres where patient populations and resources may differ. In the UK, a recent cross-sectional survey found that 85% of paediatric units were using some form of PTTT, most of which were non-specialist centres without a dedicated response team. 5 Despite their widespread use, recent reviews have questioned the evidence base for their effectiveness in improving patient outcomes. 6 7 The current review aimed to build on this work, assessing in depth the evidence base for both the validity of PTTT for predicting inpatient deterioration and the effectiveness of broader 'early warning systems' at reducing instances of mortality and morbidity in paediatric settings:
► Question 1: how well validated are existing PTTT and their component parts for predicting inpatient deterioration?
► Question 2: how effective are paediatric early warning systems (with or without a PTTT) at reducing mortality and critical events?

METHODS
This systematic review is reported in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines 8 (see online supplementary table 2). Papers reporting development or validation of a PTTT were included for question 1, whereas papers reporting the implementation of any broader 'paediatric early warning system' (with or without a PTTT) were eligible for question 2. Both research questions were limited to studies that involved inpatients aged 0-18 years. Outcome measures considered were mortality and critical events, including: unplanned admission to a higher level of care, cardiac arrest, respiratory arrest, medical emergencies requiring immediate assistance, children reviewed by paediatric intensive care unit (PICU) staff on the ward (in specialist centres) or reviewed by external PICU staff (for non-specialist centres), acuity at PICU admission and PICU outcomes. A range of study designs were considered for both questions.
Two of the review authors independently screened the titles and abstracts yielded in the search. Full texts were reviewed independently by six reviewers against the above eligibility criteria and were assigned to the relevant review question if included. Reasons for exclusion were recorded.
Separate data extraction forms were developed for validation and effectiveness studies. The forms had common elements (study design, country, setting, study population, description of the PTTT or early warning system, statistical techniques used, outcomes assessed). Additional data items for validation studies included the items in the PTTT, modifications to the PTTT from previous versions, predictive ability of individual items and the overall tool, sensitivity and specificity and inter-rater and intra-rater reliability. Effectiveness studies included an assessment of outcomes in terms of mortality and various morbidity variables.
Data extraction was carried out by two reviewers and discrepancies were resolved by discussion. For effectiveness studies, effect sizes and 95% CIs were calculated or reported as risk ratios (RR) or ORs as appropriate, with p values reported to assess statistical significance. Data analysis was conducted using an online medical statistics tool.
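As an illustration of the effect size calculation described above, the following is a minimal sketch of a risk ratio with a 95% CI computed on the log scale (a standard Wald-type interval). The review does not name the online tool or the exact method it used, so the function name `risk_ratio_ci` and the event counts are hypothetical and serve only to show the arithmetic.

```python
import math


def risk_ratio_ci(events_post, n_post, events_pre, n_pre, z=1.96):
    """Risk ratio (post vs pre intervention) with a Wald CI on the log scale.

    Assumption: this is a generic textbook method, not necessarily the one
    used by the review's online statistics tool.
    """
    risk_post = events_post / n_post
    risk_pre = events_pre / n_pre
    rr = risk_post / risk_pre
    # Standard error of log(RR) for two independent proportions.
    se_log_rr = math.sqrt(
        1 / events_post - 1 / n_post + 1 / events_pre - 1 / n_pre
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 50 events in 10 000 admissions before the
# intervention, 30 events in 10 000 admissions after it.
rr, ci_lower, ci_upper = risk_ratio_ci(30, 10_000, 50, 10_000)
print(f"RR={rr:.2f}, 95% CI {ci_lower:.2f} to {ci_upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant change at the conventional 5% level, which is how the RR values later in the results can be read.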

Quality appraisal
Methodological quality and risk of bias were assessed for each included study using a modified version of the Downs and Black rating scale 9 (templates shown in online supplementary table 3).

Patient and public involvement
This review was conducted as part of a larger mixed-methods study (ISRCTN94228292), which used a formal, facilitated parental advisory group. The group comprised parents of children who had experienced an unexpected adverse event in a paediatric unit and provided input which helped to shape the broader research questions and outcome measures. The results of the review will be disseminated to parents through this group.

RESULTS
Figure 1 shows the PRISMA flow diagram for both research questions. Table 1 summarises the study characteristics of validation and effectiveness papers in the review.

Types of PTTT and components
Across 66 studies, we identified 27 unique PTTT (table 2). Twenty PTTT were based on one of four different tools: Monaghan's Brighton PEWS, 10 the Bedside PEWS, 11 the Bristol PEWT 12 and the Melbourne Activation Criteria (MAC). 13 Other PTTT described in the literature included the National Health Service Institute for Innovation and Improvement (NHS III) PEWS 14 (the second most commonly used PTTT in UK paediatric settings 5 ), RRT and MET activation criteria 15-18 and one prediction algorithm developed from a large dataset of electronic health data. 19 Table 2 illustrates the range of physiological and behavioural parameters underpinning PTTT. Common parameters included heart rate (present in 26 out of 27 PTTT), respiratory rate (24), respiratory effort (24) and level of consciousness or behavioural state (24). All PTTT required at least six different parameters to be collected.

Question 1: how well validated are PTTT and component parts for predicting inpatient deterioration?
Nine validation papers meeting inclusion criteria were excluded from analysis: eight did not report any performance characteristics of the PTTT for predicting deterioration 20-27 and one study calculated incorrect sensitivity/specificity outcomes 12 (see online supplementary table 4). The remaining 27 validation studies, evaluating the performance of 18 unique PTTT, are described in table 3. Four studies evaluated multiple PTTT 3 19 28 29 and one paper described three separate studies of the same PTTT. 30 Five cohort studies were included, 14 31-34 three based on the same dataset. All other studies were either case-control studies or chart reviews. Thirteen papers implemented the PTTT in practice, 23 30 31 34-43 while the remaining studies 'bench tested' the PTTT: researchers retrospectively calculated the score based on data abstracted from medical charts and records. All studies were conducted in specialist centres with only one multicentre study reported. 44

Outcome measures
PTTT were evaluated for their ability to predict a wide range of clinical outcomes. Composite measures were used in 8 studies, 14

Predictive ability of individual PTTT components
Three validation papers reported on the performance characteristics of individual components of the tool for predicting adverse outcomes. 11 33 42 Parshuram et al, for instance, reported area under the receiver operating characteristic curve (AUROC) values for individual PTTT items of a pilot version of the Bedside PEWS, ranging from 0.54 (bolus fluid) to 0.81 (heart rate), compared with 0.91 for the overall PTTT. 11 All other studies reported outcomes for the PTTT as a whole.

PEWS score
The predictive ability of the 16-item PEWS score was assessed by one internal 47 (AUROC=0.90) and two external case-control studies 28 29 (AUROC range=0.82-0.88) with a range of outcome measures and scoring thresholds. One case-control study used an observed prevalence rate to calculate a positive predictive value (PPV) of 4.2% for the tool in predicting code calls 47 (for every 1000 patients triggering the PTTT, 42 would be expected to deteriorate).
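The AUROC values quoted throughout this section can be read as the probability that a randomly chosen case patient has a higher PTTT score than a randomly chosen control patient. The sketch below computes that quantity directly by pairwise comparison, counting ties as half. The function name `auroc` and the scores are hypothetical; they are not data from any included study.

```python
def auroc(case_scores, control_scores):
    """AUROC as the share of case/control pairs in which the case scores higher.

    Ties contribute 0.5, which matches the Mann-Whitney formulation.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical maximum PTTT scores for five cases and six controls.
case_scores = [7, 5, 8, 4, 6]
control_scores = [2, 3, 5, 1, 4, 2]
example_auroc = auroc(case_scores, control_scores)
print(f"AUROC = {example_auroc:.2f}")
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination. The measure says nothing about how often a trigger is a false alarm, which depends on outcome prevalence.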
Bedside PEWS and derivatives
The Bedside PEWS was evaluated in one internal 11 (AUROC=0.91) and five external case-control studies 19 45 48 50

The Modified Brighton PEWS (a) was evaluated for its ability to predict PICU transfers in one large prospective cohort study (AUROC=0.92, PPV=5.8%), 31 and an external case-control study tested the same score for predicting urgent PICU transfers within 24 hours of admission (AUROC=0.74, PPV=2.1%). 19 An external case-control study used a composite measure of death, arrest or PICU transfer to evaluate the Modified Brighton PEWS (b) (AUROC=0.79) and the Modified Brighton PEWS (d) (AUROC=0.74). 29 The latter tool was evaluated in a further internal case-control study for predicting PICU transfer (AUROC=0.82). 48

The Children's Hospital Early Warning Score (CHEWS) had a reported AUROC of 0.90 for predicting PICU transfers or arrests in a large internal case-control study. 50 A modification for cardiac patients, the Cardiac CHEWS (C-CHEWS), was evaluated by one internal study on a cardiac unit 37 (AUROC=0.90) looking at arrests or unplanned PICU transfers, and two external studies of oncology/haematology units 41 42 for the same outcome (AUROC=0.95). Finally, the Children's Hospital Los Angeles PEWS was evaluated in a small internal case-control study for prediction of re-admission to PICU after initial PICU discharge 40 (AUROC=0.71).

MAC and derivatives
The MAC was assessed by one external case-control study with an outcome of death, arrest or unplanned PICU transfer 29 (AUROC=0.71) and a large external cohort study with an outcome of death or unplanned

Table 2 (fragment; the full table was not recovered). Recoverable column headings: skin colour, airway problems, temperature, pulses, family concern, other items. Recoverable row group: Paediatric early warning system (PEWS) score and derivatives, beginning with the PEWS score. 28 47 Recoverable modification notes: added family and staff concern; added age-related thresholds; removed nebulisers and vomiting items; modified escalation algorithm.

Across PTTT, studies reporting performance characteristics of a tool at a range of different scoring thresholds demonstrate the expected interaction and trade-off between sensitivity and specificity: at lower triggering thresholds, sensitivity is high but specificity is low; at higher thresholds, the opposite is true.
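The threshold trade-off can be made concrete with a small worked example. The sketch below assumes a tool that triggers when the score is at or above a threshold, and tabulates sensitivity and specificity for each threshold. The function name `sens_spec` and all scores are hypothetical and chosen only to show the pattern.

```python
def sens_spec(case_scores, control_scores, threshold):
    """Sensitivity and specificity when a score >= threshold counts as a trigger."""
    true_positives = sum(score >= threshold for score in case_scores)
    true_negatives = sum(score < threshold for score in control_scores)
    return (
        true_positives / len(case_scores),
        true_negatives / len(control_scores),
    )


# Hypothetical maximum PTTT scores for five cases and six controls.
case_scores = [7, 5, 8, 4, 6]
control_scores = [2, 3, 5, 1, 4, 2]

# One (sensitivity, specificity) pair per candidate triggering threshold.
by_threshold = {
    t: sens_spec(case_scores, control_scores, t) for t in range(1, 9)
}
for threshold, (sensitivity, specificity) in by_threshold.items():
    print(f"threshold {threshold}: sens={sensitivity:.2f}, spec={specificity:.2f}")
```

In this toy data, raising the threshold never increases sensitivity and never decreases specificity, which is the pattern the validation studies report.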

Inter-rater reliability and completeness of data
Accurate assessment of the ability of a PTTT to predict clinical deterioration is contingent on accuracy and reliability of tool scoring (whether by bedside nurses in practice or by researchers abstracting data) and the availability of underpinning observations. Only five papers made reference to accuracy or reliability of scoring, 28 31 37 42 45 with mixed results: for example, two nurses separately scoring a subset of patients on the Modified Brighton PEWS (a) achieved an intra-class coefficient of 0.92, 31 but a study nurse and bedside nurse achieved only 67% agreement in scoring the C-CHEWS tool. 37 Completeness of data was reported in 11 studies. 11

(Table fragment) No details on extent of missing data but authors report that 'missing data were a major cause of incorrect prediction'.

Question 2: how effective are early warning systems at reducing mortality and critical events in hospitalised children?
Eleven papers meeting inclusion criteria were excluded from analysis for providing insufficient statistical information (eg, denominator data, absolute numbers of events) to calculate effect sizes. 39

Type of early warning system interventions
Seventeen interventions involved the introduction of a new PTTT, 13 15-18 60-72 one intervention introduced a mandatory triggering element to an existing PTTT 71 and one study reported a large, multicentre analysis of MET introduction with no details on PTTT use. 73 Twelve interventions included the introduction of a new MET or RRT, 13 15-18 60-65 69 while four further interventions introduced a new PTTT in a hospital with an existing MET or RRT. Only three studies therefore evaluated a PTTT in the absence of a dedicated response team. 67 68 70 A staff education programme was explicitly described in 10 interventions. 13 15 17 61 62 64 67 68 70 72 Of the 18 studies that used a PTTT, only 7 used a tool that had been formally evaluated for validity: 3 used the Bedside PEWS, 64 65 70 2 used the MAC, 13 62 1 used the Modified Brighton PEWS (b) 72 and 1 used the C-CHEWS. 67 One study did not report the PTTT used, 61 while 10 studies used a variety of calling criteria and local modifications to validated tools that had not been evaluated for validity. 15-18 60 63 66 68 69 71

Mortality (ward or hospital wide)
Two uncontrolled before-after studies (both with MET/RRT) reported significant mortality rate reductions postintervention: one in hospital wide deaths per 100 discharges 17 (RR=0.82, 95% CI 0.70 to 0.95) and one in total hospital deaths per 1000 admissions (RR=0.65, 95% CI 0.57 to 0.75) and deaths on the ward ('unexpected deaths') per 1000 admissions 62 (RR=0.35, 95% CI 0.13 to 0.92). Seven studies found no reductions in mortality, including two high-quality multicentre studies. 13 15 60 63-65 73 Parshuram et al conducted a cluster randomised trial and found no difference in all-cause hospital mortality rates between 10 hospitals randomly selected to receive an intervention centred around use of the Bedside PEWS and 11 usual care hospitals, 1-year postintervention (OR=1.01, 95% CI 0.61 to 1.69). 64 Kutty et al 73

Table 3 notes: All studies conducted in a specialist/tertiary centre. PPV and NPV values in italics represent results from case-control studies. These values are misleading in isolation because they assume that the wider prevalence rate of the adverse event is equal to the case to control ratio used in the research study (eg, if the researchers studied 300 cases and 300 controls, the prevalence rate of adverse events for the calculation of PPV is 50%). As per the cohort studies, prevalence rates of critical events are typically far lower among hospitalised paediatric populations than the case-control ratios used in studies, and so PPV values would be considerably lower in clinical practice.
Studies classified as internal validation if the setting for the study was the same hospital and same research team as those who developed the score. Studies classified as external validation if the score was tested in a different centre and by a different research team to those who developed it.
*Typically, study researchers collected or abstracted multiple PTTT scores for each patient at different time points, but can only use one score per patient for the analysis of the tool's predictive ability. This column specifies which score the researchers used. In most cases, the study team used the maximum PTTT score recorded for each patient in a given study window, eg, 24 hours prior to a critical event for case patients. The text in parentheses describes the frequency with which scores were assessed or abstracted for each patient, if this information was described in the paper.
†Case-control study, but PPV value calculated based on clinical prevalence of event as measured at local centre during the study.
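
The table note on PPV in case-control studies can be illustrated with Bayes' rule. The sketch below recomputes PPV for the same sensitivity and specificity at two prevalence rates: the 50% implied by a 300 case to 300 control design, and a much lower clinical prevalence. The sensitivity, specificity and 1% prevalence are hypothetical values chosen for illustration, and the function name `ppv` is an assumption, not something taken from the included studies.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical tool performance at a single triggering threshold.
sensitivity, specificity = 0.80, 0.90

# Prevalence implied by a 300 case / 300 control study design.
ppv_case_control = ppv(sensitivity, specificity, prevalence=0.50)

# Hypothetical clinical prevalence of 1% among hospitalised children.
ppv_clinical = ppv(sensitivity, specificity, prevalence=0.01)

print(f"PPV at 50% prevalence: {ppv_case_control:.1%}")
print(f"PPV at 1% prevalence:  {ppv_clinical:.1%}")
```

With identical sensitivity and specificity, the PPV falls from roughly 89% to roughly 7% once the rarer clinical prevalence is used, which is why unadjusted case-control PPV values should not be read as estimates of performance in practice.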

63 Six studies (including a high-quality cluster randomised trial and interrupted time series study) reported no postintervention change in PICU mortality using a variety of metrics. 64-69

Cardiac and respiratory arrests
Two uncontrolled before-after studies (both with RRT/MET) reported significant postintervention rate reductions in subcategories of cardiac arrests: one in 'near cardiopulmonary arrests' 63

PICU transfers
One uncontrolled before-after study (without RRT/MET) found a significant postintervention decrease in the rate of unplanned PICU transfers per 1000 patient-days 67 (RR=0.70, 95% CI 0.56 to 0.88). Four studies (including one high-quality cluster randomised trial and one high-quality interrupted time series study) found no change in rates of PICU admissions postintervention. 64-66 70

PICU outcomes
Two studies, one interrupted time series and one multicentre cluster randomised trial (both with RRT/MET), found significant reductions in rates of 'critical deterioration events' (life-sustaining interventions administered within 12 hours of PICU admission) relative to preimplementation trends and relative to control hospitals, respectively (IRR=0.38, 95% CI 0.20 to 0.75; OR=0.77, 95% CI 0.61 to 0.97). 64 65 One controlled before-after study (without RRT/MET) reported a significant reduction in rates of invasive ventilation given to emergency PICU admissions postintervention (RR=0.83, 95% CI 0.72 to 0.97), with no significant change observed in a control group of patients admitted to PICU from outside of the hospital. 68 One uncontrolled before-after study reported a significant postintervention decrease in rates of PICU admissions receiving mechanical ventilation (RR=0.85, 95% CI 0.73 to 0.99), but an increase in rates of early intubation (RR=1.87, 95% CI 1.33 to 2.62). 69

Implementation outcomes
Only three studies reported outcomes relating to the quality of implementation of the intervention. One study reported 99% of audited observation sets of the Bedside PEWS had at least five vital signs present postintervention, up from 76% preintervention (no change in control hospitals). 64 A previous study of the same PTTT reported 3% of audited cases had used the incorrect age chart but reported an intraclass coefficient of 0.90 for agreement between scores assigned by bedside nurses in practice and scores retrospectively assigned by research nurses. 70 Finally, error rates in C-CHEWS scoring were reported to have reduced from an initial 47% to below 10% by the end of the study. 67

DISCUSSION
This paper reviewed the published PTTT and early warning system literature in order to assess the validity of PTTT for predicting inpatient deterioration (question 1) and the effectiveness of early warning system interventions (with or without PTTT) for reducing mortality and morbidity outcomes in hospitalised children (question 2). We believe that the consideration of broader 'early warning systems' differentiates this paper from previous reviews, as does the inclusion of two recently published high-quality effectiveness studies. 64

However, methodological issues common to the validation studies mean that such results need to be interpreted with a degree of caution. First, each of the studies was conducted in a clinical setting where paediatric inpatients are subject to various forms of routine clinical intervention throughout their admission. There are numerous statistical modelling techniques which can account for co-occurrence of clinical interventions and the longitudinal nature of the predictors, 74 75 but none of these were used in the validation studies and so estimates of predictive ability are likely to be distorted. Indeed, the majority of outcomes used in the validation studies are clinical interventions themselves (eg, PICU transfer). Second, while it is understandable that a majority of studies 'bench-tested' the PTTT rather than implementing it in practice before evaluation, the process of abstracting PTTT scores retrospectively from patient charts and medical records introduces a number of sources of potential bias or inaccuracy. For instance, several studies reported either high levels of missing data (ie, some of the observations required to populate the PTTT score being evaluated were not routinely collected or recorded and so were scored as 'normal') 11 19 32 44 45 or difficulty in abstracting certain descriptive or subjective PTTT components. 19 28 41 49 Assuming that missing values are normal and excluding some PTTT items from analysis are both likely to result in underscoring of the PTTT and to skew the results. Finally, studies which evaluated a PTTT that had been implemented in practice are at risk of overestimating the ability of PTTT to predict proxy outcomes such as PICU transfer, inasmuch as high PTTT scores or triggers automatically direct staff towards escalation of care, or clinical actions which make escalation of care more likely.
The findings reported in several PTTT studies point towards two potential challenges for some centres in implementing and sustaining a PTTT in clinical practice. First, as noted above, a number of studies that retrospectively 'bench-tested' a PTTT reported that the observations required to score the tool were not always routinely collected or recorded in their centre. It may be that the introduction of a PTTT into practice would help create a framework to ensure that core vital signs and observations were collected more routinely (as demonstrated by Parshuram et al 64 ), but this would have resource implications that could be a barrier for some centres. Such considerations are important, as evidence from the adult literature points to the potential for tools to inadvertently mask deterioration when core observations are missing. 76 Second, PPV values reported in cohort studies, and case-control studies that adjusted for outcome prevalence, were uniformly low (between 2.3% and 5.9%). 14 19 31-33 47 They show that even PTTT with good predictive performance are likely to generate a large number of 'false alarms' because adverse outcomes are so rare. For some centres, these issues may be mitigated to some extent by dedicated response teams or other available resources, but other hospitals may not be able to sustain the increased workload of responding to PTTT triggers.
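To give a sense of the workload implied by the reported PPV range, the sketch below converts a PPV into the number of triggers a team would respond to per true deterioration event, and the number of false alarms per 1000 triggers. It uses only the two PPV bounds quoted above (2.3% and 5.9%); the function names are hypothetical.

```python
def triggers_per_true_event(ppv):
    """Average number of triggers responded to for each true deterioration."""
    return 1 / ppv


def false_alarms_per_1000_triggers(ppv):
    """Triggers out of 1000 that are not followed by the adverse outcome."""
    return round(1000 * (1 - ppv))


# Lower and upper bounds of the PPV range reported in the review.
for ppv_value in (0.023, 0.059):
    print(
        f"PPV {ppv_value:.1%}: "
        f"{triggers_per_true_event(ppv_value):.0f} triggers per true event, "
        f"{false_alarms_per_1000_triggers(ppv_value)} false alarms per 1000 triggers"
    )
```

At the lower bound this is roughly 43 triggers for every true event, and at the upper bound roughly 17, which indicates the scale of response workload a centre without a dedicated team would need to absorb.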

How effective are early warning systems for reducing mortality and morbidity?
We found limited evidence for early warning system interventions reducing mortality or arrest rates in hospitalised children. While some effectiveness papers did report significant reductions in rates of mortality (on the ward or in PICU) or cardiac arrests after implementation of different early warning system interventions, 16-18 62 63 they were all uncontrolled before-after studies, which have inherent limitations in terms of establishing causality. They do not preclude the possibility that outcome rates would have improved over time regardless of the intervention 77 or that changes were caused by other factors, and their inclusion is accordingly discouraged by some Cochrane review groups. 78 Three high-quality multicentre studies (two interrupted time series studies and a recent cluster randomised trial) found no changes in rates or trends of mortality or arrests postintervention. 64 65 73 There was also limited evidence for early warning systems reducing PICU transfers or calls for urgent review. Again, a small number of uncontrolled before-after studies reported significant reductions postintervention, 15 17 63 but several other studies reported significant increases in transfers or calls for review 66 72 or no postintervention changes. We did find moderate evidence across four studies (including a controlled before-after study, a multicentre interrupted time series study and a multicentre cluster randomised trial) for early warning system interventions reducing rates of early critical interventions in children transferred to PICU. 64 65 68 69 Such results are promising, but corresponding reductions in hospital or PICU mortality rates have not yet been reported.
Implementing complex interventions in a healthcare setting is challenging, and evidence from the adult literature points to challenges and barriers to successfully implementing TTT in practice. 79-81 However, given that so few effectiveness studies reported on implementation outcomes, it is difficult to know whether negative findings reflect poor effectiveness or poor implementation of early warning systems. Again, effectiveness studies were predominantly carried out in specialist centres and, in all but three cases, 67 68 70 involved the use of a dedicated response team, which greatly limits the generalisability of findings outside of these contexts.

Limitations of the review
There are several limitations of the current review. First, despite purposely widening the scope of the effectiveness review question to include paediatric 'early warning systems' with or without a PTTT, we identified very few studies that did not employ a PTTT as part of the intervention. In part, this likely reflects the fact that PTTT have become almost synonymous with early warning systems, but it is also possible that our search strategy may have missed some broader early warning system initiatives that were not explicitly labelled as such. Second, our inclusion criteria for study selection were deliberately broad and so resulted in our including several validation and effectiveness studies that were subsequently excluded from analysis due to insufficient statistical detail or methodological issues. Third, the scope of the current review was limited to consideration of quantitative validation and effectiveness studies. We are mindful of research suggesting that implementing PTTT in practice may confer secondary benefits including, but not limited to, improvements in communication, teamwork and empowerment of junior staff to call for assistance. 82-84 Finally, we opted not to conduct a meta-analysis of effectiveness findings due to the heterogeneity of outcome metrics, interventions and study designs, populations and settings. Given the large sample sizes required to detect changes in rare adverse events, we believe further work is needed to harmonise outcome measures used to evaluate early warning system interventions internationally, in order to facilitate pooling of findings across studies.

CONCLUSION
The PTTT literature is currently characterised by an 'absence of evidence' rather than an 'evidence of absence'. PTTT seem like a logical tool for helping staff detect and respond to deteriorating patients, but the existing evidence base is too limited to form clear judgements of their utility. We would argue that there has been too much confidence placed in the statistical findings of validation studies of PTTT, given methodological limitations in the study designs. There is evidence of consistently high false-alarm rates, and bench-testing studies point to many PTTT parameters not being reliably recorded in practice; as such, there is reason for caution in considering the viability of PTTT for all hospitals. Almost all of the early warning systems and PTTT reported in the literature have been developed and evaluated in specialist centres, typically in units with access to dedicated response teams, yet PTTT appear to be commonly adopted by non-specialist units with little modification. There is currently limited evidence that 'early warning systems' incorporating a PTTT reduce deterioration or death in practice. As such, we would urge caution among policymakers in calling for their use to become mandatory across all hospitals. We acknowledge the potential for PTTT to confer a range of secondary benefits in areas such as communication, teamwork and empowerment of junior staff. More work is required to understand the wider impact of PTTT implementation in different clinical settings before it is possible to evaluate their overall contribution to the wider safety mechanisms and systems aimed at identifying and responding to deterioration in paediatric patients.