Performance of externally validated enhanced computer-aided versions of the National Early Warning Score in predicting mortality following an emergency admission to hospital in England: a cross-sectional study
  1. Muhammad Faisal1,2,
  2. Donald Richardson3,
  3. Andy Scally4,
  4. Robin Howes5,
  5. Kevin Beatson6,
  6. Mohammed Mohammed1,7
  1. 1 Faculty of Health Studies, University of Bradford, Bradford, UK
  2. 2 Bradford Institute for Health Research, Bradford Royal Infirmary, Bradford, UK
  3. 3 Renal Unit, York Teaching Hospital NHS Foundation Trust, York, UK
  4. 4 School of Clinical Therapies, University College Cork National University of Ireland, Cork, Ireland
  5. 5 Department of Strategy & Planning, Northern Lincolnshire and Goole Hospitals NHS Foundation Trust, Grimsby, UK
  6. 6 IT Department, York Teaching Hospital NHS Foundation Trust, York, UK
  7. 7 NHS Midlands and Lancashire Commissioning Support Unit, West Bromwich, UK
  1. Correspondence to Dr Mohammed Mohammed; m.a.mohammed5{at}bradford.ac.uk

Abstract

Objectives In the English National Health Service, the patient’s vital signs are monitored and summarised into a National Early Warning Score (NEWS) to support clinical decision making, but it does not provide an estimate of the patient’s risk of death. We examine the extent to which the accuracy of NEWS for predicting mortality could be improved by enhanced computer versions of NEWS (cNEWS).

Design Logistic regression model development and external validation study.

Setting Two acute hospitals (YH—York Hospital for model development; NH—Northern Lincolnshire and Goole Hospital for external model validation).

Participants Adult (≥16 years) medical admissions discharged over a 24-month period with electronic NEWS (eNEWS) recorded on admission are used to predict mortality at four time points (in-hospital, 24 hours, 48 hours and 72 hours) using the first electronically recorded NEWS (model M0) versus a cNEWS model which included age+sex (model M1) +subcomponents of NEWS (including diastolic blood pressure) (model M2).

Results The risk of dying in-hospital following emergency medical admission was 5.8% (YH: 2080/35 807) and 5.4% (NH: 1900/35 161). The c-statistics for model M2 in YH for predicting mortality (in-hospital=0.82, 24 hours=0.91, 48 hours=0.88 and 72 hours=0.88) were higher than those for model M0 (in-hospital=0.74, 24 hours=0.89, 48 hours=0.86 and 72 hours=0.85), with a higher positive predictive value (PPV) for in-hospital mortality (M2 19.3% and M0 16.6%). Similar findings were seen in NH. Model M2 performed better than M0 in almost all major disease subgroups.

Conclusions An externally validated enhanced computer-aided NEWS model (cNEWS) incrementally improves on the performance of a NEWS only model. Since cNEWS places no additional data collection burden on clinicians and is readily automated, it may now be carefully introduced and evaluated to determine if it can improve care in hospitals that have eNEWS systems.

  • decision making
  • health informatics
  • health & safety
  • statistics & research methods

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • Under current guidelines, National Early Warning Score (NEWS) is not used to provide an explicit estimate of the patient’s risk of death.

  • This study develops computer-aided NEWS (cNEWS) models for predicting the risk of mortality in emergency medical admissions only, which limits the extension of our findings to non-medical and elective admissions.

  • cNEWS models are externally validated, place no additional data collection burden on clinicians and are readily automated.

  • We used the index NEWS data in our cNEWS models, which reflect the ‘on-admission’ risk of mortality of the patient.

  • NEWS is repeatedly updated for each patient according to local hospital protocols, and the extent to which changes in NEWS over time reflect changes in mortality risk that need to be incorporated in our cNEWS models needs further study.

Introduction

Early warning scores (EWS) are widely used in hospitals worldwide. The key rationale is that EWS facilitate earlier detection of deteriorating patients, a major cause of poor care, and so may enhance the quality, safety and outcomes of care.1

In National Health Service (NHS) hospitals in England, the patient’s vital signs are monitored and summarised into a National Early Warning Score (NEWS). NEWS, which was designed for paper-based implementation, is derived from scoring seven physiological variables or vital signs—respiration rate, oxygen saturations, any supplemental oxygen, temperature, systolic blood pressure, heart rate and level of consciousness (Alert, Voice, Pain, Unresponsive)—which are routinely collected by nursing staff as an integral part of the process of care. The vital signs are measured by clinical staff and manually entered into the electronic healthcare record. The NEWS is calculated automatically by the computer system. NEWS was found to outperform 33 other EWS in predicting patients at risk of cardiac arrest, unanticipated intensive care unit admission or death within 24 hours.2

NEWS was launched for the NHS in 2012 and by 2017, almost 8 out of 10 NHS Trusts report using NEWS or adaptations of NEWS.3 Moreover, there is widespread interest in NEWS from across the world, including Europe, India, the USA (and the US Navy).4 Although NEWS was designed for paper-based implementation, several studies have shown that paper-based (N)EWS are unreliable and that electronically collected (N)EWS5 are highly reliable and accurate6–8—about two-thirds of NHS Trusts use some form of electronic NEWS (eNEWS).3 Furthermore, under current guidelines, NEWS is not used to provide an explicit estimate of the patient’s risk of death.

Given the increased role of informatics in healthcare, we postulated that the performance of NEWS could be improved by developing enhanced computer-aided versions of NEWS (cNEWS) that incorporate the patient’s age and sex along with the individual subcomponents of NEWS. We examine the performance of cNEWS models based on the first recorded eNEWS to predict the risk of death in hospital (within 24 hours, 48 hours, 72 hours and in-hospital) following emergency medical admission, in general and for specific diseases, in two different hospitals.

Methods

Setting and data

Our cohorts of emergency medical admissions are from three acute hospitals which are approximately 100 km apart in the Yorkshire and Humberside region of England: the Diana, Princess of Wales Hospital (n~400 beds) and Scunthorpe General Hospital (n~400 beds), both managed by the Northern Lincolnshire and Goole (NLAG) NHS Foundation Trust, and York Hospital (YH) (n~700 beds), managed by the York Teaching Hospital NHS Foundation Trust. For the purposes of this study, the two acute hospitals in NLAG are combined into a single data set and collectively referred to as NLAG Hospitals (NH). So, in essence, our study is based on data from two hospitals, NH and YH, which have been exclusively using eNEWS scoring since at least 2013 as part of their in-house electronic patient record systems. We selected these hospitals because they had access to electronically recorded vital signs and electronically calculated NEWS, which are collected as part of the patient’s process of care, and were agreeable to the study. Furthermore, studies have shown that electronically collected NEWS5 are highly reliable and accurate when compared with paper-based methods.6–8

We considered all adult (age≥16 years) emergency medical admissions, discharged during a 24-month period (01 January 2014 to 31 December 2015), with eNEWS. For each emergency admission, we obtained a pseudonymised patient identifier, patient’s age (years), gender (male/female), discharge status (alive/dead), admission and discharge date and time, and eNEWS (including its subcomponents: respiratory rate, temperature, systolic pressure, pulse rate, oxygen saturation, oxygen supplementation and alertness). The diastolic blood pressure was recorded at the same time as systolic blood pressure. Historically, diastolic blood pressure has always been a routinely collected physiological variable on vital sign charts and is still collected where electronic observations are in place. NEWS does not include diastolic blood pressure but we incorporate it in our statistical models because this data item is routinely collected. NEWS produces integer values that range from 0 (indicating the lowest severity of illness) to 20 (the maximum NEWS value possible) (see online supplementary file for further details). The index NEWS was defined as the first electronically recorded NEWS within ±24 hours of the admission time. We excluded records where the index eNEWS was not within ±24 hours or was not recorded at all (see online supplementary table S1).
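The index NEWS rule described above can be illustrated with a short sketch. It assumes that "first" means the earliest timestamp inside the ±24-hour window; the function name `index_news` and the `(timestamp, score)` data layout are hypothetical and are not taken from the study's extraction code.

```python
from datetime import datetime, timedelta


def index_news(admission_time, observations, window_hours=24):
    """Return the earliest NEWS recorded within +/- window_hours of admission.

    observations is a list of (timestamp, news_score) pairs (assumed layout).
    Returns None when no score falls inside the window, which corresponds
    to an admission that would be excluded.
    """
    window = timedelta(hours=window_hours)
    eligible = [
        (taken_at, score)
        for taken_at, score in observations
        if abs(taken_at - admission_time) <= window
    ]
    if not eligible:
        return None
    # Earliest eligible observation is treated as the index NEWS.
    return min(eligible, key=lambda pair: pair[0])[1]
```

An admission for which this returns `None` has no usable index NEWS and would fall among the excluded records.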

Disease subgroups

We identified clinical disease subgroups using two approaches—Clinical Classifications Software (CCS) and Charlson Comorbidity Index (CCI). The CCS approach, which has been widely used,9 10 is developed by the Agency for Healthcare Research and Quality11 which collapses over 69 800 ICD-10-CM diagnosis codes (International Classification of Diseases, 10th revision, Clinical Modification) into 283 clinically meaningful categories based on the primary diagnoses. After removing the external cause of injury codes (ie, suicide and self-inflicted injury) (CCS code>259) from our data set, we were left with 60 671 ICD-10-CM codes which fell into 247 disease groups. We subsequently considered only those CCS codes (n=17) which had the highest mortality.

The CCI approach is based on comorbidities which are diseases that coexist with the index disease, which are likely to affect the prognosis of the disease of interest and influence the choice of treatment.12–14 The CCI is a widely used comorbidity index12 15 identifying comorbidities such as diabetes with diabetic complications, congestive heart failure, peripheral vascular disease, chronic pulmonary disease, mild and severe liver disease, hemiplegia, renal disease, leukaemia and AIDS, each of which was weighted according to their potential influence on mortality. The CCI has been adapted and verified as applicable and valid for predicting the outcome and risk of death from many comorbid diseases.16 17 In our data, the CCI produced 17 diagnosis subgroups using the secondary diagnoses ICD-10 codes based on the algorithm developed by Quan et al 18 and implemented in Stata by Stagg.19 We excluded the AIDS category because it had fewer than 10 events. So, for each hospital, we have 16 CCI disease groups. We reported the statistical differences in characteristics of our two hospitals using independent two-sample t-tests (for continuous data) and χ2 tests of proportions (for categorical data).

Statistical modelling

We began with exploratory analyses including scatter plots and box plots that showed the relationship between covariates and risk of in-hospital death in our hospitals. We used the qladder function (Stata 14 20), which displays the quantiles of a transformed variable against the quantiles of a normal distribution for each transformation on the ladder of powers (in Stata, from cubic through square, identity, square root and log to the reciprocal powers), for each continuous covariate and chose the following transformations: loge(respiratory rate), loge(pulse rate), loge(systolic blood pressure) and loge(diastolic blood pressure).

We developed three logistic regression models for the risk of in-hospital mortality: Model M0 uses the index NEWS alone and reflects current practice guidelines4; model M1 extends M0 with age and sex as additional covariates; and model M2 extends M1 with all the components of NEWS and diastolic blood pressure as additional covariates with transformation as specified above. We called them cNEWS models. We used likelihood ratio (LR) tests to determine the extent to which progressing from models M0 to M2 improved the goodness of fit.
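The nesting of the three models can be written out as follows. This is a sketch of the general form only: the coefficient symbols and the vector notation for the NEWS subcomponents are assumptions made for illustration, and the exact coding of each subcomponent (for example, how alertness and oxygen supplementation enter the model) is not stated in the text.

```latex
\begin{aligned}
\text{M0:}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\mathrm{NEWS} \\
\text{M1:}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\mathrm{NEWS}
                  + \beta_2\,\mathrm{age} + \beta_3\,\mathrm{sex} \\
\text{M2:}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\mathrm{NEWS}
                  + \beta_2\,\mathrm{age} + \beta_3\,\mathrm{sex}
                  + \boldsymbol{\gamma}^{\top}\mathbf{z}
\end{aligned}
```

Here $p$ is the probability of in-hospital death, $\operatorname{logit}(p)=\log_e\{p/(1-p)\}$, and $\mathbf{z}$ collects the NEWS subcomponents plus diastolic blood pressure, with respiratory rate, pulse rate, systolic and diastolic blood pressure entering on the $\log_e$ scale. Because each model is nested in the next, LR tests can compare M0 with M1 and M1 with M2.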

All models were developed to predict the risk of in-hospital mortality following emergency medical admission using data from YH and were externally validated using data from another hospital (NH) and performance assessed via model discrimination and calibration characteristics.21 We determined the sensitivity, specificity, positive predictive values and negative predictive values for these models at the NEWS≥5 threshold (equivalent to mortality risks of 0.08 in YH and 0.09 in NH) for in-hospital mortality using the ROCR package.22 The cut-off of NEWS≥5 is the recommended threshold for escalation of care.23 24 We also report positive and negative likelihood ratios (LRs), which are given by: LR+=sensitivity/(1−specificity) and LR−=(1−sensitivity)/specificity. Unlike positive predictive values, LRs are not dependent on prevalence. LRs equal to 1 are clinically uninformative and LRs furthest away from 1 are clinically more informative.25
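The threshold-based measures named above follow directly from a 2×2 table of predicted versus observed outcomes. The sketch below is illustrative only: the function name `diagnostic_metrics` is hypothetical, and it is a plain Python restatement of the standard definitions, not the ROCR-based analysis used in the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute threshold-based performance measures from a 2x2 table.

    tp: died and flagged (e.g. risk at or above the chosen threshold)
    fp: survived and flagged
    fn: died and not flagged
    tn: survived and not flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratios; note the brackets around (1 - specificity)
        # and (1 - sensitivity). LR+ is undefined when specificity is 1.
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }
```

Scaling the number of survivors up or down changes the PPV and NPV but leaves both LRs unchanged, which is the sense in which LRs do not depend on prevalence.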

Discrimination relates to how well a model can separate (or discriminate between) those who died and those who did not. Calibration measures a model’s ability to generate predictions that are on average close to the average observed outcome. The concordance statistic (c-statistic) is a commonly used measure of discrimination. For a binary outcome (alive/died), the c-statistic is the area under the receiver operating characteristics (ROC) curve. The ROC curve is a plot of the sensitivity (true positive rate) versus 1−specificity (false positive rate) for consecutive predicted risks. A c-statistic of 0.5 is no better than tossing a coin, while a perfect model has a c-statistic of 1. In general, values less than 0.7 are considered to show poor discrimination, values of 0.7–0.8 can be described as reasonable and values above 0.8 suggest good discrimination.26 We report the c-statistic after adjusting for differences in the baseline27 risk of in-hospital mortality as the models were developed using data from YH and externally validated using data from another hospital (NH). The 95% CI for the c-statistic was derived using DeLong’s method as implemented in the pROC library28 in R.29
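For a binary outcome, the c-statistic can also be read as the proportion of (died, survived) patient pairs in which the patient who died was assigned the higher predicted risk, with ties counted as one half. The sketch below computes it that way; the function name `c_statistic` is hypothetical, and this pairwise loop is an illustration of the definition, not the pROC implementation used in the study.

```python
def c_statistic(risks, died):
    """Concordance statistic for a binary outcome.

    risks: predicted risks of death, one per admission
    died:  matching outcomes (truthy = died, falsy = survived)
    """
    risk_died = [r for r, d in zip(risks, died) if d]
    risk_alive = [r for r, d in zip(risks, died) if not d]
    if not risk_died or not risk_alive:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for rd in risk_died:
        for ra in risk_alive:
            if rd > ra:
                concordant += 1.0   # pair ranked correctly
            elif rd == ra:
                concordant += 0.5   # tie counts as half
    return concordant / (len(risk_died) * len(risk_alive))
```

The double loop is quadratic in the number of admissions, so for cohorts of tens of thousands a rank-based calculation would be the practical choice; the result is the same.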

For calibration, we report the calibration slope, which shows the extent to which observed and model-predicted risks are consistent with each other (the ideal calibration slope is 1; >1 indicates underfitting and <1 indicates overfitting).
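One common way to obtain a calibration slope is to regress the observed outcome on the logit of the predicted risk in a new logistic model and read off the coefficient. The sketch below does this with a small Newton-Raphson fit in plain Python. It is an assumption that this matches the estimation used in the study, which does not describe its procedure in detail; the function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def calibration_slope(pred_risks, outcomes, iterations=25):
    """Fit logit(P(death)) = a + b * logit(predicted risk); return (a, b).

    b is the calibration slope (ideal value 1). Predicted risks must lie
    strictly between 0 and 1; outcomes are 1 (died) or 0 (survived).
    """
    x = [logit(p) for p in pred_risks]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # score for intercept
            g1 += (yi - mu) * xi     # score for slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b
```

If predictions are too extreme (for example, risks of 0.1 and 0.9 where the observed proportions are 0.2 and 0.8), the fitted slope falls below 1, which is the overfitting pattern described above.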

We followed the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines for model development and validation.30 All analyses were carried out using R 29 and Stata.20

Patient and public involvement

A workshop with a patient and service user group, linked to the University of Bradford, was held at the start of this project to co-design the agenda for the patient and staff focus groups which were subsequently held at each hospital site. Patients were invited to attend the patient focus group through existing patient and public involvement groups. The criterion used for recruitment to these focus groups was any member of the public who had been a patient or carer in the last 5 years. The patient and public voice continued to be included throughout the project with three patient representatives invited to sit on the project steering group. Participants will be informed of the results of this study through the patient and public involvement leads at each hospital site, and the project team has met with the Bradford patient and service user group to discuss the results.

Data sharing statement

Our data sharing agreement with the two hospitals (YH and NH) does not permit us to share this data with other parties. Nonetheless, if anyone is interested in the data, then they should contact the R&D offices at each hospital in the first instance.

Results

Cohort description

The number (YH: n=36 751; NH: n=37 100) of emergency medical admissions over a 24-month period was similar in our two hospitals. We excluded 2.6% (944/36 751) of admissions in YH and 5.2% (1939/37 100) in NH because the index eNEWS was not recorded within 24 hours of the admission date/time or there was no eNEWS recorded at all (see online supplementary table S1).

The age, sex and NEWS profiles of our cohort of emergency admissions are presented in table 1. The in-hospital mortality was similar in YH (5.8%, 2080/35 807) and NH (5.4%, 1900/35 161). Emergency admissions in YH were slightly older with slightly higher index NEWS. There was a marked difference in the use of oxygen supplementation in the two hospitals (YH: 11.3% vs NH: 19.2%).

Table 1

Characteristics of emergency medical admissions in the YH and NH hospitals

The relationship between the individual vital signs in NEWS and in-hospital mortality is shown in online supplementary figures S1–S4. Online supplementary figure S5 shows the relationship between the index NEWS and in-hospital mortality in each hospital.

Statistical modelling

We developed three cNEWS models (M0, M1 and M2) to predict the risk of in-hospital mortality on data from YH and externally validated the performance of these models using data from NH (see online supplementary table S2). The c-statistic for each model in internal validation increased across the models (M0 0.742, M1 0.808 and M2 0.821). Similar patterns were seen during external validation (M0 0.756, M1 0.815 and M2 0.826). The improvements in c-statistics were mainly attributable to the inclusion of age (see online supplementary table S3). The external validation slope reduced across models from 1.16 (M0) to 1.05 (M2) indicating improved model fits (ideal calibration is 1). The LR test showed statistically significant improvement in model goodness of fits (M0 vs M1: χ2=984.6 (df=2) p<0.001; M1 vs M2: χ2=329.6 (df=10) p<0.001).
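The p-values attached to the LR tests above can be checked from the reported χ2 statistics and degrees of freedom. The sketch below uses the closed-form upper-tail probability of the χ2 distribution that holds when the degrees of freedom are even, which covers both reported tests (df=2 and df=10). The function names are hypothetical, and the log-likelihoods themselves are not reported in the text, so `lr_statistic` is shown only to make the definition explicit.

```python
import math


def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), for even df only.

    Uses the identity
    P = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form shown here needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Reported statistics: M0 vs M1 and M1 vs M2.
p_m0_m1 = chi2_upper_tail_even_df(984.6, 2)
p_m1_m2 = chi2_upper_tail_even_df(329.6, 10)
```

Both tail probabilities are far below 0.001, consistent with the reported p<0.001.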

Figure 1 shows the ROC curves of all three cNEWS models in both hospitals (YH and NH) at four time points (mortality within 24 hours, 48 hours, 72 hours and in-hospital mortality) and online supplementary table S2 presents their accompanying discrimination and calibration slopes. For in-hospital mortality, model M0 had the lowest area under the curve (YH: 0.742 and NH: 0.756), and model M2 had the highest (YH: 0.821 and NH: 0.826). These differences were less marked for mortality in 24, 48 and 72 hours, but in each case, model M2 had the highest area under the curve (see online supplementary table S2).

Figure 1

Receiver operating characteristic curve for three NEWS models in predicting the risk of in-hospital mortality, mortality within 24 hours, 48 hours and 72 hours in the YH and NH hospitals. Black line is for NEWS only model (M0) and red line is for where NEWS adjusted by age and sex (M1). Green line is for where NEWS adjusted by age, sex and NEWS subcomponents (M2). NEWS, National Early Warning Score; NH, Northern Lincolnshire and Goole Hospitals; YH, York Hospital.

Online supplementary table S4 presents various performance characteristics at NEWS≥5 for all three NEWS models (M0, M1 and M2) for predicting the risk of mortality within 24 hours, 48 hours, 72 hours and in-hospital mortality.

Model M0 had the lowest sensitivity 52.5%, specificity 83.7% and positive predictive value 16.6% for in-hospital mortality. Model M2 had the highest sensitivity 60.0%, specificity 84.5% and positive predictive value 19.3% for in-hospital mortality. Similar patterns were seen for mortality within 24, 48 and 72 hours, although the positive predictive values were much lower because of the lower risk of death at these time points (see table 1).
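The dependence of the positive predictive value on the underlying risk of death can be made concrete with Bayes' rule. The sketch below recomputes the PPV from the sensitivity and specificity quoted above, assuming those figures refer to YH and taking the YH in-hospital mortality of 5.8% as the prevalence; the function name `ppv_from_rates` is hypothetical.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# In-hospital mortality, using the figures quoted in the text.
ppv_m0 = ppv_from_rates(0.525, 0.837, 0.058)
ppv_m2 = ppv_from_rates(0.600, 0.845, 0.058)

# Same sensitivity and specificity at a much lower, purely illustrative
# event rate of 1%, to show why short-horizon PPVs are lower.
ppv_m0_rare = ppv_from_rates(0.525, 0.837, 0.01)
```

The recomputed values agree with the reported 16.6% and 19.3% to within rounding of the inputs, and the PPV drops sharply when the same sensitivity and specificity are applied to a rarer outcome.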

The performance of the three cNEWS models in predicting the in-hospital mortality for disease subgroups is shown in figures 2 and 3.

Figure 2

Showing c-statistic (+95% CI) for 17 CCS disease groups (left) and 16 CCI disease groups (right) for M0 (black), M1 (red) and M2 (green) in predicting in-hospital mortality in each hospital (YH and NH). C-statistic, concordance statistic; CCI, Charlson Comorbidity Index; CCS, Clinical Classifications Software; NEWS, National Early Warning Score; NH, Northern Lincolnshire and Goole Hospitals; YH, York Hospital.

Figure 3

Showing positive predictive value (+95% CI) for 17 CCS disease groups (left) and 16 CCI disease groups (right) for M0 (black), M1 (red) and M2 (green) in predicting in-hospital mortality in each hospital (YH and NH). Blank rows on the graph indicate nil admissions. CCI, Charlson Comorbidity Index; CCS, Clinical Classifications Software; NEWS, National Early Warning Score; NH, Northern Lincolnshire and Goole Hospitals; YH, York Hospital.

Figure 2 plots the c-statistics of each model for various disease subgroups and shows that, overall, models M1 and M2 perform better than the M0 (NEWS alone) model.

Figure 3 shows the positive predictive value of each model for various disease categories based on CCI and CCS in both hospitals. Online supplementary tables S5–S7 present the mortality and c-statistic for each CCI and CCS disease group for each hospital.

In particular, the cNEWS models outperformed NEWS alone on both c-statistic and positive predictive value for disease groups such as pneumonia, diabetes, chronic obstructive pulmonary disease (COPD) and cancer. For disease groups such as renal disease, dementia and heart diseases, the c-statistics were higher but there was no improvement in positive predictive value.

Discussion

Although NEWS was launched in 2012 for paper-based implementation, about two-thirds of hospitals in the NHS have implemented NEWS electronically.3 Under current guidelines, NEWS is not used to provide an explicit estimate of the patient’s risk of death. We propose to use cNEWS models to provide this and show that model M2 performed best of the three models. Performance was similar when the models were externally validated in a second hospital. We have not altered the basic NEWS model, which has standardised implementation. However, our cNEWS models now allow clinical staff to augment the NEWS score with an estimate of the risk of death using a validated model. The complexity of the cNEWS model is ‘hidden’ from clinical staff and so no additional data or calculations are required. Nonetheless, the impact of this change in practice needs careful and rigorous evaluation.

Our findings show how enhanced cNEWS, which incorporates age, sex, diastolic blood pressure and the subcomponents of NEWS, can build on the success of NEWS by offering incremental improvements in the accuracy of mortality risk prediction in general and across a wide range of diseases. An attractive design feature of cNEWS is that although it is much more complex than NEWS alone (see online supplementary file), this complexity is confined to the computer and is invisible to clinical staff who can still collect NEWS in the usual way, without incurring any additional data collection tasks. So, cNEWS offers superior risk predictions which can be automatically updated without impeding clinical workflows.

In a previous study, we described how NEWS and blood test results can be combined to develop a risk prediction tool for sepsis31 and in-hospital mortality.32 However, the utility of this combination of clinical data as an initial risk prediction/screening tool is undermined because there is often a delay (of an hour or so) in obtaining the blood test results and up to one-fourth of patients do not have a complete set of blood results. In contrast, in the development of cNEWS, we excluded <5% of emergency admissions because of missing or delayed NEWS, which indicates that it may be possible to develop a blended approach to risk prediction, which commences with cNEWS and accrues additional variables, such as blood test results, as and when they become available.

There are several important limitations in our study which merit further work. (1) We focused on emergency medical admissions in two different hospitals with similar mortality rates, so this limits the extension of our findings to non-medical admissions, elective admissions and possibly to hospitals with different mortality profiles. (2) Reporting more accurate estimates of the patient’s risk of death alone is unlikely to lead to improvements in quality of care and better outcomes unless these estimates lead to better and earlier clinical decision making, raise situational awareness and are followed up by appropriate interventions, above and beyond those already associated with NEWS. Further empirical work is required to assess this. (3) Since cNEWS is based on NEWS and escalation protocols are based on NEWS (eg, thresholds of 5+ and 7+ for higher levels of care; see online supplementary file), further work is required to determine how to successfully blend the risk estimates from cNEWS into existing escalation protocols without unintended adverse consequences. (4) We used the index NEWS data in our cNEWS models, which reflect the ‘on-admission’ risk of mortality of the patient. Nonetheless, NEWS is repeatedly updated for each patient according to local hospital protocols, and the extent to which changes in NEWS over time reflect changes in mortality risk that need to be incorporated in our cNEWS models needs further study. Finally, we have now developed three automated risk equations that use NEWS to predict mortality. Further work is required to engineer these equations into hospital IT systems and to assess how they work together to enhance the quality and safety of care without adverse consequences.

Conclusions

An externally validated complex, computer-aided NEWS model (cNEWS) has a higher positive predictive value than a NEWS only model. Since cNEWS places no additional data collection burden on clinicians and is readily automated, it may now be carefully introduced and evaluated in hospitals that have eNEWS systems.

References

Footnotes

  • Twitter @dzrichar

  • Contributors DR and MM had the original idea for this work. MF undertook the statistical analyses with guidance from AS and MM. RH and KB extracted the necessary data frames. DR gave a clinical perspective. MF and MM wrote the first draft of this paper and all the authors subsequently assisted in redrafting and have approved the final version.

  • Funding This research was supported by the Health Foundation. The Health Foundation is an independent charity working to improve the quality of healthcare in the UK. This research was also supported by the National Institute for Health Research (NIHR) Yorkshire and Humberside Patient Safety Translational Research Centre (YHPSTRC).

  • Disclaimer The views expressed in this article are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval We obtained ethical approval for the main research project of which this is a sub-study from Yorkshire and The Humber—Leeds West Research Ethics Committee (reference number 15/YH/0348).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data may be obtained from a third party and are not publicly available.
