Abstract
Objectives Estimating mortality risk in hospitalised SARS-CoV-2+ patients may help with choosing level of care and discussions with patients. The Coronavirus Clinical Characterisation Consortium Mortality Score (4C Score) is a promising COVID-19 mortality risk model. We examined the association of risk factors with 30-day mortality in hospitalised, full-code SARS-CoV-2+ patients and investigated the discrimination and calibration of the 4C Score. This was a retrospective cohort study of SARS-CoV-2+ hospitalised patients within the RECOVER (REgistry of suspected COVID-19 in EmeRgency care) network.
Setting 99 emergency departments (EDs) across the USA.
Participants Patients ≥18 years old, positive for SARS-CoV-2 in the ED, and hospitalised.
Primary outcome Death within 30 days of the index visit. We performed logistic regression analysis, reported multivariable risk ratios (MVRRs), and calculated the area under the ROC curve (AUROC) and mean prediction error for the original 4C Score and for the score after dropping the C reactive protein (CRP) component.
Results Of 6802 hospitalised patients with COVID-19, 1149 (16.9%) died within 30 days. The 30-day mortality was increased with age 80+ years (MVRR=5.79, 95% CI 4.23 to 7.34); male sex (MVRR=1.17, 95% CI 1.05 to 1.28); and nursing home/assisted living facility residence (MVRR=1.29, 95% CI 1.1 to 1.48). Discrimination of the 4C Score in the RECOVER dataset was comparable to that in the original 4C validation dataset (AUROC: RECOVER 0.786 (95% CI 0.773 to 0.799), 4C validation 0.763 (95% CI 0.757 to 0.769)). Score-specific mortalities in our sample were lower than in the 4C validation sample (mean prediction error 6.0%). Dropping the CRP component from the 4C Score did not substantially affect discrimination, and the 4C risk estimates were then close to observed mortality (mean prediction error 0.7%).
Conclusions We independently validated 4C Score as predicting risk of 30-day mortality in hospitalised SARS-CoV-2+ patients. We recommend dropping the CRP component of the score and using our recalibrated mortality risk estimates.
- COVID-19
- ACCIDENT & EMERGENCY MEDICINE
- Adult intensive & critical care
- EPIDEMIOLOGY
- GENERAL MEDICINE (see Internal Medicine)
Data availability statement
No data are available.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
In this first study using a national US sample of patients who tested positive for SARS-CoV-2 and were hospitalised through emergency departments, our results confirmed the previous findings that older age, comorbidities, body mass index ≥40 kg/m², higher respiratory rate and lower oxygen saturation were associated with 30-day mortality.
We also observed that the arrival to the emergency care setting from a nursing home was associated with increased mortality.
We independently validated 4C Mortality Score as predicting risk of 30-day mortality in hospitalised SARS-CoV-2+ patients.
We recommend dropping the C reactive protein component of the score and using our recalibrated mortality risk estimates when estimating the 30-day mortality in hospitalised patients who test positive for SARS-CoV-2.
Introduction
The COVID-19 pandemic has placed tremendous strain on emergency and critical care resources in hospitals worldwide.1–3 To prepare the healthcare systems for the surges, several studies have developed prediction models to assess mortality risk in patients hospitalised with COVID-19. These studies identified the following risk factors for mortality or critical care admission: age, sex, comorbid conditions, laboratory values and vital signs.4–16
In a systematic review that evaluated many of these risk prediction studies using the prediction model risk of bias assessment tool (PROBAST), Wynants et al concluded that many of the current risk models may be misleading.10 However, the authors’ analysis suggested that one COVID-19 mortality prediction model, the Coronavirus Clinical Characterisation Consortium (4C) Mortality Score, which was built on a large UK dataset, had relatively low risk of bias in most domains by the PROBAST criteria. The 4C Mortality Score includes eight variables: age, sex, respiratory rate, oxygen saturation, number of comorbidities, level of consciousness, blood urea nitrogen and C reactive protein (CRP) (see table 1).17 While there has been continued interest in the development of prediction models for COVID-19, the 4C Mortality Score represented one of the first with a low risk of bias and therefore a good candidate for verification in other populations.
Table 1 Point assignment for 4C Mortality Score
In this study, we investigated the risk of 30-day mortality in hospitalised SARS-CoV-2+ patients within the RECOVER (REgistry of suspected COVID-19 in EmeRgency care) network.18 In a large cohort of SARS-CoV-2+ patients hospitalised from 99 US emergency departments (EDs), we determined the relation of demographic and clinical factors with 30-day mortality and investigated the discrimination and calibration of the 4C Mortality Score with and without the CRP value.
Methods
In this retrospective cohort study, we included patient-level data from the RECOVER Network, a national registry of patients who were tested for SARS-CoV-2 during their ED visit. We restricted the analysis to full code status patients ≥18 years old who tested positive for SARS-CoV-2 and were hospitalised from the ED.18 The study was approved or deemed exempted by the Institutional Review Boards of all participating sites.
Data source
We obtained data from 40 medical centres representing 99 EDs from 27 US states and the District of Columbia. Data were collected using a REDCap data collection form that was distributed to the ED sites during the study period (March 2020–September 2020); our data were downloaded from the registry in December 2020. The REDCap form (online supplemental appendix A) had seven sections and 204 questions, which generated 360 data elements. Variables reflected a combination of routinely collected information (eg, patient demographics, medical history, vital signs and diagnostic test results), patient-reported symptoms and risk exposures, clinical outcomes (eg, admission, therapies, death) and those deemed important by the RECOVER Network steering committee. After creation, but prior to launch, the data form was piloted at 19 sites and refined. For additional section details and the questions, please refer to the data collection form in the online supplemental material. The data were obtained from the electronic healthcare record using a combination of electronic download for routinely collected, coded variables (eg, age, vital signs and laboratory values), supplemented by chart review by research personnel, using methods previously described.18
Patient and public involvement
Patients were not involved in the development of our work, setting the research question or determining the outcome measures. This applies to both the RECOVER network and the work presented here. Given the nature and limitations of emergency care during the COVID-19 pandemic, it was not appropriate or possible to involve patients or the public in the design, or conduct, or reporting or dissemination plans of our work.
Study variables
We analysed patient characteristics such as demographics, vital signs, symptoms, risks for infection, comorbidities and medications. Following the 4C Mortality Score, we categorised the patients into five age groups (18–49, 50–59, 60–69, 70–79, 80+). The US standard ethnicity (Hispanic/Latinx yes/no) and race categories were combined into eight categories (Hispanic/Latinx (H/L), non-H/L white, non-H/L African American/black, non-H/L Asian, non-H/L Native American/Alaskan Native, non-H/L Pacific Islander/Native Hawaiian, non-H/L mixed and Unknown). In the analysis, we combined non-H/L Native American (0.2%), non-H/L Pacific Islander (0.2%), non-H/L mixed, other and unknown into a single group (12.8%).
All included patients had a positive reverse transcriptase PCR (RT-PCR) test for SARS-CoV-2. Almost all the tests were performed during the ED visit, but we also included patients who had a test in a physician’s office or urgent care immediately prior to the ED visit. We excluded patients whose 30-day vital status could not be ascertained, those who died in the ED before vital signs were recorded and those who did not have full code status.
Study outcome
The primary outcome was death within 30 days of the index visit. The 4C Mortality Score’s predictive accuracy was measured by the area under the ROC curve (AUROC) and mean prediction error.
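As a rough illustration of the discrimination measure named above, the AUROC can be read as the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. The Python sketch below computes it that way. The function name `auroc` and the toy scores are hypothetical; they are not taken from the RECOVER data or from the authors' SAS/Stata code.

```python
from typing import Sequence


def auroc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: risk score per patient (higher = predicted higher risk)
    died:   1 if the patient died within the follow-up window, else 0
    """
    deaths = [s for s, d in zip(scores, died) if d == 1]
    survivors = [s for s, d in zip(scores, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for s_dead in deaths:
        for s_alive in survivors:
            if s_dead > s_alive:
                concordant += 1.0   # correctly ranked pair
            elif s_dead == s_alive:
                concordant += 0.5   # tie counts as half
    return concordant / (len(deaths) * len(survivors))


# Hypothetical toy data: six patients with integer scores.
toy_scores = [2, 5, 5, 9, 12, 15]
toy_died = [0, 0, 1, 0, 1, 1]
print(round(auroc(toy_scores, toy_died), 3))
```

The pairwise loop is quadratic in the number of patients, which is fine for a demonstration; a rank-based formulation would be the usual choice for a cohort of several thousand.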
Data preparation and statistical analysis
For comparison with the other cohorts, we report the median and IQR of continuous variables—both in the entire cohort and in the subgroup who died within 30 days—and compared the median values using the rank sum test. We performed univariable analysis on 26 independent variables that were included on the data collection form using complete (non-missing) data and reporting risk ratios for 30-day mortality. For risk ratio reporting of continuous variables, we chose category boundaries based on the 4C Mortality Score (age, respiratory rate, oxygen saturation, blood urea nitrogen (BUN) and CRP) or other published mortality prediction models (body mass index (BMI), creatinine, total bilirubin).19
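To make the univariable risk ratio concrete, the sketch below computes a risk ratio and an approximate 95% CI from a 2×2 table using the common log-scale (Katz) interval. This is an assumption for illustration: the source does not state which interval method the authors used, and the counts shown are hypothetical, not RECOVER data.

```python
import math
from typing import Tuple


def risk_ratio(deaths_exposed: int, n_exposed: int,
               deaths_unexposed: int, n_unexposed: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Return (risk ratio, lower 95% limit, upper 95% limit).

    Uses the log-scale standard error
    sqrt(1/a - 1/n1 + 1/c - 1/n0), which requires nonzero death counts.
    """
    risk_exposed = deaths_exposed / n_exposed
    risk_unexposed = deaths_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed

    se_log_rr = math.sqrt(
        1 / deaths_exposed - 1 / n_exposed
        + 1 / deaths_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 exposed patients died vs 15 of 100 unexposed.
rr, lo, hi = risk_ratio(30, 100, 15, 100)
print(f"RR={rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval built this way is symmetric on the log scale, so it will not match intervals derived by other methods (for example, from a fitted regression model).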
We selected variables for our multivariable logistic regression model based on the 4C Mortality Score, other prior studies and clinical judgement. The RECOVER dataset has complete data (<1.5% missing) on most variables, with the exception of CRP, BMI, bilirubin and smoking status. For our multivariable analysis, we used imputed values for missing data using Stata’s implementation of the data augmentation algorithm.20 We report multivariable risk ratios with 95% CIs. Statistical analysis was performed using SAS Enterprise Guide V.8.3 and Stata/SE V.16.1.21
We replicated the 4C Mortality Score described by Knight et al with one modification.17 Since we did not have a variable for Glasgow Coma Scale score or confusion on examination, we used the symptom ‘altered mental status or confusion’ instead. In addition to the full score, we tested a modified score dropping CRP, which was missing in 39% of the records. We evaluated discrimination and calibration using nine score categories available from Knight et al (0–2, 3–4, 5–6, 7–8, 9–10, 11–12, 13–14, 15–16, ≥17). We used the mortality reported in the 4C validation dataset as our predicted risks for comparison with observed mortality. For reporting, we pooled the results into the four risk groups defined by Knight et al: Low: 0–3; Intermediate: 4–8; High: 9–14; and Very High: ≥15. The AUROC was calculated with 95% CI using the DeLong method.22 Calibration was assessed using a standard calibration table, mean prediction error and the square root of both the calibration error and the Brier Score. We also used a modified Bland-Altman-style calibration plot.23
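The two groupings described above (nine score categories and four risk groups) can be sketched as simple lookups. The function names `score_category` and `risk_group` are hypothetical; the cut points are the ones listed in the text.

```python
def score_category(score: int) -> str:
    """Map a 4C Mortality Score to one of the nine categories in the text:
    0-2, 3-4, 5-6, 7-8, 9-10, 11-12, 13-14, 15-16, >=17."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= 2:
        return "0-2"
    if score >= 17:
        return ">=17"
    # Remaining categories are pairs starting at an odd number (3-4, 5-6, ...).
    lower = score if score % 2 == 1 else score - 1
    return f"{lower}-{lower + 1}"


def risk_group(score: int) -> str:
    """Map a 4C Mortality Score to the four pooled risk groups in the text."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= 3:
        return "Low"
    if score <= 8:
        return "Intermediate"
    if score <= 14:
        return "High"
    return "Very High"


for s in (0, 3, 4, 9, 15, 20):
    print(s, score_category(s), risk_group(s))
```

Note that the 3–4 category spans two risk groups (a score of 3 is Low, a score of 4 is Intermediate), so the nine categories do not nest cleanly inside the four groups.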
Results
Of 26 914 patients in the first version of the registry, 6822 met the inclusion criteria for this analysis (≥18 years old, SARS-CoV-2+, hospitalised from the ED, full code status). We excluded 11 who were missing vital status at 30 days and 9 who died in the ED prior to vital signs, leaving 6802 in the analysis cohort. Of the 6802, 1149 (16.9%) died within 30 days. The median age of patients in the cohort was 64 years (IQR 52–75); 56.2% were male; and 61.4% had at least one comorbid condition. Of note, the median oxygen saturation was 92% (IQR 87%–95%) overall and 86% (IQR 76%–93%) in those who died (p<0.0001) (table 2).
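The cohort arithmetic in this paragraph can be checked directly. The variable names below are illustrative; the numbers are the ones reported in the text.

```python
# Counts as reported in the Results text.
met_inclusion = 6822
missing_vital_status = 11
died_before_vitals = 9
deaths_30_day = 1149

# Analysis cohort after the two exclusions.
analysis_cohort = met_inclusion - missing_vital_status - died_before_vitals

# Crude 30-day mortality as a percentage of the analysis cohort.
mortality_pct = 100 * deaths_30_day / analysis_cohort

print(analysis_cohort, round(mortality_pct, 1))  # prints: 6802 16.9
```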
Table 2 Patient characteristics, clinical characteristics and 30-day mortality of SARS-CoV-2+ patients hospitalised from the emergency department
Of the demographic risk factors, age group, male sex and residence in a nursing home/assisted living facility were the principal mortality predictors (table 3). In the multivariable model, age 80+ years increased 30-day mortality risk by a factor of 5.79 (95% CI 4.23 to 7.34); male sex increased it by 1.17 (95% CI 1.05 to 1.28); and nursing home/assisted living facility residence increased it by 1.29 (95% CI 1.1 to 1.48). On a univariable basis, Hispanic ethnicity and smoking status were associated with lower mortality risk, but after adjusting for other variables, including the younger age of Hispanics and smokers, the risk ratio for mortality for Hispanic ethnicity was 0.96 (95% CI 0.82 to 1.1) and for smoking was 1.02 (95% CI 0.82 to 1.23).
Table 3 Effect of patient and clinical characteristics on 30-day mortality of SARS-CoV-2+ patients hospitalised from the emergency department
In the univariable analysis, extreme obesity (BMI≥40) did not increase risk, but after adjustment for age, sex and other comorbidities, the risk ratio for BMI≥40 was 1.44 (95% CI 1.23 to 1.64). In addition to obesity, the multivariable analysis (table 3) showed that other comorbidities associated with increased risk of death were liver disease as indicated by a total bilirubin ≥2.0 mg/dL and kidney disease as indicated by a creatinine ≥1.2 mg/dL or BUN ≥40. Asthma and diabetes were not significant risk factors. Patients who arrived from a nursing home had an increased risk of mortality (risk ratio 1.29, 95% CI 1.1 to 1.48).
Table 3 also shows that increase in respiratory rate, decrease in oxygen saturation and increase in CRP each corresponded with an increase in mortality.
Compared with the 4C validation dataset from Knight et al, the mean 4C Mortality Scores were lower in our dataset (mean score 9.0 vs 10.6) (figure 1A). Using nine 4C score categories, the AUROC from the RECOVER dataset was comparable to that of the original 4C validation dataset (AUROC: RECOVER 0.786 (95% CI 0.773 to 0.799) versus 4C validation 0.763 (95% CI 0.757 to 0.769)) (figure 1B). Our observed category-specific mortalities were lower than those in the 4C validation dataset. Using the mortalities from the 4C validation dataset would have overestimated risk by 6.0% on average (mean prediction error 6.0%; √Calibration error 0.066; and √Brier Score 0.350) (table 4, figure 1C).
Figure 1 Comparison of 4C (Coronavirus Clinical Characterisation Consortium) validation and RECOVER datasets. (A) 4C Mortality Scores were lower in the RECOVER dataset than in the original 4C validation dataset. (B) ROC curves for the 4C Mortality Score (categorised into the nine ranges from (A)) in the 4C validation dataset and the RECOVER dataset. (C) Calibration plot (modified Bland-Altman) showing prediction error versus observed mortality for the 4C Mortality Score with and without the C reactive protein (CRP) component. Points from left to right are in the 4C Mortality Score ranges shown in figure (A) from left to right. AUROC, area under the ROC curve; RECOVER, REgistry of suspected COVID-19 in EmeRgency care.
Table 4 Comparison of observed mortality by 4C mortality risk group for RECOVER dataset of SARS-CoV-2+ patients hospitalised from the emergency department
Dropping CRP from the 4C Mortality Score reduced the scores overall (mean score 7.7) but did not substantially change discrimination (AUROC 0.776, 95% CI 0.762 to 0.790). Dropping the CRP component did affect calibration. The category-specific mortalities in our dataset were now close to those in the 4C validation dataset. Using the mortalities from the 4C validation dataset would have misestimated risk by 0.7% on average (Mean prediction error 0.7%, √Calibration error 0.017 and √Brier Score 0.346).
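The three calibration summaries reported above can be sketched from a category-level table. The definitions used here are assumptions, since the source does not spell them out: mean prediction error is taken as the patient-weighted mean of predicted minus observed risk, calibration error as the patient-weighted mean squared difference between predicted and observed category risk, and the Brier Score as the mean squared difference between each patient's predicted risk and their 0/1 outcome. The function name and the two-category table are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# One row per score category: (patients, deaths, predicted risk).
Row = Tuple[int, int, float]


def calibration_summary(rows: List[Row]) -> Dict[str, float]:
    total = sum(n for n, _, _ in rows)

    # Patient-weighted mean of (predicted - observed); positive = overestimate.
    mean_pred_error = sum(n * (p - d / n) for n, d, p in rows) / total

    # Patient-weighted mean squared gap between predicted and observed risk.
    calib_error = sum(n * (p - d / n) ** 2 for n, d, p in rows) / total

    # Brier score from grouped data: deaths contribute (1 - p)^2,
    # survivors contribute p^2.
    brier = sum(d * (1 - p) ** 2 + (n - d) * p ** 2 for n, d, p in rows) / total

    return {
        "mean_prediction_error": mean_pred_error,
        "sqrt_calibration_error": math.sqrt(calib_error),
        "sqrt_brier": math.sqrt(brier),
    }


# Hypothetical table: predicted risk is too high in the first category only.
toy_rows = [(100, 5, 0.10), (100, 30, 0.30)]
for name, value in calibration_summary(toy_rows).items():
    print(f"{name}: {value:.3f}")
```

Under these definitions a score can have a small mean prediction error while the Brier Score barely moves, because the Brier Score is dominated by the outcome variance within each category rather than by the gap between predicted and observed category risk.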
Discussion
In this analysis of multicentre data from the RECOVER network, our results confirmed several previous findings for risk factors for COVID-19 mortality, including older age, comorbidities, BMI ≥40 kg/m², higher respiratory rate and lower oxygen saturation.4–9 11–14 In addition, as reported by Grasselli et al in critically ill patients, we observed that male sex is predictive of mortality.7 We also observed the expected, but previously unquantified, finding that arrival to the emergency care setting from a nursing home was associated with increased mortality. While this has not been specifically mentioned in other studies, Ferrando-Vivas et al found that functional dependence was related to mortality (HR 1.425).5
In the RECOVER network, COVID-19 positivity was higher among Hispanic patients when compared with non-Hispanics, but the adjusted mortality among hospitalised Hispanic patients is similar to hospitalised non-Hispanic whites.24 Similarly, Mackey et al reported that hospitalisations for COVID-19 among those who identify their ethnicity as Hispanic were proportionately higher than for their non-Hispanic white counterparts but the case fatality rate was similar between Hispanic and non-Hispanic patients.25
We also found that comorbid conditions such as liver disease, defined as total bilirubin ≥2.0 mg/dL, and kidney disease, defined as creatinine ≥1.2 mg/dL or BUN ≥40, had an independent association with 30-day mortality in hospitalised SARS-CoV-2+ patients. Surprisingly, previous studies and our results did not establish diabetes as a significant risk factor.26–28 Our findings on the association of smoking with 30-day mortality did not concur with previous studies. Smoking and cumulative smoking exposure were predictive of mortality in previous studies,26 but we did not find a statistically significant association after controlling for other variables. Finally, among the clinical variables, tachypnoea (respiratory rate ≥20) and hypoxaemia (oxygen saturation <92%) were significant predictors of mortality. Zhao et al reported higher odds of mortality (adjusted OR 4.8) for an oxygen saturation <92%.13
Given the multiplicity of variables associated with 30-day mortality, clinicians need a simple score to better predict short-term mortality. The 4C Mortality Score is one such score and it performed well in our dataset. Discrimination was good (AUROC 0.786), and calibration was also good, although using the category-specific mortalities from the 4C validation dataset would have overestimated risk. CRP was missing in 39% of the records in our study, so we examined the performance of the 4C Mortality Score without the CRP component. Discrimination remained good, and the category-specific risks from the 4C validation were accurate. When CRP was removed from the score, many patients with high CRP values moved into a lower risk category. Those patients who remained with high 4C Mortality Scores despite removal of CRP died at a higher rate than those whose risk score decreased, but those with high CRP values who moved to a lower risk group had higher mortality than the average for their new lower risk group. This is sometimes called the stage migration effect: when the high CRP patients moved from the very-high-risk group to the high-risk group, the average mortality went up in both groups. Based on our observations, we suggest using the 4C Mortality Score without the CRP component, but recalibrating risk estimates as per our table 4 or online supplemental table A. Using category-specific risks as opposed to the four risk groups (low, intermediate, high, very high) is preferred because it does not assume the distribution across the risk groups is the same in different populations. This modified 4C Mortality Score could assist with triage decisions, inform patients and their family members about prognosis, and help forecast resource utilisation in the hospital.
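The stage migration effect described above can be illustrated with a small numeric sketch. All counts are hypothetical and chosen only to show the mechanism: patients who move down from the very-high-risk group have lower mortality than those who stay, but higher mortality than the group they join, so the average rises in both groups while overall mortality is unchanged.

```python
from typing import List, Tuple

# Each subgroup is (patients, deaths). All numbers are hypothetical.
Subgroup = Tuple[int, int]


def mortality(subgroups: List[Subgroup]) -> float:
    """Pooled mortality across one or more subgroups."""
    patients = sum(n for n, _ in subgroups)
    deaths = sum(d for _, d in subgroups)
    return deaths / patients


stayers = (100, 70)         # remain very high risk without CRP (70% die)
movers = (100, 50)          # drop to high risk once CRP is removed (50% die)
original_high = (200, 60)   # already in the high-risk group (30% die)

# Grouping with the CRP component.
very_high_before = mortality([stayers, movers])
high_before = mortality([original_high])

# Grouping after dropping CRP: the movers join the high-risk group.
very_high_after = mortality([stayers])
high_after = mortality([original_high, movers])

print(f"very high: {very_high_before:.3f} -> {very_high_after:.3f}")
print(f"high:      {high_before:.3f} -> {high_after:.3f}")
print(f"overall:   {mortality([stayers, movers, original_high]):.3f}")
```

Because regrouping changes the case mix within each group, group-level risk estimates from one population may not transfer to another with a different score distribution, which is consistent with the preference stated above for category-specific risks.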
The nature of the COVID-19 pandemic greatly accelerated the timeline of related research and has resulted in rapid changes to practice patterns and patient presentation. At the time of this study, the 4C Mortality Score was among the most promising risk evaluation tools and had been identified as having a low likelihood of bias. Since the inception of our study to validate this score, many other systems have been proposed. These have been developed in a variety of different patient populations using a wide range of methods.27–35 Some models have been independently assessed and performance varies.36 Updates to a systematic review of prediction models continue to identify the prognostic 4C Mortality Score as among the most promising,37 suggesting that attempts to validate and calibrate this and other existing risk estimation models could aid providers in the evaluation of the many available scoring systems for patients with COVID-19 disease.
Limitations
This is a national study of hospitalised SARS-CoV-2+ patients. The large sample size, the number and diversity of the participating sites, and a comprehensive list of data elements are major strengths of this study. However, some sites contributed more SARS-CoV-2+ patients than others. We did see regional differences in 30-day mortality, but these did not affect the risk ratios. As noted above, CRP was missing in almost 39% of patients.
Additional limitations are related to the nature of the COVID-19 pandemic and the changes in patient population and clinical practices that have occurred over time. The data in this study represent a time period early in the pandemic (on or before September 2020) and thus may not fully account for practice changes. However, these data align with the time period of the RECOVERY trial, which introduced the main practice change affecting mortality (use of dexamethasone) in February 2021.38
Conclusions
We conclude that among SARS-CoV-2+ hospitalised patients, older patients with comorbid conditions and those with hypoxaemia at the time of presentation have a very high risk of dying within 30 days. We independently validate the 4C Mortality Score as predicting risk of death in hospitalised SARS-CoV-2+ patients, but we recommend dropping the CRP component of the score and using our recalibrated mortality risk estimates.
Ethics statements
Patient consent for publication
Ethics approval
The RECOVER registry protocol was reviewed by the institutional review boards (IRBs) at all sites; three IRBs provided approval with waiver of informed consent. All others provided an exemption from human subjects designation. All data were anonymised prior to analysis (local IRB: 56234). Participants gave informed consent to participate in the study before taking part.
Footnotes
Twitter @ajunegordon
Collaborators The RECOVER Investigators Group: Thomas Aufderheide, MD, Medical College of Wisconsin, Milwaukee, WI, USA. Joshua Baugh, MD, Massachusetts General Hospital, Boston, MA, USA. David Beiser, MD, University of Chicago School of Medicine, Chicago, IL, USA. Joseph Bledsoe, MD, Healthcare Delivery Institute, Intermountain Healthcare, Salt Lake City, UT, USA. Edward Castillo, PhD, University of California at San Diego, San Diego, CA, USA. Makini Chisholm-Staker, MD, Mt. Sinai School of Medicine, New York, NY, USA. D. Mark Courtney, MD, University of Texas Southwestern, Dallas, TX, USA. Elizabeth Goldberg, MD, Warren Alpert Medical School of Brown University, Providence, RI, USA. Hans House, MD, University of Iowa School of Medicine, Iowa City, IA, USA. Stacey House, MD, Washington University School of Medicine, St. Louis, MO, USA. Timothy Jang, MD, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA. Christopher Kabrhel, MD, Massachusetts General Hospital, Boston, MA, USA. Stephen Lim, MD, Louisiana State University School of Medicine, New Orleans, LA, USA. Troy Madsen, MD, University of Utah School of Medicine, Salt Lake City, UT, USA. Danielle McCarthy, MD, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA. Andrew Meltzer, MD, George Washington University School of Medicine, Washington, DC, USA. Stephen Moore, MD, Penn State Milton S. Hershey Medical Center, Hershey, PA, USA. Mark B. Mycyk, MD, Cook County Hospitals, Chicago, IL, USA. Craig Newgard, MD, Oregon Health and Science University, Portland, OR, USA. Kristen E. Nordenholz, MD, University of Colorado School of Medicine, Aurora, CO, USA. Justine Pagenhardt, MD, West Virginia University School of Medicine, Morgantown, WV, USA. Ithan Peltan, MD, Intermountain Healthcare, Salt Lake City, UT, USA. Katherine L. Pettit, MD, Indiana University School of Medicine, Indianapolis, IN, USA. Michael Pulia, MD, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA. Michael Puskarich, MD, Hennepin County Medical Center and the University of Minnesota, Minneapolis, MN, USA. Lauren Southerland, MD, Ohio State University Medical Center, Columbus, OH, USA. Scott Sparks, MD, Riverside Regional Medical Center, Newport News, VA, USA. Danielle Turner-Lawrence, MD, Beaumont Health, Royal Oak, MI, USA. Marie Vrablik, MD, University of Washington School of Medicine, Seattle, WA, USA. Alfred Wang, MD, Indiana University School of Medicine, Indianapolis, IN, USA. Anthony Weekes, MD, Atrium Health, Charlotte, NC, USA. Lauren Westafer, MD, Baystate Health, Springfield, MA, USA. John Wilburn, MD, Wayne State University School of Medicine, Detroit, MI, USA.
Contributors PG, MAK, AJG, CB, CC and JK conceived and designed the study, JK obtained research funding, designed and organised the registry. MAK and LM provided statistical advice on study design and analysed the data. AJG, MAK and PG drafted the manuscript, and all authors contributed substantially to its revision. PG takes responsibility for the paper as a whole and acts as guarantor.
Funding This study was funded by Department of Emergency Medicine, Indiana University School of Medicine. No Grant Number.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.