Objectives Current mortality prediction models used in the intensive care unit (ICU) have limited applicability to specific diseases such as influenza. We aimed to establish an explainable machine learning (ML) model for predicting mortality in critically ill influenza patients using a real-world severe influenza data set.
Study design A cross-sectional retrospective multicentre study in Taiwan.
Setting Eight medical centres in Taiwan.
Participants A total of 336 patients requiring ICU admission for virology-proven influenza at eight hospitals during an influenza epidemic between October 2015 and March 2016.
Primary and secondary outcome measures We employed extreme gradient boosting (XGBoost) to establish the prediction model, compared the performance with logistic regression (LR) and random forest (RF), demonstrated the feature importance categorised by clinical domains, and used SHapley Additive exPlanations (SHAP) for visualised interpretation.
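The model comparison described above rests on the area under the receiver operating characteristic curve (AUC) with a 95% CI. The following is a minimal sketch of how such an estimate could be computed, assuming a percentile bootstrap for the interval; the abstract does not state which CI method was used, and the function names, the number of resamples and the toy labels and scores are illustrative assumptions rather than study data.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one outcome class
        stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels (1 = died within 30 days) and predicted risks; NOT study data.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.90]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
```

Running the same two functions on the held-out predictions of each candidate model (XGBoost, RF and LR) would give one AUC and interval per model, which is the form in which the results are reported.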
Results The data set contained 76 features of the 336 patients with severe influenza. Disease severity was high, as shown by the Acute Physiology and Chronic Health Evaluation II score (22, 17 to 29) and the pneumonia severity index score (118, 88 to 151). The XGBoost model (area under the curve (AUC): 0.842; 95% CI 0.749 to 0.928) outperformed RF (AUC: 0.809; 95% CI 0.629 to 0.891) and LR (AUC: 0.701; 95% CI 0.573 to 0.825) for predicting 30-day mortality. To give clinicians an intuitive understanding of how the model used the features, we stratified features by clinical domain. The cumulative feature importance in the fluid balance domain, ventilation domain, laboratory data domain, demographic and symptom domain, management domain and severity score domain was 0.253, 0.113, 0.177, 0.140, 0.152 and 0.165, respectively. We further used SHAP plots to illustrate associations between features and 30-day mortality in critically ill influenza patients.
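The domain-level figures above are sums of per-feature importances within each clinical domain. The sketch below shows that aggregation under stated assumptions: the feature names, their domain assignments and their importance values are hypothetical placeholders, since the abstract reports only the six domain totals. The final check uses the reported totals themselves, which add up to 1.000.

```python
from collections import defaultdict


def cumulative_importance_by_domain(importances, feature_to_domain):
    """Sum per-feature importances within each clinical domain."""
    totals = defaultdict(float)
    for feature, value in importances.items():
        totals[feature_to_domain[feature]] += value
    return dict(totals)


# Hypothetical features and importances for illustration only; NOT study data.
example_importances = {
    "fluid_balance_day_1": 0.30,
    "fluid_balance_day_2": 0.20,
    "peak_airway_pressure": 0.25,
    "apache_ii": 0.25,
}
example_domains = {
    "fluid_balance_day_1": "fluid balance",
    "fluid_balance_day_2": "fluid balance",
    "peak_airway_pressure": "ventilation",
    "apache_ii": "severity score",
}
example_totals = cumulative_importance_by_domain(example_importances, example_domains)

# Domain totals as reported in the Results paragraph.
reported_totals = {
    "fluid balance": 0.253,
    "ventilation": 0.113,
    "laboratory data": 0.177,
    "demographic and symptom": 0.140,
    "management": 0.152,
    "severity score": 0.165,
}
reported_sum = round(sum(reported_totals.values()), 3)
largest_domain = max(reported_totals, key=reported_totals.get)
```

Because the importances are normalised to sum to one, each domain total can be read as that domain's share of the model's overall feature importance.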
Conclusions We used a real-world data set and applied an ML approach, mainly XGBoost, to establish a practical and explainable mortality prediction model in critically ill influenza patients.
- adult intensive & critical care
- information technology
- infectious diseases & infestations
- thoracic medicine
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
C-AH and C-MC contributed equally.
Contributors Study concept and design: C-AH, C-MC, Y-CF, S-JL, H-CW, W-FF, C-CS, W-CP, K-YY, K-CK, C-LW, C-ST, M-YL and W-CC. Acquisition of data: Y-CF, C-MC and W-CC. Analysis and interpretation of data: W-CC, Y-CF, C-MC, C-AH and M-YL. Drafting the manuscript: W-CC.
Funding This study was supported in part by grants from Veterans General Hospitals and the University System of Taiwan Joint Research Program (VGHUST108-G2-4-2). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Taichung Veterans General Hospital CE16093A, National Taiwan University Hospital 201605036RIND, Taipei Veterans General Hospital 2016-05-020CC, Tri-Service General Hospital 1-105-05-086, Chang Gung Memorial Hospital 201600988B0, China Medical University Hospital 105-REC2-053(FR), Kaohsiung Medical University Hospital KUMHIRB-E(I)-20170097, Kaohsiung Chang Gung Memorial Hospital 201600988B0.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information. The data set is also publicly available on GitHub at https://github.com/GitEricLin/BMJOpen/.