Introduction About 42 million surgeries are performed annually in the USA. While the postoperative mortality is less than 2%, 12% of all patients in the high-risk surgery group account for 80% of postoperative deaths. New onset of haemodynamic instability is common in surgical patients and its delayed treatment leads to increased morbidity and mortality. The goal of this proposal is to develop, validate and test real-time intraoperative risk prediction tools based on clinical data and high-fidelity physiological waveforms to predict haemodynamic instability during surgery.
Methods and analysis We will initiate our work using an existing annotated intraoperative database from the University of California Irvine, including clinical and high-fidelity waveform data. These data will be used for the training and development of the machine learning model (Carnegie Mellon University) that will then be tested on a prospectively collected database (University of California Los Angeles). Simultaneously, we will use existing knowledge of haemodynamic instability patterns derived from our intensive care unit cohorts, Medical Information Mart for Intensive Care II data, University of California Irvine data and animal studies to create smart alarms and a graphical user interface for clinical decision support. Using machine learning, we will extract a core dataset, which characterises the signatures of normal intraoperative variability, various haemodynamic instability aetiologies and variable responses to resuscitation. We will then employ clinician-driven iterative design to create a clinical decision support user interface, and evaluate its effect in simulated high-risk surgeries.
Ethics and dissemination We will publish the results in a peer-reviewed publication and will present this work at professional conferences for the anaesthesiology and computer science communities. Patient-level data will be made available within 6 months after publication of the primary manuscript. The study has been approved by the University of California, Los Angeles Institutional Review Board (IRB #19-000354).
- Machine learning
- blood pressure
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
To impact surgical practice by better and earlier identifying patients at greatest risk for cardiorespiratory instability and by providing point-of-care data-driven explanations and process-specific resuscitation using real-time data input and point-of-care management, potentially decreasing preventable surgical morbidity and mortality.
Our deliverables will create a sensitive and specific means to predict which patients may or may not ever develop cardiorespiratory instability, which has important implications for patient safety, surveillance, triage and care.
The modelling methods we propose to develop should drastically shift intraoperative paradigms and treatment protocols and they will be applicable to existing monitoring modalities.
Our machine learning analytics have potential to identify new monitoring parameters to improve prediction of instability and, in our exploratory specific aim, to reverse engineer our understanding of disease aetiology during the surgery.
Although we will focus on high-risk surgery patients, we will collect data on the whole surgical population, so that we will also develop a novel database on low-risk patients. The main limitation of this study is that it is a retrospective study.
About 42 million surgeries are performed annually in the USA.1 2 While the estimated postoperative mortality is less than 2%, 12% of all patients in the high-risk surgery group account for 80% of postoperative deaths.3 4 To assist in guiding clinical decisions and prioritisation of care, several perioperative clinical risk scores have been proposed.5–7 The goal of these scores is to help plan clinical management and allocate resources to avoid postoperative complications and death. Recently, US hospitals have adopted electronic health record (EHR) documentation of patient care. Still, interoperability of these EHR systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offer in terms of understanding and improving care has not been realised. While physiological prediction tools have been developed in the critical care setting,8–11 the goal of this proposal is to develop, validate and test real-time intraoperative risk prediction tools based on EHR data and high-fidelity physiological waveforms to predict cardiorespiratory instability (CRI) in the perioperative/surgical setting. New onset of CRI is common in patients undergoing surgery. Delayed treatment of CRI leads to increased morbidity, mortality and use of resources in surgical patients. Even short periods of intraoperative hypotension (mean arterial pressure <65 mm Hg) have been linked to postoperative complications such as myocardial infarction and kidney failure12 13 and mortality.14 Although anaesthesiologists can rescue patients with CRI, decreasing the incidence of cardiac arrests and inpatient mortality,15–17 a more proactive approach would be to enable anaesthesiologists and nurse anaesthetists to recognise impending CRI before it happens.
Specific aim 1 (SA1). Development of the machine learning (ML) test set (University of California, Irvine (UCI)) and retrospective validation (UC Los Angeles (UCLA)) in high-risk surgery patients to identify subsequent CRI. Using ML analytics, we will extract a core dataset which best characterises the signatures of normal intraoperative variability, various intraoperative CRI aetiologies and variable responses to intraoperative resuscitation using (1) existing high-granularity physiological and clinical record data to define the structure of the database and (2) high-granularity intraoperative data from high-risk surgery patients for prospective input and further model development. Since our approaches use agnostic prioritisation of physiological signals, we will explore which processes underlie specific signatures. This reverse-engineering approach will give insight into cardiorespiratory homoeostasis and intraoperative CRI.
SA2. Prospective validation of a clinical decision support system (CDS) tool. We will employ clinician-driven iterative design to create a novel CDS user interface, and evaluate its effect in simulated intraoperative high-risk surgeries.
Methods and analysis
We propose to initiate our work using an existing annotated intraoperative database (UCI) including EHR and high-fidelity waveform data. These data will be used for the initial training and development of the ML model that will then be tested on a prospectively collected UCLA database (SA1). Simultaneously, we will use existing knowledge of CRI patterns derived from our step down and intensive care unit (SDU/ICU) cohorts, Medical Information Mart for Intensive Care II data, UCI data and animal studies (University of Pittsburgh) to create smart alarms and a graphical user interface (GUI) for CDS (SA2).
Patient population and data acquisition
The UCI dataset is based on the Bernoulli data collection system (Cardiopulmonary, New Haven, Connecticut, USA), and as of February 2019 it includes high-fidelity physiological waveforms and EHR18 data on more than 35 000 patients. All waveform and clinical data from surgical patients at UCI have been collected for research purposes since November 2015. The total UCI data collection as of February 2019 consists of >120 000 monitoring hours of waveform and clinical data or >4000 GBs of data. All waveform data are collected from surgical patient monitors with the Bernoulli software and equipment, and de-identified and synced retrospectively with clinical data (figure 1). The sampling rates for EKG, plethysmographic and arterial waveforms are 300, 100 and 120 Hz, respectively. Clinical data are extracted from intraoperative EHR (Surgical Information Systems (Alpharetta, Georgia, USA) and Epic (Verona, Wisconsin, USA)) and synced with waveform data. These data are then linked to monitoring and clinical annotations, where adverse events are documented. At UCLA, we have established a perioperative data warehouse including all the EHR data and we plan to instal an intraoperative data collection system similar to Bernoulli for waveform collection. The EHR data at UCLA have already been analysed as proof of concept.18 19 The Bernoulli and similar software (Bedmaster) provides a mission-critical application layer designed for multiparameter data abstraction, fusion, remediation, time synchronisation and real-time processing. Bernoulli provides an extensive distribution layer designed for export to third-party applications and EHR providers via HL7 or custom protocols. To create any predictive analytics, one must have a dataset free of significant artefacts. Alert artefacts greatly reduce accuracy of predictive models, which may misinform therapy and undermine response. Having experts manually annotate large amounts of data to identify all artefacts is impractical.
We developed an active ML approach to distinguish real vital sign alerts from artefacts. We found that increasing the amount of adjudicated training data improves accuracy of alert identification, but just 30% of labelled events are sufficient to confidently identify 77%±11% of all the remaining artefacts. We have also developed algorithms to automatically extract accurate clinical information from the EHR. In one project, we trained an algorithm to automatically extract duration of postoperative mechanical ventilation from the EHR after cardiac surgery. By incorporating three different data sources into our algorithm and by using preprogrammed clinical judgement to overcome common errors with data entry, our results proved to be more comprehensive, more accurate and required a fraction of the computation time compared with manual review. We will use these approaches to reduce the need for human expert annotation of monitoring events and to make our proposed large-scale data review realistic and manageable.
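As an illustration of the active learning idea described above, the following minimal sketch shows pool-based active learning with margin (uncertainty) sampling. It is not the authors' algorithm: the nearest-centroid classifier, the synthetic two-feature alerts, and the names `active_learning`, `margin` and `expert` are all assumptions chosen to keep the example self-contained.

```python
import math
import random


def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def margin(x, c_real, c_artefact):
    """Small margin = the two class centroids are almost equally close."""
    return abs(math.dist(x, c_real) - math.dist(x, c_artefact))


def active_learning(pool, oracle, seed_idx, budget):
    """Label `budget` extra alerts, always querying the most uncertain one.

    `seed_idx` must contain at least one alert of each class.
    """
    labelled = {i: oracle(pool[i]) for i in seed_idx}
    for _ in range(budget):
        unlabelled = [i for i in range(len(pool)) if i not in labelled]
        if not unlabelled:
            break
        c_real = centroid([pool[i] for i, y in labelled.items() if y == 1])
        c_art = centroid([pool[i] for i, y in labelled.items() if y == 0])
        query = min(unlabelled, key=lambda i: margin(pool[i], c_real, c_art))
        labelled[query] = oracle(pool[query])  # stands in for expert adjudication
    return labelled


# Synthetic stand-in for alert features (hypothetical); 1 = real alert, 0 = artefact.
rng = random.Random(0)
pool = [(rng.gauss(-1, 1), rng.gauss(0, 1)) for _ in range(50)]
pool += [(rng.gauss(1, 1), rng.gauss(0, 1)) for _ in range(50)]


def expert(x):
    """Hypothetical adjudicator used only for this demonstration."""
    return int(x[0] > 0)


seed = [next(i for i, p in enumerate(pool) if expert(p) == 1),
        next(i for i, p in enumerate(pool) if expert(p) == 0)]
labelled = active_learning(pool, expert, seed, budget=10)
```

The point of the design is that expert time is spent only on the alerts the current model finds hardest to classify, which is one way a small labelled fraction can go a long way.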
Identification of potentially informative biosignatures. In support of SA1, we plan to complement ML-developed feature extraction methods (time-window-gated statistics such as averages, variances, entropies and trends of univariate time series, cross-correlations between pairs of series or spectral decompositions, followed by compression with, eg, principal component analysis (PCA)) with massive-scale comprehensive searches for change points in the signal, to maximise the chance of adding information useful in detecting and characterising CRI as a dichotomous variable. CRI is defined both by exceedances of vital sign parameters (heart rate (HR), respiratory rate (RR), blood pressure (BP), SpO2) beyond thresholds and by EHR-defined resuscitative actions such as bolus of fluids and infusion of vasopressors and inotropes. Initially, we will strive for 15 min advanced warning for volume loss, 15 min for haemodynamically unstable arrhythmia, 10 min for postabdominal insufflation haemodynamic instability and 5 min for postanaesthesia induction hypotension, prior to overt clinical signs and symptoms of CRI.
Some of these approaches have been used in the ICU setting to predict sepsis but have never been used in the perioperative/surgical setting.8–11 We have successfully applied such methods to problems involving large amounts of multivariate time series data.20 Next, an ML algorithm uses a sample of annotated training data to identify empirically a subset of those change points that bring predictive value to models of CRI (supervised learning). In addition, we will take our quest for precursors of CRI outside of the usual single-signal or joint multisignal modelling, by learning structures of multivariate correlations between pairs of signals, and tracking them over time. A novel use of canonical correlation analysis (CCA) will be the first approach to try in this context, and we will extend it to adopt temporal regularisation constraints to enable smooth transitions between consecutive time frames. Our novel approach will let us discover pairs of combinations of features, extracted independently from two time series, that highly correlate with each other. CCA identifies pairs of such ‘principal components’ that could be learnt from the same null space data as shown in the context of the PCA-based approach. If certain pairs of components are found to correlate consistently during stability but lose correlation at or before the onset of a CRI (or vice versa), they will boost the detectability of our target events. We will perform extensive searches for such CCA components and add them to the pool of factors worth tracking. We will use regularisation and feature ablation to maintain parsimony of the resulting models, and to identify features of data that have key contributory effects on performance.
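The time-window-gated statistics mentioned above can be illustrated with a short sketch. It computes the mean, variance and least-squares trend of a univariate series over consecutive non-overlapping windows. The function name `window_features`, the 1 Hz sampling rate, the 10 s window and the synthetic falling pressure trace are assumptions for illustration, not values from the study.

```python
from statistics import fmean, pvariance


def window_features(samples, fs_hz, window_s):
    """Mean, variance and linear trend for consecutive non-overlapping windows.

    The window must span at least two samples so that a slope is defined.
    """
    n = int(fs_hz * window_s)
    t = [i / fs_hz for i in range(n)]          # time axis within a window (s)
    t_mean = fmean(t)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    features = []
    for start in range(0, len(samples) - n + 1, n):
        w = samples[start:start + n]
        w_mean = fmean(w)
        sxy = sum((ti - t_mean) * (wi - w_mean) for ti, wi in zip(t, w))
        features.append({
            "mean": w_mean,
            "variance": pvariance(w),
            "slope_per_s": sxy / sxx,          # least-squares trend
        })
    return features


# Hypothetical mean arterial pressure trace falling by 0.5 mm Hg per second.
map_mmhg = [80 - 0.5 * i for i in range(20)]
feats = window_features(map_mmhg, fs_hz=1, window_s=10)
```

Entropies, cross-correlations and spectral features would be added to the same per-window dictionary before any compression step such as PCA.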
We expect to identify such features, which would not be visible to alternatives (either using many independent single-stream models or a single (fully combined) joint model for null-space PCA-like modelling of baseline variance, or via bivariate cross-correlations of pairs of time series).
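To make the idea of tracking canonical correlations over time concrete, here is a minimal sketch using NumPy. It computes the largest canonical correlation between two feature blocks in consecutive windows via QR decomposition followed by a singular value decomposition. It does not include the temporal regularisation the protocol proposes, it assumes each windowed block has full column rank, and the synthetic data and function names are assumptions.

```python
import numpy as np


def first_canonical_corr(X, Y):
    """Largest canonical correlation between blocks X (n x p) and Y (n x q)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centred blocks.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def windowed_canonical_corr(X, Y, window, step):
    """First canonical correlation in each sliding window."""
    return [first_canonical_corr(X[i:i + window], Y[i:i + window])
            for i in range(0, len(X) - window + 1, step)]


# Synthetic example: two feature blocks coupled in the first half,
# decoupled in the second half (a stand-in for loss of correlation).
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 2))
mix = np.array([[1.0, 0.5], [-0.3, 1.0]])
Y = X @ mix + 0.1 * rng.normal(size=(n, 2))
Y[n // 2:] = rng.normal(size=(n - n // 2, 2))
rhos = windowed_canonical_corr(X, Y, window=250, step=250)
```

A sustained drop in the windowed value, as in the second half of this synthetic series, is the kind of signature the text proposes adding to the pool of candidate predictors.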
Learning predictive models of instability
We will use predictive ML models trained on the UCI annotated data to empirically identify the parsimonious set of predictors of the targeted pathologies, selected from features extracted as described above. In one instance of our preliminary work, primary observables (eg, EKG signals, BP, central venous and pulmonary arterial pressure, and pulse oximetry waveforms) are processed to generate beat-to-beat HR, as well as the various diastolic and systolic pressure and oximetry signals. These features become inputs to a random forest regression model that learns to predict the time since start of bleed (equivalent to the amount of blood lost given fixed bleed rate in the referred experiment), and in the process identifies a subset of all features that jointly yields optimal performance. We also create 1 and 5 min moving averages of these derived variables, as well as often reported measures of HR variability along several domains (time and frequency domains, non-linear measures).21 The time windows determining the features from the raw electrical signal and output classes will vary according to aims. In SA1, we will enrich the development set to include a significant proportion of time windows from patients in CRI, thus, the model will predict either one of the CRI states, or absence of CRI. In SA1, models predictive of fluid resuscitation will include patients in CRI and the output (fluid responsiveness) is binary. We will use the activity monitoring operating characteristic analysis as described above to validate models’ sensitivity and specificity. To take our CDS beyond capability to detect intraoperative bleeding at a single specific rate, we will collect human data, retrospectively annotated by OR and ICU physicians for CRI events and apply the Bayesian aggregation (BA) method.22 It accumulates evidence from subsequent measurements and tracks multiple hypotheses as their posterior probability distribution evolves while new evidence becomes available.
BA will be our foundational approach to characterisation of detected and predicted events of interest whenever low signal to noise ratios in the available data would render more direct regression models impractical. In particular, it will detect functional hypovolaemia, and estimate the rate of intravascular fluid loss.
For our exploratory aim, we will perform explanatory analysis of the learnt predictive model asking what physiological control qualities of the primary predictors of the models best explain their predictive power. In this way, we will explore foundations of autoregulation, adaptation and failure in the context of CRI. In ML experiments, we will favour the types of models that avail explainability of generated results. We had success applying random forest classifiers and regressors in applications that require a high level of user interaction and extensive explanations of predictions made.23 We have also tried our RIPR algorithm24 to support interactive adjudication of alerts as either true episodes of instability or artefacts.25 It relies on point estimators for conditional entropy and recovers a desirably small set of projections of data which accurately classify test alerts while remaining intuitive to humans. New alerts can be adjudicated using one of the projections from the retrieved set. We will extend this approach to systematically include semisupervised and active learning concepts to support semiautomated annotation of large-scale data sets given limited availability of qualified human experts. We have also shown combined predictive and explanatory utility of learning temporal association rules from asynchronous sequences of discrete events and continuous signals with the TITARL algorithm. We have shown that it can be used to identify which of the potentially large number of patterns detected in data coincide or precede particular events of interest, and present the results in a readable and interpretable form of manageably simple logical statements.
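The evidence-accumulation idea behind Bayesian aggregation can be sketched as a sequential Bayes update over a small set of competing hypotheses. This is a generic illustration, not the method of reference 22: the three hypotheses, their expected per-minute pressure changes, the Gaussian noise model and its standard deviation are all assumptions.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def bayes_aggregate(prior, expected, observations, sd):
    """Update the posterior over hypotheses after each new observation."""
    posterior = dict(prior)
    history = []
    for obs in observations:
        unnorm = {h: posterior[h] * gaussian_pdf(obs, expected[h], sd)
                  for h in posterior}
        total = sum(unnorm.values())
        posterior = {h: v / total for h, v in unnorm.items()}
        history.append(posterior)
    return history


# Hypothetical expected change in mean arterial pressure (mm Hg per minute).
expected_change = {"stable": 0.0, "slow_bleed": -0.5, "fast_bleed": -2.0}
prior = {h: 1 / 3 for h in expected_change}
observed_change = [-1.8, -2.2, -1.9, -2.1]   # noisy synthetic measurements
history = bayes_aggregate(prior, expected_change, observed_change, sd=1.0)
final = history[-1]
```

Because each noisy measurement only nudges the posterior, a hypothesis gains dominance through consistent evidence rather than a single reading, which suits the low signal to noise setting described above.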
We will extend this approach so that the most predictive combinations of patterns and states that can asynchronously appear in multivariate clinical data, irrespective of the temporal resolution of their observation (waveforms, beat to beat, breath to breath, disparate clinical records and demographic data) or missingness of data (frequent in haemodynamic monitoring of human patients), will be revealed, validated by expert clinicians, and used to support predictive models.
Creation of a prototype CDS system bedside GUI showing real-time probability of impending instability, lead time to event and determining factors (SA2)
Primary model development and initial simulation testing will be done using the initial UCI and UCLA databases, and then validated at the UCLA simulator centre. Then the UCLA and University of Pittsburgh datasets will serve as an external validation set for our predictive models in an iterative fashion (we will conduct external validation in years 3–5 of our project and will use area under the curve, sensitivity, specificity, positive predictive value and negative predictive value as well as precision recall to evaluate the performance of the model). Validation will proceed in stages: first to predict CRI, then to diagnose specific aetiologies, and then to guide resuscitation based on our explanatory analyses and CDS development. Thus, our primary goal is to create a robust and generalisable approach that will readily expand across other healthcare information systems. We will finalise the ML analysis of the prospectively acquired high-risk surgery patient cohort dataset in year 5. We anticipate creating a final prototype GUI incorporating both measures of present instability and our predictive model with descriptions as to the potential processes that will be causing instability. We will use clinician-driven iterative design to refine the GUI.
We will enlist 12 experts (6 nurse anaesthetists and 6 physician anaesthesiologists) to provide feedback. The high-fidelity simulator will be set to simulate the OR using the prototype GUIs we have developed and will develop. The feedback sessions will be conducted in two groups of six clinicians to maximise variety in the input while benefiting from group dynamics. We will seek feedback regarding GUI (1) completeness or redundancy of content, (2) ease of interpretation and (3) ergonomics. This feedback evaluation will be iterative, with a new generation of the GUI, which takes into consideration input from each group, to be rolled out sequentially. We anticipate 3–5 GUI iterations before subsequent larger group testing.
We will move to larger scale clinician-driven iterative design through evaluation of simulated clinical GUI use. In this stage, the HFHS will be set as an operating room and based on 10 scenarios. The patient’s monitors will live stream the data collected in C.1 for 10 selected patient cases; clinicians will also have access to the other deidentified case data such as medical history, laboratory and diagnostic test results, and current medications. Of the 10 simulated cases, eight will show evidence of instability across a 2-hour interval (two each for volume loss, haemodynamically unstable arrhythmia, postabdominal insufflation haemodynamic instability and postanaesthesia induction hypotension), while two will remain stable. Half the patients will be monitored according to current standard practice, while half will use the additional GUI. Each scenario will be run by one anaesthesiologist and subsequently by one nurse anaesthetist. The clinicians will document patient assessment and care directives. We will then evaluate each scenario for effectiveness based on accuracy of diagnosis, time to correct diagnosis, accuracy of intervention choices (based on predefined scenarios and proposed correct interventions based on expert development) and time to intervention. These scenarios will be repeated four times, using different clinicians (ie, total of 40 clinicians, half MD and half nurses). All clinicians will be debriefed to gain their feedback as in SimStage 1.
We will then conduct a midterm analysis of effectiveness results that we will share with our original team of experts from SimStage 1, and make iterative design adjustments to the GUI, the alerting system or the action algorithms based on SimStage 2. We will then proceed to SimStage 3, another simulated trial with the same simulated settings, scenarios and equipment, but 40 different volunteers (so as to avoid Hawthorne effect). We will collect the same information, and evaluate if the effectiveness indicators from SimStage 2 are improved on in SimStage 3. The final refined design represents the deliverable CDS prototype for this project, and will be used to estimate effect sizes for a future clinical trial.
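The threshold-based performance measures named above for external validation (sensitivity, specificity, positive and negative predictive value) follow directly from the confusion matrix. The sketch below shows the arithmetic on a small made-up set of labels; the function name and the example vectors are assumptions, and area under the curve and precision-recall analysis would need predicted probabilities rather than binary calls.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary labels (1 = CRI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up example: 3 CRI windows and 5 stable windows.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = confusion_metrics(y_true, y_pred)
```

Because CRI is described as relatively uncommon, positive predictive value can be low even when sensitivity and specificity look strong, which is one reason to report all four measures together.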
Patients and public involvement
The choice of the outcome was guided by recent large studies showing the relationship between intraoperative haemodynamic instability and postoperative outcome. Patients were not involved in the design of the study and will not be involved in the recruitment of study participants. The results of this study will be disseminated via publication in medical journals.
Implications and future directions
If one could accurately predict which patients will develop CRI during surgery, and when and why, then effective pre-emptive treatments could be given to improve postoperative outcome and more effectively use healthcare resources. But signs of shock often occur late, once organ injury is already present. The goal of this proposal is to develop, validate and test real-time intraoperative risk prediction tools based on EHR data and high-fidelity physiological waveforms to predict CRI and make the databases of intraoperative data and waveforms used for these developments freely accessible. This is extremely relevant because although 5.7 million Americans are admitted to an ICU in 1 year, more than 42 million undergo surgery annually. Previous and ongoing studies conducted in the ICU and in the step-down unit have built the architecture to collect real-time high-fidelity physiological waveform data streams and integrate them with patient demographics from the EHR to build large data sets, and derive actionable fused parameters based on ML analytics as well as display information in real time at the bedside to drive CDS in the critical care setting. The goal of this proposal is to apply these ML approaches to the complex and time compressed environment of high-risk surgery where greater patient and disease variability exist and a shorter period of time is available to deliver truly personalised medicine approaches.
Strengths and limitations
We will leverage our previous work and NIH/R01-funded projects in the SDU/ICU using similar methodologies to characterise CRI during surgery (R01-GM117622 and R01-NR013912). Our innovations are: (1) To impact surgical practice by better and earlier identifying patients at greatest risk for CRI and by providing point-of-care data-driven explanations and process-specific resuscitation using real-time data input and point-of-care management, potentially decreasing preventable surgical morbidity and mortality; (2) Being able to adjust for placement, level of monitoring needed, and pre-emptive therapies and response to therapy enabling personalised medicine. It will create a sensitive and specific means to predict which patients may or may not ever develop CRI, which has important implications for patient safety, surveillance, triage and care: needed frequency of monitoring, case load mixture, workload, staff allocation, patient triage to monitored or non-monitored units, higher cost versus lower cost bed allocation, prevention of adverse events; (3) Few innovations have been introduced to improve technological patient surveillance and management in decades. The modelling methods we propose to develop should drastically shift intraoperative paradigms and treatment protocols and they will be applicable to existing monitoring modalities; (4) Our ML analytics also have potential to identify new monitoring parameters to improve prediction of instability and, in our exploratory SA, to reverse engineer our understanding of disease aetiology during surgery and (5) More than 42 million Americans undergo surgery each year and even though the perioperative complication rate is low, the absolute numbers are large. Although we will focus on high-risk surgery patients, we will collect data on the whole surgical population such that we will also develop a novel database on low-risk patients. They are an opportunity cohort to explore shared risk to compare with the high-risk patients.
First, this proposal starts as a retrospective analysis of UCI data of patients with or without CRI. Limiting our variables to those available to us in retrospective analyses may limit the ultimate prediction compared with including more variables that we may find useful beyond this first pass. However, in our prospective UCLA and University of Pittsburgh Medical Center (UPMC) data collection interval, if specific parameters appear useful, we will prospectively use them in a non-protocol fashion in patients because we are those patients’ providers. Second, the modelling of variables may not allow for discrimination of CRI caused by specific diagnoses, rather than by pathophysiological processes. Still, this would be an improvement over existing bedside monitoring analysis. Since we treat these surgical patients and CRI is a relatively uncommon event, we will annotate patients’ records weekly as to CRI and specific aetiologies for retrospective analysis. Third, our medical centres are installing under our leadership high-density data collection systems (Bernoulli or Bedmaster) in all ORs. We have not used this data formatting/data synthesis platform before in the OR at UCLA and UPMC, but the system has been used at UCI and we have the expertise for its installation in SDU and ICU at UCLA and UPMC. We have planned for 6 months to reformat the data collection and secured query system as needed at UCLA and UPMC and budgeted for data collection and secured processing personnel for this support throughout the duration of the study. Based on the funding cycle criteria, our system could be installed at UCLA (07/2018) before this project becomes active. Fourth, we planned for Honest Broker recording efforts for EHR review similar to ontology analysis already being done for another funded protocol (R01 NR013912). Accordingly, we have planned for a 6-month lead-in to ensure that the Honest Broker is cognizant of the UCLA and UPMC EHR idiosyncrasies.
Finally, the CDS and GUI development will be inherently limited. We plan on only relying on board certified anaesthesiologists and certified nurse anaesthetists from UCLA for the iterative development of the CDS and GUI and this may limit the external validity of this system. Whether trainees could use this system appropriately would still have to be studied.
Ethics and dissemination
Once the investigation has been completed, we intend to publish the results in a peer-reviewed publication. We also intend to present the results of this work at professional conferences for both the anaesthesiology and computer science communities. In accordance with the recent proposal from the International Committee of Medical Journal Editors, patient-level data will be made available within 6 months after publication of the primary manuscript. Data will be provided to researchers who submit a methodologically sound research proposal including a protocol and statistical analysis plan. No patient identifying fields (including dates) will be included in the shared dataset. Age will be provided in years, unless the patient is older than 89 years. In this case, age will be reported as ‘>89 years’. Any dates will be presented as ‘number of days since index surgery’.
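The de-identification rules stated above (age top-coded above 89 years, dates expressed as days since index surgery) can be sketched as follows. The record layout and field names are hypothetical; only the two rules themselves come from the text.

```python
from datetime import date


def deidentify(record, index_surgery):
    """Apply the age top-coding and relative-date rules to one record."""
    age = record["age_years"]
    return {
        # Ages above 89 years are reported only as '>89 years'.
        "age": ">89 years" if age > 89 else str(age),
        # Dates are replaced by the number of days since index surgery.
        "days_since_index_surgery": (record["event_date"] - index_surgery).days,
    }


# Hypothetical record for illustration.
index_surgery = date(2019, 3, 1)
record = {"age_years": 93, "event_date": date(2019, 3, 5)}
shared = deidentify(record, index_surgery)
```

Events that precede the index surgery would produce negative day counts under this scheme, which keeps their ordering without revealing calendar dates.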
Contributors MC: draft the manuscript and final approval of the manuscript. IH: draft the manuscript and final approval of the manuscript. JR: draft the manuscript and final approval of the manuscript. PB: draft the manuscript and final approval of the manuscript. CL: draft the manuscript and final approval of the manuscript. KS: draft the manuscript and final approval of the manuscript. AD: draft the manuscript and final approval of the manuscript. MRP: draft the manuscript and final approval of the manuscript.
Funding This work is supported by NIH R01 HL144692.
Competing interests MC is a consultant for Edwards Lifesciences and Masimo, and has funded research from Edwards Lifesciences and Masimo. He is also the founder of Sironis, and owns patents and receives royalties for closed loop haemodynamic management that have been licensed to Edwards Lifesciences. MC's department receives funding from the NIH (R01GM117622; R01 NR013012; U54HL119893; 1R01HL144692). JR is the founder of Sironis, and owns patents and receives royalties for closed loop haemodynamic management that have been licensed to Edwards Lifesciences. CL is an engineer at Edwards Lifesciences.
Patient and public involvement statement Patients/public were not involved in the design of this study.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.