Original research
Using machine learning to identify quality-of-care predictors for emergency caesarean sections: a retrospective cohort study
  1. Betina Ristorp Andersen1,
  2. Ida Ammitzbøll1,
  3. Jesper Hinrich2,
  4. Sune Lehmann2,
  5. Charlotte Vibeke Ringsted3,
  6. Ellen Christine Leth Løkkegaard1,
  7. Martin G Tolsgaard4
  1. 1Department of Gynecology and Obstetrics, Nordsjællands Hospital & Department of Clinical Medicine, University of Copenhagen, Hillerod, Capital Region, Denmark
  2. 2Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
  3. 3Faculty of Health, Aarhus Universitet, Aarhus, Denmark
  4. 4Copenhagen Academy of Medical Education and Simulation, Rigshospitalet, Kobenhavn, Capital Region, Denmark
  1. Correspondence to Professor Ellen Christine Leth Løkkegaard; Ellen.Christine.Leth.Loekkegaard{at}

Abstract

Objectives Emergency caesarean sections (ECS) are time-sensitive procedures. Multiple factors may affect team efficiency but their relative importance remains unknown. This study aimed to identify the most important predictors contributing to quality of care during ECS in terms of the arrival-to-delivery interval.

Design A retrospective cohort study. ECS were classified by urgency using emergency categories one/two and three (delivery within 30 and 60 min). In total, 92 predictor variables were included in the analysis and grouped as follows: ‘Maternal objective’, ‘Maternal psychological’, ‘Fetal factors’, ‘ECS Indication’, ‘Emergency category’, ‘Type of anaesthesia’, ‘Team member qualifications and experience’ and ‘Procedural’. Data was analysed with a linear regression model using elastic net regularisation and a jackknife technique to improve generalisability. The relative influence of each predictor, expressed as the percentage significant predictor weight (PSPW), was calculated to visualise the main determinants of the arrival-to-delivery interval.

Setting and participants Patient records for mothers undergoing ECS between 2010 and 2017, Nordsjællands Hospital, Capital Region of Denmark.

Primary outcome measures Arrival-to-delivery interval during ECS.

Results Data was obtained from 2409 patient records for women undergoing ECS. The group of predictors representing ‘Team member qualifications and experience’ was the most important predictor of arrival-to-delivery interval in all ECS emergency categories (PSPW 25.9% for ECS category one/two; PSPW 35.5% for ECS category three). In ECS category one/two the ‘Indication for ECS’ was the second most important predictor group (PSPW 24.9%). In ECS category three, the second most important predictor group was ‘Maternal objective predictors’ (PSPW 24.2%).

Conclusion This study provides empirical evidence for the importance of team member qualifications and experience relative to other predictors of arrival-to-delivery during ECS. Machine learning provides a promising method for expanding our current knowledge about the relative importance of different factors in predicting outcomes of complex obstetric events.

  • maternal medicine
  • fetal medicine
  • adult surgery

Data availability statement

No data are available.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • The application of a penalised and bias-corrected machine-learning approach, which is advantageous in explorative studies for identifying patterns in large and complex data sets.

  • The statistical approach included penalised regularisation that imposes a penalty on the model by shrinking the contribution of extreme and less important predictors.

  • Given the exploratory nature of this approach and that we only considered caesarean sections in one hospital, our findings need to be reproduced to determine which are context-specific and which are generalisable.

  • Category one and two emergency caesarean sections were analysed together in order to compare our results to the international literature.

Introduction

In Western Europe and North America, the safety of childbirth is generally high.1 Maternal and fetal mortality during childbirth is infrequent,1 leading to an increased focus on the prevention of maternal and fetal morbidity by improving quality of care.2 In developed countries, emergency caesarean sections (ECS) are associated with up to five times increased risk of maternal morbidity compared with vaginal delivery.3

An ECS is a time-sensitive, unplanned surgical procedure often performed in pregnancies in which vaginal delivery was first planned or attempted. To increase safety for the mother and the unborn child, monitoring the decision-to-delivery interval (DDI) has become an important part of obstetric audit.4 Key factors influencing DDI suggested by prior studies include clinical urgency category,5–7 type of anaesthesia used,7 8 cervical dilatation,7 previous delivery by caesarean section,6 maternal obesity9 and whether the ECS was performed during the day or nighttime.6 7 Additionally, communication skills between team members,5 seniority of the surgeon6 7 and midwifery staff level8 have been found to correlate with quality of care.

Each study has provided information about single predictors of quality of care in ECS. However, to our knowledge, no studies have compared and determined the relative importance of these predictors in determining critical ECS outcomes. Advances in machine learning and statistics, such as the use of penalised regression models, are particularly well suited to analysing large and complex data sets in order to find patterns and make predictions that traditional statistical techniques often misrepresent.10

DDI is an internationally recognised quality indicator that represents a series of events during four time intervals: Interval I, from the decision by the obstetrician to perform ECS until transfer to the operating room; Interval II, from patient arrival in the operating room until induction of anaesthesia; Interval III, from anaesthesia induction to surgical incision; and Interval IV, from surgical incision to delivery of the infant.11 Intervals II–IV (arrival-to-delivery) are characterised by significant complexity due to the multidisciplinary team, which includes anaesthesia staff and scrub nurses, in addition to obstetricians.

The aim of our study was to identify the most important factors contributing to quality of care during ECS based on assessments of the arrival-to-delivery (AD) interval using a machine learning approach.

Material and methods

Data from 2892 patient records of mothers undergoing ECS between 1 January 2010 and 16 March 2017 at the Department of Gynecology and Obstetrics, Nordsjællands Hospital, Denmark, were included in a retrospective cohort study.

Caesarean sections were categorised based on urgency into four categories12: ‘Immediate threat to the life of the woman or the fetus (category one)’; ‘Maternal or fetal compromise not immediately life-threatening (category two)’; ‘Needing early delivery but no maternal or fetal compromise (category three)’; ‘Delivery at a time suitable to the patient and maternity team (category four)’.12 Our study only included category one, two and three ECS, in which urgency and DDI are considered of importance.

The ECS team aims to deliver the baby as quickly as possible without subjecting the mother to unnecessary risk. In Denmark, the standard goal is to deliver the baby within 15 min (following the decision by the obstetrician to pursue ECS) for category one and 30 min for category two ECS.13 In the USA and Great Britain, the recommended DDI is 30 min.11 13–15 To compare our results to the international standard, we analysed data from categories one and two together.

Predictor variables

Demographic data collected from patient records included maternal age, gestational age at delivery and parity.

Predictor variables were selected based on a literature review5–9 11 13 16–21 (online supplemental table 1). We searched the MEDLINE database and Google Scholar for publications describing quality of care predictors for ECS using terms such as ‘quality-of-care’, ‘Emergency-Caesarean-sections’, ‘decision-to-delivery’, ‘Anaesthesia’ and ‘Obesity’.5–9 11 13 16–21 We hand searched references in key publications to help inform the content and selection of predictor variables in our study.

Predictors were grouped under the following headings: ‘Maternal objective predictors’, N=21; ‘Maternal psychological’ predictors, N=4; ‘Fetal predictors’, N=7; ‘Indication for ECS’, N=11; ‘Emergency category of ECS’, N=5; ‘Type of anaesthesia’, N=5; and ‘Procedural predictors’ N=2. Team members responsible for patient care were identified by name and title. The group ‘Team member qualifications and experience’ included educational characteristics extracted from the Danish Health Authorities and years of employment in the department. The qualifications for the individual team member were updated yearly. For this data set, a physician in his or her first year of training was identified as a resident, whereas a trainee in the second to fifth year of training was identified as a senior resident. A designation of obstetrician consultant indicated a physician who is engaged full-time in caring for obstetric patients, whereas a gynaecological consultant referred to a physician who is responsible for outpatient gynaecologic patients and surgical procedures during the day but takes the nighttime call on labour and delivery.

To assess the importance of experience, we registered the number of ECS performed by a team member during the year before the ECS study case. Additionally, the presence of an extended team was included in the group ‘Team member qualifications and experience’, N=37 (online supplemental table 1).
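The experience measure described above can be illustrated with a minimal sketch. It assumes that ‘the year before’ means the 365 days preceding the index ECS and that the index case itself is not counted; the source does not specify either detail. The function name and the example dates are hypothetical.

```python
from datetime import date, timedelta

def prior_year_count(case_dates, index_date, window_days=365):
    """Count a team member's earlier ECS within window_days before index_date.

    Assumption: the index case itself is excluded from the count.
    """
    start = index_date - timedelta(days=window_days)
    return sum(1 for d in case_dates if start <= d < index_date)

# Hypothetical log of ECS dates for one team member (not study data)
log = [date(2015, 1, 10), date(2015, 6, 2), date(2015, 11, 20), date(2016, 3, 1)]

# Experience at the time of the ECS on 1 March 2016
experience = prior_year_count(log, date(2016, 3, 1))
print(experience)  # 2: only the June and November 2015 cases fall in the window
```

Recomputing the count for every index case, rather than once per calendar year, keeps the measure aligned with the team member's experience at the time of each operation.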

Primary outcome

The AD interval represents DDI II–IV. We chose AD as our primary outcome to focus on aspects of care that can be improved in an OR setting. This outcome variable excludes DDI Interval I, which represents the time from the decision to perform ECS until transfer to the operating room (labour ward and transfer).11 Local hospitals vary considerably in terms of standard protocols for Interval I. Moreover, the complex multidisciplinary operating room teams do not play a role in Interval I. The AD interval was measured in minutes. Intervals above 90 min were excluded from analysis (N=64) as values above this threshold were likely to be recording errors.

Statistical methods

We were interested in identifying the most important predictor variables out of a large set of 92 factors related to each ECS and the associated operating room team.

Due to the large number of predictor variables relative to the total number of ECS, we applied a machine learning method with predictor selection. Specifically, we used linear regression with elastic net regularisation22 23 and split the data 80–20 into development and test observations, the latter intended to simulate newly acquired observations. Estimated model parameters were based solely on the development set observations. This set was analysed using leave-one-out cross-validation/jackknife sampling24 to estimate predictor variable importance. For each jackknife sample, fivefold cross-validation25 was used to estimate the optimal Elastic Net parameter. See Statistical Appendix (SA) for full details.
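As a rough illustration of this pipeline, the sketch below fits an elastic net by coordinate descent and refits it on every leave-one-out (jackknife) sample of a development set. It is not the authors' implementation. The penalty parametrisation (lam, alpha), the fixed penalty values and the synthetic data are assumptions made for the example, and the inner fivefold cross-validation used to tune the penalty is omitted for brevity.

```python
import math
import random

def soft_threshold(z, g):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    No intercept is fitted, so inputs are assumed to be roughly centred."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            rho = sum(c * (r + c * b[j]) for c, r in zip(col, resid)) / n
            z = sum(c * c for c in col) / n
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            delta = new - b[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                b[j] = new
    return b

def jackknife(X, y, lam, alpha):
    """Refit with each observation left out once.
    Returns the mean and the jackknife standard error of every weight."""
    n, p = len(X), len(X[0])
    fits = [elastic_net(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam, alpha)
            for i in range(n)]
    means = [sum(f[j] for f in fits) / n for j in range(p)]
    ses = [math.sqrt((n - 1) / n * sum((f[j] - means[j]) ** 2 for f in fits))
           for j in range(p)]
    return means, ses

# Synthetic data (not study data): only the first predictor affects the outcome
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(50)]
y = [2.0 * row[0] + random.gauss(0, 0.1) for row in X]

# 80-20 split into development and test observations
cut = int(0.8 * len(X))
X_dev, y_dev = X[:cut], y[:cut]
X_test, y_test = X[cut:], y[cut:]

means, ses = jackknife(X_dev, y_dev, lam=0.1, alpha=0.5)
print(means, ses)
```

The jackknife refits give a spread for each weight, which is one way to judge whether a predictor's weight is stable enough to count as important; the criterion actually used in the study is described in the SA.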

We estimated the importance of predictor variables using percentage significant predictor weight (PSPW).26 The PSPW is simply a linear scaling of the significant predictor weights estimated by the model, such that the sum of all absolute PSPW values is 100. Larger absolute PSPW implies greater influence.
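The scaling can be written out in a few lines. This is a minimal sketch: the predictor names and weights are invented for illustration, and the significance flags are supplied as an input because the rule for deciding significance is described in the SA rather than here.

```python
def pspw(weights, significant):
    """Scale the significant weights so that their absolute values sum to 100.

    Non-significant predictors receive a PSPW of 0 (assumption for this sketch).
    """
    kept = {k: w for k, w in weights.items() if significant.get(k, False)}
    total = sum(abs(w) for w in kept.values())
    if total == 0:
        return {k: 0.0 for k in weights}
    return {k: (100.0 * kept[k] / total if k in kept else 0.0) for k in weights}

# Hypothetical model weights in minutes (not study results)
weights = {
    "surgeon1_consultant": -1.5,
    "placenta_previa": 2.5,
    "bmi": 0.2,
    "change_of_anaesthesia": 1.0,
}
significant = {
    "surgeon1_consultant": True,
    "placenta_previa": True,
    "bmi": False,
    "change_of_anaesthesia": True,
}

scores = pspw(weights, significant)
print(scores)
# surgeon1_consultant: -30.0, placenta_previa: 50.0, bmi: 0.0, change_of_anaesthesia: 20.0
```

The sign of each PSPW is preserved, so a negative value marks a predictor associated with a shorter interval and a positive value one associated with a longer interval.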

Patient and public involvement

The research question explored in this study is highly relevant to all future mothers undergoing ECS. The study provides insight into the most important predictors of quality of care and could inform future quality improvement measures. The study is based on prior studies of patient-perceived quality of care in the simulated and clinical setting.26 27 Patients were not involved in the establishment of the present retrospective cohort study.

Results

The study included data from 2892 patient records. Sixty-four ECSs were excluded because of an AD interval >90 min, 150 ECSs were discarded due to data inconsistency (zero or negative AD interval, impossible or clearly erroneous predictor variable combinations, no category label) and 135 due to multiple pregnancy (which could influence analysis in a non-obvious way). Of the remaining 2543 ECS, 134 were categorised as category four (planned caesarean section) and therefore withdrawn from further analysis. Of the remaining 2409 ECSs, 165 fell into category one, 1022 in category two and 1222 in category three.
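The exclusion flow can be checked with simple arithmetic. The short sketch below uses only the counts reported in the paragraph above; the variable names are illustrative.

```python
total_records = 2892

# Exclusions applied before categorisation
excluded = {
    "AD interval > 90 min": 64,
    "data inconsistency": 150,
    "multiple pregnancy": 135,
}
remaining = total_records - sum(excluded.values())

# Category four (planned caesarean section) was withdrawn from analysis
category_four = 134
analysed = remaining - category_four

# Analysed ECS by emergency category
by_category = {"one": 165, "two": 1022, "three": 1222}

print(remaining)                  # 2543
print(analysed)                   # 2409
print(sum(by_category.values()))  # 2409
```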

Descriptive data for included mothers (maternal age, body mass index (BMI), parity, prior caesarean section (CS), gestational age and fetal weight) are shown in table 1.

Table 1

Descriptive data of characteristics by emergency caesarean section category

All ECSs were performed by at least two surgeons; however, there were 709 ECSs with no name registered for ‘surgeon 2’, for whom the competency level could not be determined. The pairwise surgeons’ competencies and the corresponding AD interval are illustrated in table 2. In ECS category one/two, 46% were performed by an attending gynaecologist or obstetrician as surgeon 1 and 54% were performed by a resident. In category three, 42.3% were performed by an attending and 57.7% were performed by a resident.

Table 2

Surgeon pairs and arrival-to-delivery interval in emergency category one/two and three

The ECSs were analysed in four groups: only category one, only category two, only category three and category one/two. The estimated Elastic Net models outperformed the null models and captured patterns in the data not accounted for by the baseline (median outcome on the development set) or random variation (permutation testing) in all four groups. In the majority of the ECS groups (3 out of 4), the Elastic Net models predicted the test observations more accurately (R² >0) than did their true (but unknown) mean outcome (R²=0). This means that the patterns captured by the model can be generalised to unseen data examples (new ECS), as simulated by holding out 20% of the ECS for testing.

When modelling only ECS category one, the model did not outperform the true mean outcome. This group had the lowest number of observations, that is, N=140 for development and N=25 for the test set.

The test set performance for ECS category two was only slightly improved (root mean square error (RMSE)=6.01, R²=0.05), and this group was not analysed further. The performance in ECS emergency category three was RMSE=9.05, R²=0.11, and in the combined ECS emergency category one/two RMSE=5.61, R²=0.24.
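To make the two reported metrics concrete, the following sketch computes RMSE and an R² measured against the mean of the test observations, so that predicting that mean for every case gives exactly R²=0. The AD intervals and predictions are invented for illustration and are not study data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the outcome (minutes here)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_vs_test_mean(y_true, y_pred):
    """1 - SSE/SST, where SST uses the mean of the test observations themselves."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst

# Hypothetical AD intervals (min) for five held-out ECS
y_test = [20.0, 25.0, 30.0, 35.0, 40.0]
model_pred = [22.0, 24.0, 31.0, 33.0, 38.0]
mean_pred = [30.0] * len(y_test)  # always predicting the test-set mean

print(rmse(y_test, model_pred))             # about 1.67
print(r2_vs_test_mean(y_test, model_pred))  # about 0.944
print(r2_vs_test_mean(y_test, mean_pred))   # 0.0
```

Because the reference is the true test mean, which a real model cannot know in advance, any R² above zero indicates that the model carries information that generalises to held-out cases.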

Prediction results

The most important group of predictors associated with AD interval was ‘Team member qualifications and experience’ for all emergency categories (PSPW 25.9% for category one/two and PSPW 35.5% for category three). The second most important predictor group was ‘Indication for ECS’ in category one/two ECS (PSPW 24.9%) and ‘Maternal Objective predictors’ in category three ECS (PSPW 24.2%). The relative importance measured by PSPW is shown in figures 1 and 2 for ECS categories one/two and three, respectively.

Figure 1

Percentage significant predictor weight (PSPW) for emergency category one/two. Arrival-to-delivery interval PSPW for all predictor groups in emergency caesarean section category one/two: ‘Team member qualifications and experience’=25.9%, ‘Indication for CS’=24.9%, ‘Maternal objective’=14%, ‘Type of anaesthesia’=12.3%, ‘CS emergency category’=8.7%, procedural predictors=7.1%, ‘Maternal psychological’=4.5%, ‘Fetal’ predictors=2.6%. CS, caesarean section.

Figure 2

Percentage significant predictor weight (PSPW) for emergency category three. Arrival-to-delivery interval PSPW for all predictor groups in emergency caesarean section category three: ‘Team member qualifications and experience’=35.5%, ‘Maternal objective’=24.2%, ‘Type of anaesthesia’=9.6%, ‘Fetal’ predictors=9.1%, procedural predictors=7.3%, ‘Indication for CS’=7.3%, ‘Maternal psychological’=6.9%, emergency category=0%. CS, caesarean section.

Individual predictor weights are shown in online supplemental table 2.

Surgical and delivery difficulties were strong predictors of longer AD intervals in both ECS category one/two and three.

In emergency category one/two, the level of seniority of Surgeon 2 was a strong predictor of AD interval. Having a gynaecologist or obstetrician as Surgeon 2 predicted shorter AD, whereas having a trainee as the assistant predicted longer AD. In ECS category three, the level of seniority of Surgeon 1 was the strongest predictor. Having a gynaecologist or obstetrician as Surgeon 1 predicted shorter AD, whereas having a trainee as Surgeon 1 predicted longer AD. Also, the presence of three or more surgeons was a strong predictor of longer AD.

Regarding the team members who were not surgeons, having two or more members of a specialty present (that is, anaesthesia nurse, anaesthesiologist or circulating nurse) corresponded to a higher percentage significant predictor weight (PSPW) than did their seniority; that is, the number of non-surgeon team members was more important than their experience levels.

‘Last year CS-experience’ corresponded with relatively low PSPWs in general (<1.0%); the highest weights were for Surgeon 1 and the anaesthesiologist, both associated with shorter AD intervals.

Indications for CS related to high emergency threats, that is, placental abruption, fetal stress, uterine rupture and cord prolapse, were important predictors of a shorter AD interval in ECS category one/two, whereas maternal request/exhaustion and planned elective CS predicted longer AD intervals in all ECS categories.

Placenta previa was a strong predictor of prolonged AD interval in ECS category three (PSPW 6.4%) but had less importance in ECS category one/two (PSPW 1.1%). BMI had low importance in all emergency categories, whereas prior CS predicted prolonged AD intervals in ECS category three.

‘Fetal predictors’, such as fetal weight or descent of the fetal head, had little importance in ECS with high emergency pressure, whereas in ECS category three, fetal descent to the pelvic floor was associated with a longer AD interval (PSPW 4.2%).

The category/group ‘Type of anaesthesia’ was associated with high PSPW. Epidural anaesthesia was associated with a shorter AD interval in all ECS emergency categories. Change of anaesthesia, that is, from epidural or spinal to general anaesthesia, predicted a longer AD interval.

Discussion

In this study, we explored the association between numerous factors and the AD interval in ECS. Our use of machine learning made it possible to analyse many predictors simultaneously and, in turn, to identify those that were of greatest relative importance.

Our model demonstrated that ‘Team member qualifications and experience’ was by far the most important group of predictors associated with AD interval in all ECS emergency categories (one/two and three). ‘Maternal objective predictors’ and ‘Type of anaesthesia’ also emerged as critical factors in all ECS categories, whereas ‘Procedural predictors’ appeared to be a less important predictor (medium weight). In ECS category one/two, ‘Indication for ECS’ demonstrated high predictor weight, whereas ‘Fetal predictors’ was a low weight factor. By contrast, in emergency category three, the group of ‘Fetal predictors’ was shown to be of increased importance.

A strength of our study was the use of a machine-learning approach to analyse a large and complex data set. A penalised regression approach in the statistical analysis helped identify the most important associations within a large set of predictor variables (online supplemental appendix 1). However, this approach raises concerns, including correlated features and overfitting. To mitigate the effect of correlated features, we used jackknife bias-correction, and to avoid overfitting we used cross-validation. This method has been described and used in a previous exploratory study,26 in which the primary objective was to navigate and explore a large and complex data set. Given the exploratory nature of this approach, our findings need to be reproduced to determine which are context-specific and which are generalisable.

In this study, the outcome was the AD interval, which includes three smaller intervals (DDI II: arrival in the operating room until induction of anaesthesia; DDI III: induction of anaesthesia to surgical incision; DDI IV: surgical incision to delivery of the infant).11 Our data did not allow subanalyses of DDI II, DDI III or DDI IV, as any effects may be dwarfed by variability and lack of power.

The importance of individual team member’s qualifications in healthcare has been demonstrated in prior studies,27–30 that is, availability of a trained midwife, having obstetricians and anaesthetists on-site near the labour ward5 8 and experience of the surgeon.6 In our study, we included the entire multidisciplinary team of surgeons, anaesthetists, nurses and midwives. Although the surgeons contributed most to the model, our inclusion of all professionals on the team expands on existing knowledge and emphasises that all team members have an impact on AD intervals during ECS.

In our clinical context, only 5% of the included ECS were performed solely by obstetrics/gynaecology specialists. Former studies report longer operating time if a resident participates in an operation.31–33 However, in one study focusing exclusively on caesarean sections, resident participation had little impact on OR-time, and resident participation was recommended even in difficult ECS.31 In our study with extensive resident involvement in ECS, we did not find that this factor predicted a longer AD interval. Although the study was not designed to directly assess the impact of resident involvement, our findings did not suggest any detrimental effects of resident participation in ECS.

For each professional, we measured individual experience as the number of ECS in which they had participated during the preceding year. Measured this way, individual experience had little impact on AD in our statistical model across all ECS categories. This finding is somewhat surprising given that surgeons are thought to continuously improve their skills through repeated and deliberate practice.34–37 Our study focused exclusively on emergency CS. Hence, the variability in the context and composition of teams may outweigh factors related to variations in individual skill levels.

Procedural predictors, such as surgical and delivery difficulties, had high predictor weight in all emergency categories and were associated with longer AD intervals. Surgical difficulties encompass a woman’s previous history and anatomy, including previous abdominal or pelvic surgeries, intraabdominal adhesions and prior CS.38 39 Delivery difficulties include situations such as an impacted fetal head requiring an assistant to push the head of the child upwards before delivery or a transition to breech delivery.40 The reasons for surgical and delivery difficulties include a combination of individual competences, team competences and maternal characteristics.5 38–41 Our study adds to existing knowledge by demonstrating the relatively high importance of delivery and surgical difficulties compared with other predictors for AD.

In our study, ECS indication was an important predictor in category one/two ECS. Indications such as maternal request and planned CS were associated with a longer AD interval, and more urgent indications for ECS, such as ‘placental abruption’, ‘fetal stress’, ‘rupture of a uterus scar’ or ‘cord prolapse’, were associated with a shorter AD interval. Other studies have reported similar associations.7 42 However, our results add to existing knowledge by demonstrating their relative importance as compared with maternal objective predictors and the applied type of anaesthesia.

In the group of maternal predictors, we found a strong association of prior caesarean section and placenta previa with a longer AD interval. Our finding of a prolonged AD interval in cases of prior CS is supported by other studies.43 Placenta previa is associated with adverse maternal outcomes, such as excessive bleeding and postpartum hysterectomy.44 Consequently, in low-urgency ECS, the team will choose to prepare for excessive bleeding before incision, resulting in a prolonged AD interval. Interestingly, BMI and maternal age demonstrated little importance in our model. Other studies have found a prolonged decision-to-delivery interval in association with maternal age44 and obesity.17 44–46 Local guidelines for prophylactic epidural analgesia during delivery in obese women may have influenced our findings.

Clinical implications

The findings in our study provide new information about the importance of team member qualifications and experience in a statistical model, which enabled us to include many predictor variables representing different aspects of ECS. Although we included multiple known predictors, such as maternal characteristics and psychological predictors, fetal predictors and anaesthesia-related predictors, the group of predictors including team member qualifications and experience was found to be of greater importance than we had anticipated. Our findings strengthen the importance of team competencies during a complex/multifaceted event, such as an emergency caesarean section. Training of individual competencies and/or team-performance should have a higher priority when the aim is to reduce arrival-to-delivery interval. Resources should be prioritised for skills-training and team-training to ensure low AD interval, whenever possible. Finally, using statistical models such as those developed in our study for providing feedback to the ad-hoc team during and after managing an ECS may provide valuable real-time information. Such feedback can be used to guide and improve future performances of ad-hoc surgical teams during clinical practice or as input for team training interventions in a simulated setting.

Research implications

Our use of machine learning to analyse the data provided the opportunity to uncover patterns in a complex obstetric data set. Our approach was intended to provide a hierarchy of variables that were most relevant to the outcome. Even though machine learning is used in several areas of healthcare, it is often criticised for being a ‘black box’ approach and the results are often met with scepticism.47 In our study, we sought transparency in several ways. First, the predictors included in our statistical model were based on prior studies, which had demonstrated associations using traditional statistics. Moreover, we only included context-relevant predictors, carefully evaluated by a panel of expert obstetricians, to ensure that all analysed predictors were meaningful to AD. Second, we analysed single predictors instead of pooling the predictors, which also facilitated transparency of the model findings. Third, after the models provided information about important predictors, specialists within our research group (anaesthetists, obstetricians, gynaecologists) performed an additional check to ensure plausibility. Finally, to ensure transparency and provide the opportunity to verify our results, we have included a statistical appendix (online supplemental appendix 1).

Conclusion

This study provides empirical evidence for the importance of team member qualifications and experience relative to other predictors affecting AD during ECS. Machine learning-based analyses provide a promising method for expanding our current knowledge about the relative importance of the different predictors of complex and multifaceted obstetric events. Moreover, this methodology may help gain insight into other aspects of obstetric care quality.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the Danish Patient Safety Authority (ID# 3-3013-1732/1). Data was collected from patient records.


We thank Jesper Friis Petersen for setting up the database.


Supplementary materials


  • Contributors BRA contributed to the conception and design of the work, and to data acquisition and interpretation, data analysis and drafted the paper. IA contributed to data acquisition and interpretation and assisted in drafting the paper. MGT and ECLL contributed to the conception and design of the work and to data acquisition and interpretation, data analysis and assisted in drafting the paper. JH and SL contributed to data analysis and interpretation and assisted in the drafting of the paper. CVR contributed to the conception and design of the work and assisted in drafting the paper. All authors contributed to the critical revision of the paper and approved the final manuscript for publication. All authors have agreed to be accountable for all aspects of the work. BRA acts as guarantor of the study.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.