Original research
Using cluster analysis to describe phenotypical heterogeneity in extremely preterm infants: a retrospective whole-population study
  1. Theodore Dassios1,2,
  2. Emma E Williams1,
  3. Christopher Harris1,2,
  4. Anne Greenough1,3
  1. 1Department of Women and Children's Health, School of Life Sciences, Faculty of Life Science and Medicine, King’s College London, London, UK
  2. 2Neonatal Intensive Care Centre, King’s College Hospital NHS Foundation Trust, London, UK
  3. 3Asthma UK Centre for Allergic Mechanisms in Asthma, King’s College London, London, UK
  1. Correspondence to Dr Theodore Dassios; theodore.dassios{at}kcl.ac.uk

Abstract

Objective To use cluster analysis to identify discrete phenotypic groups of extremely preterm infants.

Design Secondary analysis of a retrospective whole population study.

Setting All neonatal units in England between 2014 and 2019.

Participants Infants live-born at less than 28 weeks of gestation and admitted to a neonatal unit.

Interventions K-means cluster analysis was performed with the gestational age, Apgar score at 5 min and duration of mechanical ventilation as input variables.

Primary and secondary outcome measures Bronchopulmonary dysplasia, discharge on home oxygen, intraventricular haemorrhage, death before discharge from neonatal care.

Results Ten thousand one hundred and ninety-seven infants (53% male) were classified into four clusters: Cluster 1 contained infants with intermediate gestation and duration of ventilation and had an intermediate mortality and incidence of bronchopulmonary dysplasia. Cluster 2 contained infants with the highest gestation, a shorter duration of ventilation and the lowest mortality. Cluster 3 contained infants with the lowest Apgar score and highest mortality and incidence of intraventricular haemorrhage. Cluster 4 contained infants with the lowest gestation, longest duration of ventilation and highest incidence of bronchopulmonary dysplasia.

Conclusion Clinical parameters can classify extremely preterm infants into discrete phenotypic groups with differing subsequent neonatal outcomes.

  • neonatal intensive & critical care
  • neonatology
  • respiratory physiology

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • First study to use cluster analysis to classify extremely preterm infants into different phenotypical groups.

  • We used readily available clinical parameters to classify extremely preterm infants into distinct phenotypic groups.

  • The ensuing groups had differing neonatal outcomes such as survival to discharge from neonatal care and bronchopulmonary dysplasia.

  • We used the whole population rather than a representative sample, making our results more generalisable.

  • Our cluster analysis is not an early prediction study, as the input variables included the duration of ventilation, which may be prolonged and only becomes known later in the life of an extremely preterm infant.

Introduction

Preterm birth is a major cause of childhood morbidity and mortality with a global incidence of approximately 10.6%.1 Infants born at less than 28 completed weeks of gestation (extremely preterm) suffer significant multisystem morbidity lasting into adolescence and adulthood,2 3 with important financial implications for health systems4 and an increased physical, emotional and financial burden for the families.5 The survival of these infants has increased over the past decades6 and the associated implications for patients and families are only expected to increase in the future.

Although many extremely preterm infants suffer significant perinatal and long-term morbidity,7 there is substantial variation in the severity of the neonatal outcomes with a sizeable proportion escaping major morbidity. For example, we have recently reported the overall mortality and incidence of bronchopulmonary dysplasia (BPD) of extremely preterm infants in England to be 18.9% and 57.5%, respectively, which means that 81.1% of those infants survived to discharge and 23.6% survived free of BPD.8 Although the multifactorial origin of adverse neonatal outcomes has been well described,9 the heterogeneity in neonatal outcomes is poorly understood.

Unsupervised learning based on big-data analytics has been used in the clinical domain for the prediction of individual risk factors and for clinical decision support.10 Unsupervised learning could also be helpful to identify different phenotypic groups (clusters) corresponding to distinct clinical phenotypes of extremely preterm infants. Cluster analysis as a method of detecting subgroups within multidimensional data sets could also be helpful in identifying subgroups of patients that might benefit from targeted interventions.11 12 To our knowledge, such unsupervised learning approaches have not been previously employed to describe different phenotypical groups in extreme prematurity.

We aimed to derive and describe clusters of extremely preterm infants based on readily available clinical information and to explore whether these distinct phenotypes were associated with different neonatal outcomes.

Materials and methods

Study design and participants

All infants live-born before 28 completed weeks of gestation and admitted to a neonatal unit in England between 1 January 2014 and 1 January 2019 (study period) were included. This was an analysis of data previously acquired to investigate the relationship of growth impairment with the development of BPD in a retrospective, whole-population study.8 A predefined data set was acquired from the National Neonatal Research Database (NNRD), Imperial College London, UK. As the study used data held in an existing database, participation did not require approval from individual Trusts, but only from the NHS Trust holding the database (Chelsea and Westminster NHS Foundation Trust) which was obtained.

The following variables were collected: maternal age (years), administration of antenatal steroids (yes/no), gestational age at birth calculated from the last menstrual period or ultrasonographically (weeks), birth weight (kg), Apgar score at 5 min of age, sex (male/female), duration of invasive ventilation (days), BPD development defined as any need for respiratory support at 36 weeks postmenstrual age (PMA),13 14 administration of postnatal corticosteroids (defined as parenteral administration of dexamethasone or hydrocortisone for more than 5 consecutive days—yes/no), surgical intervention for necrotising enterocolitis (yes/no),15 surgical ligation of patent ductus arteriosus (yes/no),16 intraventricular haemorrhage (IVH) grade 3–4 (yes/no),17 periventricular leucomalacia (yes/no),18 death before discharge from neonatal care (yes/no), age at death (days), weight at 36 weeks PMA (kg), PMA at discharge (weeks), weight at discharge (kg), discharged on home oxygen (yes/no). We calculated the birth weight z-score (ΔWz) using the UK-WHO preterm reference chart19 and the Microsoft Excel add-in LMS Growth (V.2.77; www.healthforchildren.co.uk). The birth weight z-score was not calculated for infants born <23 completed weeks of gestation, as there were no reference data.
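
Growth references built with the LMS method convert a measurement into a z-score from three reference parameters: L (Box-Cox power), M (median) and S (coefficient of variation). The sketch below illustrates that conversion in Python. It is not the LMS Growth add-in, and the function name `lms_zscore` and the L, M and S values are hypothetical placeholders rather than values from the UK-WHO preterm chart.

```python
import math


def lms_zscore(measurement: float, L: float, M: float, S: float) -> float:
    """Convert a measurement to a z-score with the LMS (Box-Cox) formula."""
    if measurement <= 0 or M <= 0 or S <= 0:
        raise ValueError("measurement, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting form of the formula as L approaches zero
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1.0) / (L * S)


# Hypothetical reference parameters for one gestational age and sex
# (illustrative only, NOT taken from the UK-WHO chart)
example_L, example_M, example_S = 1.0, 0.80, 0.15

z = lms_zscore(0.68, example_L, example_M, example_S)
print(f"birth weight z-score: {z:.2f}")
```

A measurement equal to the reference median M gives a z-score of zero, whatever the values of L and S.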

PPI statement

This is a retrospective study, so patients and public could not be involved.

Analysis

Cluster analysis was performed with the smallest possible number of continuous variables necessary to adequately characterise neonatal outcomes, since inclusion of a large number of variables might degrade the final classification.11 20 K-means cluster analysis was performed with gestational age, Apgar score at 5 min and duration of mechanical ventilation as input variables. These parameters were selected because they were independent, contributed to neonatal outcomes and were pathophysiologically distinct from one another. All input variables were standardised as z-scores prior to clustering. To select the number of clusters, clustering was performed for solutions comprising 2 to 10 clusters and the optimal solution was selected based on the minimum number of iterations that changed the cluster centres and on the optimal F factor and significance of the corresponding analysis of variance (ANOVA). The F factor for each ANOVA cluster solution was calculated as the ratio of two variances, and a higher F factor corresponded to a stronger separation of the clusters.21 22 Subjects with missing values in any of the three input variables were excluded from the analysis. Continuous variables were compared across clusters using the Kruskal-Wallis test, with the Mann-Whitney U test as a post hoc test for pairwise comparisons. Categorical variables were compared across clusters using the χ2 test, with Bonferroni adjustment as a post hoc test for pairwise comparisons. Cluster profiles were presented graphically using radial plots, where the length of each ‘spoke’ was proportional to the magnitude of the standardised variable. The clustering was also visualised using a discriminant-coordinates biplot, generated by canonical variate analysis, which projects multidimensional data into a lower dimensional space while preserving as much information as possible, to provide an easily interpretable two-dimensional representation of cluster separation and density.
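
The clustering workflow described above (z-score standardisation, K-means for 2 to 10 clusters, and an ANOVA F ratio per input variable) can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the data are synthetic, the initial centres are drawn at random, and the helper names (`zscores`, `kmeans`, `f_ratio`) are hypothetical.

```python
import random
import statistics


def zscores(column):
    """Standardise one variable to mean 0 and SD 1."""
    mean, sd = statistics.fmean(column), statistics.pstdev(column)
    return [(v - mean) / sd for v in column]


def nearest(point, centres):
    """Index of the centre closest to the point (squared Euclidean distance)."""
    return min(range(len(centres)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centres[c])))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means; returns labels, centres and iterations used."""
    centres = random.Random(seed).sample(points, k)  # assumed random start
    labels = []
    for iteration in range(1, max_iter + 1):
        labels = [nearest(p, centres) for p in points]
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the previous centre if a cluster ends up empty
            new_centres.append(
                tuple(statistics.fmean(dim) for dim in zip(*members))
                if members else centres[c])
        if new_centres == centres:  # centres stopped changing
            break
        centres = new_centres
    return labels, centres, iteration


def f_ratio(values, labels, k):
    """One-way ANOVA F: between-cluster over within-cluster variance."""
    grand = statistics.fmean(values)
    groups = [[v for v, lab in zip(values, labels) if lab == c] for c in range(k)]
    groups = [g for g in groups if g]
    between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return (between / (len(groups) - 1)) / (within / (len(values) - len(groups)))


# Synthetic stand-in data (NOT the study data): gestational age (weeks),
# Apgar score at 5 min, duration of ventilation (days)
rng = random.Random(42)
n = 300
gestation = [rng.uniform(23.0, 27.9) for _ in range(n)]
apgar = [float(rng.randint(1, 10)) for _ in range(n)]
vent_days = [rng.expovariate(1 / 15) for _ in range(n)]

columns = [zscores(col) for col in (gestation, apgar, vent_days)]
points = list(zip(*columns))

for k in range(2, 11):
    labels, centres, iterations = kmeans(points, k)
    f_values = [round(f_ratio(col, labels, k), 1) for col in columns]
    print(f"k={k} iterations={iterations} F per variable={f_values}")
```

Printing the iteration count alongside the F ratios mirrors the two selection criteria named in the text. How those criteria were weighed against each other in the study is not specified, so the sketch only reports them.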

Cluster and statistical analysis were performed using SPSS software, V.26.0 (IBM, Armonk, New York).
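
With four clusters, pairwise post hoc testing involves six comparisons. The sketch below shows one way a Bonferroni adjustment could be applied to pairwise χ2 tests of a categorical outcome. It is an illustration rather than the SPSS procedure used here: the counts are arbitrary placeholders, the overall significance level of 0.05 is an assumption, and `chi2_2x2` is a hypothetical helper implementing the Pearson χ2 test for a 2×2 table without continuity correction.

```python
import math
from itertools import combinations


def chi2_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-squared test (no continuity correction) for two proportions."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-squared, 1 df
    return chi2, p_value


# Arbitrary placeholder (events, total) counts per cluster, NOT the study data
outcome_counts = {1: (30, 120), 2: (15, 200), 3: (35, 80), 4: (10, 50)}

pairs = list(combinations(sorted(outcome_counts), 2))
alpha = 0.05                          # assumed overall significance level
adjusted_alpha = alpha / len(pairs)   # Bonferroni: divide by number of tests

for first, second in pairs:
    chi2, p = chi2_2x2(*outcome_counts[first], *outcome_counts[second])
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"cluster {first} vs {second}: chi2={chi2:.2f}, {verdict}")
```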

Results

During the period of the study, 11 806 infants were born alive below 28 completed weeks of gestation and admitted to a neonatal unit in England. One thousand five hundred and eighty-five infants were excluded for missing data on the Apgar score at 5 min. Twenty-one infants were removed for missing data on the duration of mechanical ventilation and a further three infants because of indeterminate sex (figure 1). A total of 10 197 infants (53% male) were included in the final analysis with a median (IQR) gestational age of 26.0 (24.9–27.1) weeks and birth weight of 0.81 (0.68–0.96) kg (table 1). They were ventilated for a median (IQR) of 10 (3–26) days and those who survived to discharge from neonatal care (81%) were discharged at a median (IQR) PMA of 39.6 (37.6–42.1) weeks (table 1).
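
The exclusion counts reported above can be checked against the final cohort size with a few lines of Python. The variable names are illustrative and the figures are those stated in the text.

```python
admitted = 11_806  # live-born below 28 weeks and admitted to a neonatal unit

excluded = {
    "missing Apgar score at 5 min": 1_585,
    "missing duration of mechanical ventilation": 21,
    "indeterminate sex": 3,
}

included = admitted - sum(excluded.values())
print(included)  # 10197, matching the reported final cohort
print(f"{sum(excluded.values()) / admitted:.1%} of admitted infants excluded")
```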

Figure 1

Flow diagram of the included infants.

Table 1

Subject characteristics by cluster

A four-cluster solution was found to be optimal (table 1). The clinical profiles for the four clusters are presented graphically in figures 2 and 3. The boxplots of the input parameters in the four clusters, along with the birth weight z-score, the duration of parenteral nutrition (PN) and the PMA at discharge, are presented in figure 4.

Figure 2

Radial plots showing the clinical profiles for the four clusters. Data are standardised (expressed as z-scores referenced to the whole cohort) and the spike points represent medians.

Figure 3

Discriminant two-dimensional plot of the clustering solution. The cluster centroids are presented as black circles.

Figure 4

Boxplots of the gestational age (A), Apgar score at 5 min (B), duration of ventilation (C), birth weight z-score (D), duration of parenteral nutrition (PN) (E) and postmenstrual age (PMA) at discharge (F) in the four clusters. The median value for each parameter for the whole population is depicted as a dashed line. The whiskers represent the maximum and minimum values. The asterisks are extremes (at least three box lengths from the median) and the circles are outliers (1.5 box lengths from the median).

Cluster 1 (intermediate) contained 27% of the population. The infants of cluster 1 had a lower median (IQR) gestational age (24.8 (24.1–25.2) weeks) compared with clusters 2 and 3 (27.0 (26.6–27.6) weeks and 25.4 (24.4–26.4) weeks respectively, p<0.001 for both). The median Apgar score for infants of cluster 1 was higher compared with the infants of clusters 3 and 4 (p<0.001 for both), but not of cluster 2 infants. Infants in cluster 1 were ventilated for a median (IQR) period of 21 (10–33) days, significantly longer than infants of clusters 2 and 3 (4 (2–10) and 9 (3–21) days, respectively, p<0.001 for both) but significantly shorter than the infants of cluster 4 (60 (52–73) days, p<0.001). Infants of cluster 1 developed BPD more often (59%) compared with infants of clusters 2 and 3 (48% and 42%, respectively, p<0.001) but less often than infants of cluster 4 (89%, p<0.001). Infants of cluster 1 had lower mortality (27%) compared with cluster 3 (40%, p<0.001) but higher mortality compared with cluster 2 (8%, p<0.001).

Cluster 2 (favourable) contained 47% of the population and had the highest median (IQR) gestational age (27.0 (26.6–27.6) weeks) compared with clusters 1 (24.8 (24.1–25.2) weeks), 3 (25.4 (24.4–26.4) weeks) and 4 (24.4 (23.9–25.3) weeks; p<0.001 for all). The median Apgar score for infants of cluster 2 was higher compared with the infants of clusters 3 and 4 (p<0.001 for both). Infants in cluster 2 had the shortest duration of ventilation, with a median (IQR) period of 4 (2–10) days, compared with infants of cluster 1 (21 (10–33) days), cluster 3 (9 (3–21) days) and cluster 4 (60 (52–73) days; p<0.001 for all). Infants of cluster 2 developed BPD less often (48%) compared with infants of clusters 1 and 4 (59% and 89%, respectively, p<0.001 for both), but more often than infants of cluster 3 (42%, p<0.001). Infants of cluster 2 had the lowest mortality (8%) compared with all other clusters (p<0.001).

Cluster 3 (low Apgar) contained 17% of the population and the infants with the lowest median (IQR) Apgar score (4 (2–5)) compared with all other groups (p<0.001 for all). Their median (IQR) gestation (25.4 (24.4–26.4) weeks) was higher than clusters 1 and 4, but lower than cluster 2 (p<0.001 for all). Infants of cluster 3 had a shorter median (IQR) duration of ventilation (9 (3–21) days) compared with infants of clusters 1 and 4 (p<0.001 for both), but longer than infants of cluster 2 (p<0.001). Infants of cluster 3 had the lowest incidence of BPD (42%) compared with infants of clusters 1, 2 and 4 (59%, 48% and 89%, respectively, p<0.001 for all). Infants of cluster 3 had the highest mortality (40%) compared with all other clusters (p<0.001).

Cluster 4 (prolonged ventilation) had the smallest number of subjects (8%), the lowest gestational age (24.4 (23.9–25.3) weeks) and the longest median (IQR) duration of ventilation (60 (52–73) days) compared with all other clusters (p<0.001 for all). The median Apgar score for infants of cluster 4 was 6 (5–8), which was higher compared with the infants of cluster 3 (p<0.001) but lower than clusters 1 and 2 infants (p<0.001 for both). Infants of cluster 4 had the highest rate of BPD (89%) compared with infants of all other clusters (p<0.001 for all). The mortality of the infants of cluster 4 (19%) was lower than clusters 1 and 3 (p<0.001 for both) and higher than cluster 2 (p<0.001).

Discussion

Using unsupervised learning to detect and describe distinct phenotypical groups of extremely preterm infants, we demonstrated that survival to discharge from neonatal care and other important neonatal outcomes differed significantly across these groups. We used readily available parameters to classify the extremely preterm infants into the four clusters.

Cluster 1 (intermediate) comprised infants with a combination of a low gestational age and a relatively long duration of invasive ventilation whose resuscitation was overall straightforward as demonstrated by a median Apgar of 8 at 5 min after birth. Nevertheless, they had a relatively high mortality and incidence of BPD. Cluster 2 (favourable) was the largest group and exhibited the best outcomes. It contained infants with the highest gestational age and birth weight z-score and the briefest duration of ventilation. Not surprisingly, they also had the lowest mortality and a relatively modest incidence of complications such as BPD. They were also the earliest to be discharged from neonatal care. The major clustering characteristic of the infants in cluster 3 was the low Apgar score. The poor condition at birth might explain why they had the highest mortality and rate of IVH compared with all other groups, but their relatively larger gestation and brief duration of ventilation might explain why they had the lowest incidence of BPD. Cluster 4 (prolonged ventilation) consisted of the most premature infants, with the longest duration of mechanical ventilation and highest rates of BPD. These infants were also discharged home at a later PMA than all other cluster infants.

Our results highlight two distinct within-cluster associations. First, poor condition at birth with subsequent high rates of mortality and IVH (cluster 3) and, second, the association of prolonged ventilation with BPD (cluster 4). Both these associations have been previously individually described in population studies: Jensen et al studied 3343 extremely low-birth weight infants and reported that among the survivors, exposure to a greater number of mechanical ventilation episodes was associated with a progressive increase in the risk of BPD.23 Interestingly, the highest mortality in our study was not in the infants of the lowest gestation at birth, but rather in the ones with the lowest Apgar score. The association of poor condition at birth with IVH and increased mortality is primarily attributed to the intrinsic fragility of the germinal matrix vasculature and the disturbance in the cerebral blood flow.24 25 It is also interesting to note that the infants of cluster 3, other than the highest mortality and lowest Apgar score, also had the lowest rate of antenatal steroids and the lowest incidence of birth in a tertiary centre. This is in agreement with previous studies that have highlighted that delivery outside a tertiary centre is associated with higher mortality and a lower uptake of antenatal steroids.26 Our study complements the literature by presenting these associations within distinct groups of the same population rather than via a unilateral relationship that does not take into account the other outcomes.

The choice of our input variables (gestational age, Apgar score at 5 min and duration of mechanical ventilation) was made on clinical grounds and evidently has major implications for our conclusions. We chose these three parameters because they are continuous and reliably available at a population level, while they also represent relatively distinct pathophysiological processes that are associated with impaired later outcomes. It is well described, for example, that the Apgar score is influenced by gestation and cannot be generally considered as evidence of asphyxia nor can it be used at an individual level to predict mortality or neurological outcomes.27 Alternative continuous variables to assess the condition at birth such as the cord pH or lactate could not be used, however, as these indices were not consistently entered in the database and had a very high percentage of missing values. Similarly, the duration of ventilation was selected as a proxy for the severity of lung disease28 which also can be assessed relatively early chronologically as the median duration of ventilation in our population was 10 days.

To our knowledge, unsupervised learning and cluster analysis have not been previously applied exclusively to extremely preterm infants. Souza et al used K-means cluster analysis to explore clinical conditions associated with preterm birth and reported that 4150 preterm births were clustered in three groups of women, yet although some maternal characteristics differed among the clusters, maternal and neonatal outcomes did not.29 This might be explained by population differences, as this was predominantly a maternity study in which the neonatal outcomes were not considered as input variables in the unsupervised models. Greenbury et al performed clustering on daily nutritional intakes in premature infants of less than 32 weeks of gestation and reported different nutritional clusters, which were heterogeneous in size, with some showing common interpretable clinical practices. They also identified a relationship between nutritional practice and outcomes such as BPD, with provision of human milk being a protective factor against developing BPD.30

Our classification could be clinically useful in further understanding the pathophysiology of extremely preterm morbidity and mortality and possibly in targeting selected subgroups of infants for specific interventions. For example, for cluster 4, such interventions might include administration of postnatal corticosteroids to assist earlier weaning from invasive respiratory support,31 more gentle resuscitation methods, avoidance of mechanical ventilation by newer non-invasive techniques32 and use of less invasive surfactant administration, which has been shown to reduce the incidence of BPD.33 The within-group association of increased mortality and birth outside a tertiary hospital in cluster 3 could also strengthen the argument for centralising the care of extremely preterm infants in high-volume perinatal centres.34

Our study has strengths and some limitations. To our knowledge, this is the first study to use unsupervised learning to classify extremely preterm infants into different phenotypical groups. A further strength of our study was that we used the whole population rather than a representative sample and therefore, by avoiding inclusion bias, our results are more generalisable. We should clarify that our analysis is not an early predictive tool that could be applied at birth, as we have included the duration of ventilation as an input parameter, which comes later in the life of an extremely preterm infant compared with the Apgar score or the gestational age. As such, our study aims to describe the interaction of different pathophysiological phenomena at a population level rather than provide individualised prediction of outcomes at birth. We should also note that our study only included infants from a recent 5-year period rather than a longer period that would extend further into the past. Our tighter time period, though, would better correspond to current standards of neonatal care. We did not include in our analysis information on whether preterm birth was spontaneous or on the duration of rupture of membranes as the amount of missing data in these categories was very high at a national level, even more so in non-tertiary units.

In conclusion, we have used readily available clinical parameters to classify extremely preterm infants into distinct phenotypic groups. The ensuing groups had differing neonatal outcomes such as survival to discharge from neonatal care and BPD.

Ethics statements

Patient consent for publication

Ethics approval

A predefined dataset was acquired from the National Neonatal Research Database (NNRD), Imperial College London, UK. The NNRD is approved by the National Research Ethics Service (10/H0803/151), Confidentiality Advisory Group of the Health Research Authority (8-05[f]/2010) and the Caldicott Guardians and Lead Clinicians of contributing hospitals. As the study used data held in an existing database, participation did not require approval from individual Trusts, but only from the NHS Trust holding the database (Chelsea and Westminster NHS Foundation Trust) which was obtained. The primary study was approved by the West Midlands—Edgbaston Research Ethics Committee (REC reference: 19/WM/0172) and the UK Health Research Authority (HRA) (IRAS project ID: 259225). The research was conducted ethically in accordance with the World Medical Association Declaration of Helsinki.

Footnotes

  • Contributors TD: conceived the study, participated in data analysis, wrote the first version of the manuscript and is the guarantor of the manuscript; EEW: acquired and analysed the dataset and critically reviewed the manuscript; CH: participated in data analysis and critically revised the manuscript; AG: participated in study design, supervised the project and critically revised the manuscript.

  • Funding Emma Williams was supported by a grant from the Charles Wolfson Charitable Trust and a non-conditional educational grant from SLE. This research was supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at Guy's and St Thomas' NHS Foundation Trust and King's College London. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

  • Competing interests Professor Greenough has held grants from various manufacturers (Abbott Laboratories, MedImmune) and ventilator manufacturers (SLE). Professor Greenough has received honoraria for giving lectures and advising various manufacturers (Abbott Laboratories, MedImmune) and ventilator manufacturers (SLE). Professor Greenough is currently receiving a non-conditional educational grant from SLE.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.
