Predictive machine learning models for ascending aortic dilatation in patients with bicuspid and tricuspid aortic valves undergoing cardiothoracic surgery: a prospective, single-centre and observational study

Objectives The objective of this study was to develop clinical classifiers to identify prevalent ascending aortic dilatation in patients with bicuspid aortic valve (BAV) and tricuspid aortic valve (TAV).
Design and setting A prospective, single-centre and observational cohort.
Participants The study involved 543 BAV and 491 TAV patients with aortic valve disease and/or ascending aortic dilatation, excluding those with coronary artery disease, undergoing cardiothoracic surgery at the Karolinska University Hospital (Sweden).
Main outcome measures Predictors of high risk of ascending aortic dilatation (defined as ascending aorta with a diameter above 40 mm) were identified through the application of machine learning algorithms and classic logistic regression models.
Exposures Comprehensive multidimensional data, including valve morphology, clinical information, family history of cardiovascular diseases, prevalent diseases, demographic details, lifestyle factors, and medication.
Results BAV patients, with an average age of 60.4±12.4 years, showed a higher frequency of aortic dilatation (45.3%) compared with TAV patients, who had an average age of 70.4±9.1 years (28.9% dilatation, p<0.001). Aneurysm prediction models for TAV patients exhibited mean area under the receiver-operating-characteristic curve (AUC) values above 0.8, with the absence of aortic stenosis being the primary predictor, followed by diabetes and high-sensitivity C reactive protein. Conversely, prediction models for BAV patients resulted in AUC values between 0.5 and 0.55, indicating low usefulness for predicting aortic dilatation. Classification results remained consistent across all machine learning algorithms and classic logistic regression models.
Conclusion and recommendation Cardiovascular risk profiles appear to be more predictive of aortopathy in TAV patients than in patients with BAV. This adds to the evidence that BAV-associated and TAV-associated aortopathy involve different pathways to aneurysm formation and highlights the need for specific aneurysm prevention strategies in these patients. Further, our results highlight that machine learning approaches do not outperform classical prediction methods in addressing complex interactions and non-linear relations between variables.


INTRODUCTION
Thoracic aortic aneurysm (TAA) is a silent disease characterised by medial degradation and pathological widening of the intrathoracic aorta. Individuals with a bicuspid aortic valve (BAV) are at higher risk of developing ascending aortic aneurysm than individuals born with a normal tricuspid aortic valve (TAV). 1 2 BAV is the most common congenital

STRENGTHS AND LIMITATIONS OF THIS STUDY
⇒ Comprehensive clinical and epidemiological data from 1034 individuals (543 bicuspid aortic valve (BAV) and 491 tricuspid aortic valve (TAV)) were analysed, providing a robust dataset for examination.
⇒ Aortic valve morphology assessment during open-heart surgery adds a significant strength, enhancing reliability compared with relying solely on echocardiography.
⇒ However, the study has inherent limitations, including potential selection bias due to the exclusion of individuals with significant coronary artery disease, potentially leading to an overestimation of aneurysm prevalence.
⇒ Being a surgical cohort, the study population may include individuals with worse outcomes, which could affect generalisability to counterparts of similar age in non-surgical settings.
⇒ The study is limited to a single centre and monoethnicity, potentially impacting the generalisability of the findings to broader BAV/TAV populations.
4 5 The underlying mechanism is not known, although turbulent flow caused by the abnormal bicuspid valve anatomy and/or a genetic defect have been proposed as contributing factors. The complications of BAV have led to a number of unanswered clinical questions, such as: (1) what mechanisms underlie the development of aortopathy in adults with BAV, (2) how to predict the development of aortopathy in individuals with BAV and (3) whether the performance and predictors of prediction models differ between patients with BAV and patients with TAV. Echocardiography is effective in showing the degree of dilatation of the sinus of Valsalva and the annulus. However, it is more difficult to determine the degree of dilatation of the sinotubular junction and the ascending aorta, even though these are the more frequently dilated sections in patients with BAV. 6 Our aim was to identify clinical classifiers of prevalent aortopathy using comprehensive multidimensional clinical data. Automated machine learning methods and traditional regression models were applied to a total of 1034 subjects with either BAV or TAV.

Clinical cohorts
The Advanced Study of Aortic Pathology (ASAP) study design and description have been previously published. 1 7 Briefly, the ASAP cohort is a single-centre, observational cohort study of consecutive patients with aortic valve and/or ascending aortic disease undergoing elective open-heart surgery at the Cardiothoracic Surgery Unit, Karolinska University Hospital in Stockholm, Sweden. Inclusion criteria were patients aged 18 or above with aortic valve disease (ie, aortic stenosis (AS) or aortic regurgitation (AR)) and/or ascending aorta dilatation (aneurysm or ectasia) but devoid of significant coronary artery disease (stenosis >70% or fractional flow reserve <0.80) and primarily not planned for another concomitant valve surgery. The Disease of the Aortic Valve Ascending Aorta and Coronary Arteries (DAVAACA) study was set up as a continuation of the ASAP study. The ASAP/DAVAACA study started in February 2007 and includes multidimensional data (blood analyses, genetic and clinical data, family history of cardiovascular diseases (CVDs), prevalent diseases data, demographic characteristics, lifestyle habits data and medication). In the present study, 1180 operated patients were included (as of 28 June 2017). Exclusion of patients with unicuspid aortic valves, patients with missing data for cuspidity and aortic dilatation and patients with syndromic forms of TAA resulted in a final study population of 1034 subjects (online supplemental efigure 1). Data collection and classification of AS and AR have been previously described. 1 10

Echocardiography
Preoperative transthoracic echocardiography was used to determine aortic valve function. Transoesophageal echocardiography (TEE) was performed on the operating table, prior to surgery, under general anaesthesia and as previously described. 11 The echocardiographic evaluation has already been published elsewhere. 1 TEE evaluation was used to assess valvular function, aortic root morphology and ascending aortic diameter; measurements were performed according to standards outlined by the American Society of Echocardiography (ASE). The diameter of the aorta was measured at several locations (annulus, sinus of Valsalva, sinotubular junction (STJ)). An ascending aorta with a diameter exceeding 40 mm was categorised as aortic dilatation in both women and men. Similarly, patients with root dilatation exceeding 40 mm were classified as having aortic dilatation. Body surface area (BSA) was calculated using the Du Bois method. 12 AS and AR were defined according to standard guidelines for valve surgery. 13
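The two rules in this paragraph, the Du Bois BSA calculation and the 40 mm cut-off, can be sketched as follows. This is a minimal illustration, not the study's code: the function and constant names are hypothetical, and the coefficients are those of the standard Du Bois and Du Bois formula (weight in kg, height in cm).

```python
# Illustrative sketch only; names are hypothetical, not from the study.

DILATATION_THRESHOLD_MM = 40.0  # cut-off stated in the text, same for women and men


def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m^2 using the standard Du Bois formula."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def is_dilated(ascending_mm: float, root_mm: float) -> bool:
    """True if the ascending aorta or the aortic root exceeds 40 mm."""
    return (ascending_mm > DILATATION_THRESHOLD_MM
            or root_mm > DILATATION_THRESHOLD_MM)


if __name__ == "__main__":
    print(round(bsa_du_bois(70.0, 170.0), 2))   # BSA for a 70 kg, 170 cm adult
    print(is_dilated(ascending_mm=44.0, root_mm=36.0))
```

Note that the comparison is strict, so a diameter of exactly 40 mm is treated as non-dilated, following the wording "exceeding 40 mm".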

Aortic valve morphology
The morphology of the aortic valve was determined by inspection of the valve during surgery. Based on appearance, the valve was classified according to number of cusps and commissures. Three cusps and three commissures denote a TAV; two cusps and two commissures denote a BAV (if a remnant commissural raphe was present) or true BAV (if no raphe was present). The BAV was further classified according to cusp fusion: right and left coronary cusps (RL), right and non-coronary cusps (RN) or left and non-coronary cusps (LN). 1 The Sievers classification, 14 based on intraoperative analysis of the aortic valve morphology, was used to diagnose and type the BAV.
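The classification rule described above can be written as a small decision function. This is a sketch under the assumption that each valve is recorded as a cusp count, a commissure count, a raphe flag and a fusion label; the function name and the returned labels are hypothetical and serve only to make the rule explicit.

```python
from typing import Optional

# Fusion patterns named in the text: right-left, right-non, left-non coronary.
FUSION_PATTERNS = {"RL", "RN", "LN"}


def classify_valve(cusps: int, commissures: int, raphe: bool = False,
                   fusion: Optional[str] = None) -> str:
    """Map intraoperative findings to TAV, true BAV or a BAV fusion type."""
    if cusps == 3 and commissures == 3:
        return "TAV"
    if cusps == 2 and commissures == 2:
        if not raphe:
            return "true BAV"  # no remnant commissural raphe
        if fusion not in FUSION_PATTERNS:
            raise ValueError("BAV with raphe needs a fusion pattern: RL, RN or LN")
        return f"BAV-{fusion}"
    # e.g. unicuspid valves, which were excluded from the study population
    raise ValueError("morphology outside the TAV/BAV scheme described here")
```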

Statistical analysis
Characteristics of the population were described using analysis of variance or χ² tests when appropriate. Several meaningful blocks of variables were chosen (clinical data, family history of CVDs, prevalent diseases, demographic characteristics, lifestyle habits data and medication). Principal component analysis (PCA) and multiple correspondence analysis were used to explain the variance-covariance structure of the variables combined. To identify variables that separated dilated versus non-dilated BAV and TAV patients, logistic regression models were used and automated machine learning algorithms were applied.

Logistic regression
For logistic regression, a set of variables was selected to be included in the model. Since model performance can rely heavily on variable selection, different variable selection methods were tested prior to the logistic regression analysis (online supplemental efigure 2). First, variables were selected based on prior knowledge and/or biological plausibility. Then, two automated variable selection methods were considered: (1) backward and forward elimination to optimise the Akaike Information Criterion, 11 and (2) the least absolute shrinkage and selection operator (LASSO). 12
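For reference, the two automated selection criteria have standard textbook forms, given here for a logistic model. The notation is an assumption introduced for this note and does not appear in the original text: $k$ is the number of estimated parameters, $\hat{L}$ the maximised likelihood, $\ell(\beta)$ the log-likelihood, $\beta_j$ the $p$ non-intercept coefficients and $\lambda \ge 0$ the penalty strength.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad\qquad
\hat{\beta}_{\mathrm{LASSO}}
  = \arg\min_{\beta}\Bigl\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \Bigr\}
```

Stepwise selection keeps the variable set with the lowest AIC, whereas LASSO selects variables implicitly: as $\lambda$ increases, more coefficients are shrunk to exactly zero and drop out of the model.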

Machine learning algorithms
We used two machine learning algorithms, Random Forests and Artificial Neural Network, which are among the most widely and successfully used for clinical data. 15 16 Each one of them represents a different algorithm 'family', with different internal algorithm structures. 17 Since it was not known beforehand which kind of algorithm would perform best, algorithms with different internal structures were chosen to increase the probability of good discriminative performance. A 10-fold cross-validation method was also applied to our logistic regression, allowing us to split the training data into five subsets. Five iterations over the five subsets were then performed, such that the subset whose index matched the iteration number was used as the validation set and the remaining four subsets as the training set. Sensitivity, specificity and positive and negative predictive values for each predictive model were also assessed, applying a leave-one-out cross-validation method (online supplemental efigures 3 and 4).
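The resampling scheme and the four diagnostic measures named above can be sketched in plain Python. This is an illustration under stated assumptions rather than the study's pipeline: the function names are hypothetical, the number of folds `k` is left as a parameter, and outcomes are assumed to be coded 1 (dilated) and 0 (non-dilated).

```python
import random
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs that together cover range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal subsets
    splits = []
    for i in range(k):
        validation = folds[i]  # fold i is held out in iteration i
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, validation))
    return splits


def _ratio(num: int, den: int) -> float:
    """Safe division: undefined ratios are returned as NaN."""
    return num / den if den else float("nan")


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = dilated)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }
```

Leave-one-out cross-validation is the special case `k_fold_indices(n, n)`, in which every validation set holds a single patient; the predictions collected across all iterations can then be passed to `diagnostic_metrics` in one call.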

Patient and public involvement
It was not possible to involve patients or the public in the design, or conduct, or reporting, or dissemination plans of our research.

Characteristics of BAV and TAV patients
The study population included a total of 1034 patients (52% BAV and 48% TAV) (online supplemental etable 1 and efigure 1). BAV patients were significantly younger (60.4 (SD 12.40) years versus 70.4 (SD 9.09) years in TAV patients), and more often men (73.1%) compared with TAV patients (63.3%). Furthermore, BAV patients had significantly higher BSA but, in contrast, a lower body mass index (BMI). AS was more common in BAV compared with TAV, whereas no significant difference in the proportion of AR was found, even if the prevalence of AR was slightly higher in TAV patients. Finally, a higher prevalence of CVD history (hypertension, coronary artery disease, heart failure and stroke) was observed in TAV individuals, as was a higher percentage with a family history of CVD (sibling history of myocardial infarction before 65 years of age).

Aortic dilatation in BAV and TAV patients
Globally, PCA showed no separation among BAV patients with (BAV-D) or without (BAV-ND) ascending aortic dilatation (figure 1A), using general clinical characteristics, family history of CVD, prevalent diseases and demographics as input. By contrast, TAV patients with a dilated (TAV-D) ascending aorta clearly separated from TAV individuals with non-dilated (TAV-ND) ascending aortas (figure 1B), indicating a more pronounced difference between these groups.
Analysis of patient descriptive data showed a pattern of significantly different associated traits between BAV-ND versus BAV-D and TAV-ND versus TAV-D (table 1, online supplemental etable 2) patients. Specifically, TAV-D patients were younger than TAV-ND patients, while in the BAV group age did not differ. BAV-D patients, on the other hand, had a higher BSA and diastolic blood pressure, and a lower pulse pressure (PP) than BAV-ND patients. In both BAV and TAV patients, AR was more commonly associated with aortic dilatation, while AS was more prevalent in patients with a non-dilated aorta. It was also noted that aortic dilatation was more localised to the tubular part of the ascending aorta in BAV patients, whereas in TAV patients the dilatation more often included the aortic root (table 1). A positive association between dilatation and abdominal aortic aneurysm (AAA), and negative associations between dilatation and diabetes, and dilatation and angina, were found in both BAV and TAV patients (table 1).
To further analyse BAV-associated and TAV-associated aortopathy, a multivariable logistic regression model was fitted, including age, sex, BSA, low-density lipoprotein (LDL) cholesterol and high-sensitivity C reactive protein (hsCRP) as covariates (table 2). Independent variables were chosen based on previous analyses with significant differences within groups. Covariates were selected based on stepwise selection and decision trees. The results showed that in BAV patients, AS, PP and diabetes were negatively associated with dilatation (models adjusted for age, sex, BSA, LDL and hsCRP; AS: OR=0.44 (95% CI 0.28 to 0.70), p=0.001; PP: OR=0.99 (95% CI 0.98 to 1.0), p=0.025; and diabetes: OR=0.32 (95% CI 0.16 to 0.62), p=0.001). Similarly, in TAV patients, dilatation was negatively associated with AS (OR=0.03 (95% CI 0.02 to 0.06), p<0.001) and diabetes (OR=0.11 (95% CI 0.03 to 0.32), p<0.001). The magnitude of the association with AS was, however, 15-fold higher in TAV than in BAV patients. AR was positively associated with aortic dilatation in both BAV and TAV, with a fivefold higher magnitude in TAV compared with BAV patients (table 2). The ROC curve for the risk prediction model across the predictive methods without AS as predictor is shown in online supplemental efigure 5.
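As a reading aid for the odds ratios above, the sketch below shows how an OR and its 95% CI relate to a logistic regression coefficient, and how the "15-fold" comparison follows from the two reported ORs for AS. The helper names are hypothetical, and the standard error is back-calculated from the reported interval under the assumption of a Wald interval with z=1.96; it is not a value taken from the study.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """OR and Wald CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Back-calculate the standard error on the log-odds scale from a CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported OR for AS in BAV patients: 0.44 (0.28 to 0.70).
beta_as_bav = math.log(0.44)
se_as_bav = se_from_ci(0.28, 0.70)      # assumed Wald interval
print(odds_ratio_with_ci(beta_as_bav, se_as_bav))

# Ratio of the reported ORs for AS (BAV 0.44 versus TAV 0.03).
print(0.44 / 0.03)                      # about 14.7, reported as 15-fold
```

The round trip approximately reproduces the published interval, which is consistent with the reported limits being symmetric on the log-odds scale.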

Clinical classifiers of aortic dilatation
To determine whether prevalent aortopathy in TAV and BAV patients can be identified from clinical characteristics, automated machine learning algorithms, including random forests and artificial neural network, and classic logistic regression models were used. Discrimination between TAV-ND and TAV-D patients was good for all models tested, with mean AUCs ranging from 0.81 to 0.88 (figure 2, online supplemental etable 3). The best discrimination was obtained using Random Forest (mean AUC: 0.88 (95% CI 0.78 to 0.96)) and GLM (mean AUC: 0.81 (95% CI 0.68 to 0.93)) (figure 2, online supplemental etable 3).
Generalised regressions included AS, AR, PP, prevalence of diabetes, age, sex, BSA, LDL, hsCRP and BMI as variables. In BAV individuals, it was not possible to identify those with aortic dilatation, as indicated by poor discrimination by all classification models used (figure 2, online supplemental etable 3). This is in line with the lack of separation in the PCA.
Of note, the robustness of our classification models is supported by the results of the sensitivity analyses using imputed data (online supplemental etables 4 and 5). Online supplemental etable 5 indicates that the main characteristics of subjects with and without missing data did not differ substantially, minimising selection bias.
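The AUC values reported in this section can be read as the probability that a randomly chosen dilated patient receives a higher predicted risk than a randomly chosen non-dilated patient, so 0.5 corresponds to chance and 1.0 to perfect separation. A minimal sketch of that pairwise definition follows; the function name and the toy scores are illustrative assumptions, not study data.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = dilated, 0 = non-dilated, scores = predicted risk.
print(roc_auc([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]))
```

On this reading, the AUCs between 0.5 and 0.55 reported for BAV patients mean the models ranked dilated above non-dilated patients only slightly more often than chance.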

DISCUSSION
In our analysis of 1034 Swedish subjects with BAVs and TAVs, with and without prevalent aortopathy but devoid of coronary artery disease and primarily not planned for another concomitant valve surgery, we report two key findings. First, the prediction of aortic dilatation using automated machine learning methods and traditional regression models on multidimensional clinical data was possible only among TAV individuals, and not in those with BAV. This suggests that general clinical cardiovascular risk profiles play more important roles during aortic dilatation in TAV patients than in patients with a BAV, and further supports that aortopathy associated with BAV and TAV, respectively, is clearly distinct with different underlying aetiologies. Second, our study shows that the classification results were consistent for all machine learning algorithms and classic logistic regression models. This suggests that machine learning approaches might not outperform classical prediction methods in addressing complex interactions and non-linear relations between variables. This unexpected finding, that machine learning did not clearly outperform classical statistics, merits further exploration. It may be attributed to the distinct attributes and applications of each approach. Classical statistical methods, grounded in well-established mathematical theories, offer interpretability and inferential power, excelling in scenarios where assumptions are met and relationships between variables are linear or follow specific parametric forms. Conversely, machine learning techniques, leveraging algorithms to identify patterns in complex datasets, provide flexibility and the ability to capture non-linear relationships. They are adept at handling high-dimensional data, especially when the underlying data structure is intricate or not well defined.
The strongest predictor for ascending aortic dilatation in TAV was the absence of AS. The other contributors to the prediction of aortic dilatation in TAV are shown in online supplemental etable 6. The finding for AS is in accordance with our previous observations that surgical patients with AS and ascending aortic dilatation almost exclusively have a BAV. 18 Whether this implies that biological processes associated with the development of AS in TAV also contribute to ascending aortic stability needs to be further elucidated. AS is commonly caused by progressive calcification of the aortic valve and increases in prevalence with age. In our cohort, TAV patients with dilated ascending aortas were significantly younger than TAV patients with non-dilated ascending aortas, which may be one contributing factor to the observed association and possibly reflects the surgical nature of the cohort. However, omitting patients with AS from the analysis still resulted in significant predictive values, although with worse discrimination, indicating that age alone cannot explain the association. Other strong contributors to the prediction model were diabetes and hsCRP. A negative association between diabetes and aneurysm in the abdominal aorta is well documented. 19 It has also been suggested that metformin prescription may be associated with decreased risk of aortic dilatation and that the molecular mechanism involves a metformin-induced reduction in aortic inflammation. 20 21 Similarly, an elevated hsCRP has been described in AAA patients and found to be an independent risk factor for AAA. 22 23 Moreover, we have previously shown an inflammatory gene expression profile in dilated aortic tissue from TAV but not BAV patients. 7 In the present study, there was a borderline significance of reduced hsCRP levels in BAV patients with dilated aorta. Together, this implies that aortic dilatation in TAV, in these aspects, may be more similar to aneurysm of the abdominal aorta than to BAV aortic dilatation.

Open access
The lack of a good model for risk prediction of BAV-associated aortopathy raises the question of which other contributing factors may be of importance for aneurysm formation and development in these individuals. Two main hypotheses have been put forth in the literature.
First, an altered flow in the proximal part of the ascending aorta due to the valve malformation itself has been suggested to provoke aortic dilatation. Also, different BAV morphotypes, that is, cusp fusion patterns, have been shown to cause flow disturbances that affect the aorta in morphotype-dependent ways. 24 In our study, flow characteristics were not included in the prediction model, which possibly could have influenced the results. However, we could not see any difference in the presence of aortic dilatation between different morphotypes, that is, true BAV, LN, RN or RL cusp fusion (table 1). Of note, other factors may also influence aortic flow patterns, for example, eccentricity of valve opening due to valve disease, or vessel stiffness. Indeed, it has been suggested that AS significantly alters aortic haemodynamics and wall shear stress, independent of aortic valve phenotype.
Second, the genetic contribution may override the influence of traditional risk factors on aneurysm development in patients with BAV. Although specific gene(s) and/or mutation(s) underlying BAV and BAV-associated aortopathy are still to be unravelled, several genes have been shown to be associated with both BAV formation and concomitant aortopathy in mice and humans (p2 of Ref 25). 26 27 Moreover, a high heritability of BAV and/or other cardiovascular malformations has been demonstrated using segregation patterns in families, with the heritability of BAV alone, and of BAV together with other cardiovascular malformations, being as high as 89% and 75%, respectively. 28 A third, most likely, possibility is that both genetic factors and abnormal haemodynamic burden play central roles in BAV-associated aortopathy, interacting with each other and thereby contributing to aortic dilatation.
Further dissecting differences between patients with non-dilated and dilated ascending aortas, we found that, among others, PP, AS, AR and diabetes were associated with dilatation in BAV. PP is a well-known risk factor for CVD and a clinical manifestation of increased vascular stiffness. 29 Surprisingly, PP was higher in BAV patients with a non-dilated aorta, which may seem counterintuitive since AR is associated with both increased PP and aortic dilatation. However, it may be speculated that the higher PP seen in these patients may rely on structural changes due to an increased haemodynamic burden associated with a BAV. In line with this, we have previously shown that BAV patients have a qualitative collagen defect in their ascending aorta, signified by a different collagen glycation compared with TAV patients and suggestive of an altered non-enzymatic collagen crosslinking. 30 Interestingly, we have also shown that the dilated ascending aorta of BAV patients displays an increased collagen-related stiffness compared with TAV patients. 31 Furthermore, the Strong Heart Study showed that in patients free of prevalent coronary heart disease, aortic root dilatation was, at a given diastolic blood pressure and stroke volume, associated with lower PP. 32
In our study, angina pectoris appears to be more prevalent in individuals with non-dilated aortas. This observation should be read in light of the exclusion criterion of significant coronary artery stenosis, as defined in our study. Despite this exclusion criterion, angina is reported as a symptomatic manifestation. It is noteworthy that angina may be symptomatic of AS or left ventricular hypertrophy induced by AS. The latter condition is more frequently encountered in isolation rather than in combination with proximal aortic dilatation. Additionally, our findings indicate a higher prevalence of angina, as well as other cardiovascular conditions (such as stroke, previous myocardial infarction, etc), and of medication use in the group with TAVs, where the patients are older compared with the BAV group.
Additionally, accelerated vascular ageing and increased arterial stiffness have previously been described in patients with diabetes, the proportion of which was higher among BAV patients with non-dilated aortas. 33 However, we and others have previously demonstrated an increased vascular inflammation in dilated aorta in TAV but not BAV patients, suggesting that other mechanisms could be involved in the protective role of diabetes on aneurysm formation in BAV patients. 7 34 The association between dilatation and valve disease was not as pronounced in BAV as in TAV patients, although AS was also negatively associated with dilatation in BAV, as previously found. 35 It may be speculated that the presence of AS increases flow velocities and blood pressure in the ascending aorta, thereby stimulating vascular remodelling and strengthening of the aortic wall. Whether this hampers the process of dilatation remains to be answered. The relation between degree of stenosis and width of the ascending aorta is complex, and a previous study found mid-ascending dilatation proportional to valve gradient when patients with small aortas were excluded. 35
Our findings raise the issue of how to identify and implement prevention of aortopathy in BAV patients in a clinical setting. So far, clinicians have focused on aortic valve function and aortic dimensions to indicate cardiac surgery and recommend annual follow-up in asymptomatic patients to screen for associated aortopathy. 36 High importance has been given to the morphology of the valve, although in our study, dilatation in BAV did not show any significant association with valve morphology. Of note, previous studies establishing an association between BAV cusp fusion and clinical outcomes relied on small samples and on imaging-based rather than anatomical diagnosis. 37

Study strengths and limitations
In this study, comprehensive clinical data, including blood sampling as well as epidemiological data, were used in the analysis of a total of 1034 individuals (543 BAV and 491 TAV). The morphology of aortic valves was evaluated by visual inspection during open-heart surgery, which is a major strength compared with only echocardiography in terms of reliability.
A few limitations must however be highlighted. First, by design, only individuals devoid of significant coronary artery disease were included, which may introduce a selection bias and an overestimation of the prevalence of subjects with aneurysm. Second, as this is a population-based surgical cohort, it is possible that our study population included BAV and TAV patients with worse outcomes compared with their counterparts of similar age. This should however not affect the associations between valve type and aortic dilatation. Third, TAV patients with non-dilated aortas were significantly older than TAV individuals with a dilated aorta, which could possibly explain the higher proportion of patients with AS in this group. Lastly, our study is a single-centre and monoethnicity study. Therefore, our results may not be generalisable to other BAV/TAV populations.
To conclude, using automated machine learning algorithms and classic logistic regression models, we demonstrated that in TAV patients, cardiovascular risk profiles appear to be more predictive of aortopathy than in BAV patients. The good performance of the TAV classifier, also after exclusion of AS, offers important implications for better targeting TAV individuals who are at high risk of developing aneurysm. The lack of good models to develop clinical classifiers of BAV-associated aortopathy strengthens the focus on genetics and/or flow as important contributing factors to aneurysm development in these individuals.

Figure 2 The receiver operating characteristic (ROC) curve for the risk prediction model across the predictive methods used. BAV, bicuspid aortic valve; TAV, tricuspid aortic valve.

Table 1
Baseline characteristics of patients with bicuspid aortic valve (BAV) or tricuspid aortic valve (TAV) (dilated vs non-dilated ascending aorta).
χ² test was used for categorical variables; analysis of variance for normally distributed continuous variables; Kruskal-Wallis test for non-normal continuous variables. AAA, abdominal aortic aneurysm; BMI, body mass index; BSA, body surface area; hsCRP, high-sensitivity C reactive protein; LDL, low-density lipoprotein; STJ, sinotubular junction.

Table 2
Predictors of ascending aortic dilatation in bicuspid aortic valve (BAV) and tricuspid aortic valve (TAV) patients, separately.
*OR and 95% CI limits were obtained by logistic regression.