Abstract
Introduction The effectiveness of rotator cuff tear repair surgery is influenced by multiple patient-related, pathology-centred and technical factors, which are thought to contribute to the reported retear rates of between 17% and 94%. Adequate patient selection is thought to be essential in reaching satisfactory results. However, no clear consensus has been reached on which factors are most predictive of successful surgery. A clinical decision tool that encompasses all aspects has yet to be developed. Artificial intelligence (AI) and machine learning algorithms use complex self-learning models that can be used to build patient-specific decision-making tools. The aim of this study is to develop and train an algorithm that can be used as a clinical prediction tool, available online, to predict the risk of retear in patients undergoing rotator cuff repair.
Methods and analysis This is a retrospective, multicentre, cohort study using pooled individual patient data from multiple studies of patients who have undergone rotator cuff repair and were evaluated by advanced imaging for healing at a minimum of 6 months after surgery. This study consists of two parts. Part one: collecting all potential factors that might influence retear risks from retrospective multicentre data, aiming to include more than 1000 patients worldwide. Part two: combining all influencing factors into a model that can clinically be used as a prediction tool using machine learning.
Ethics and dissemination For safe multicentre data exchange and analysis, our Machine Learning Consortium adheres to the WHO regulation ‘Policy on Use and Sharing of Data Collected by WHO in Member States Outside the Context of Public Health Emergencies’. The study results will be disseminated through publication in a peer-reviewed journal. Institutional Review Board approval does not apply to the current study protocol.
- ORTHOPAEDIC & TRAUMA SURGERY
- Shoulder
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This study aims to calculate a patient-specific risk of retear after rotator cuff repair surgery.
Creating an online tool that predicts the risk of retear can help both medical professionals and patients in clinical decision-making on rotator cuff repair surgery.
Included data will be gathered from previously published databases of all authors included in the Machine Learning Consortium, aiming to include data from over 1000 patients.
This study does have the limitation of being retrospective and therefore the study is dependent on the recordkeeping of each individual hospital.
Introduction
Despite technical advances in rotator cuff repair, the rate of unhealed or re-torn rotator cuff tears remains high, with percentages ranging between 17% and 94%.1 A myriad of patient-related,2 pathology-centred3 and technical factors4 influence this adverse outcome.
Patient selection is thought to be essential; however, there is no consensus on which of the numerous potentially influential factors are most important for predicting satisfactory postoperative results.5 Furthermore, the value of preoperative optimisation of potentially influential patient-related factors, including comorbidities, metabolic deficiencies and intoxications, remains questionable. The increasing worldwide interest in these factors is reflected in the development of preoperative screening and optimisation programmes aiming for smoking cessation, diabetes control, statin use in hyperlipidaemia and supplementation for vitamin D deficiency.2 6 However, the majority of shoulder surgeons seem to limit decision-making to more basic, previously established predictive factors including age, functional demand and pathology-specific grading. Despite the many classification systems that have been developed to facilitate decision-making, a patient-specific decision tool is still lacking.7 8 This, in combination with the fact that existing research commonly evaluates a single treatment option between homogeneous groups, makes it almost impossible for surgeons to give a reliable preoperative estimate of the chance of satisfactory results.
Artificial intelligence and machine learning (ML) are believed to facilitate a more patient-specific approach and to allow a move to the next level of evidence-based medicine: personalised patient care. Clinical prediction tools that incorporate patient-specific factors to predict outcome probabilities will provide guidance to both clinicians and patients.9 10 Within orthopaedic (oncology) surgery, prediction tools based on ML algorithms have already been successfully implemented to predict patient-specific 5-year survival in patients with chondrosarcoma.11 Furthermore, based on a series of 422 patients undergoing lumbar discectomy, Staartjes et al demonstrated deep learning algorithms to be superior to standard regression models in predicting patient-reported outcome measures (PROMs).9
Aim of this study
The aim of this study is to develop and train an ML algorithm in order to create a clinical prediction tool for use in clinical practice that predicts the chance of rotator cuff retear as well as the chance of clinical improvement, based on preoperative patient data. The prediction tool will be freely available online.
Methods and analysis
This is a retrospective, multicentre, cohort study.
The primary and secondary outcome measures will be implemented as labels (prediction targets) for the prediction algorithm.
Primary outcome measures
Rotator cuff retear rates at minimum 6 months follow-up as measured on MRI, arthro-CT and/or ultrasound (yes vs no, defined by Sugaya grades 1–3 as no retear and grade 4–5 as retear12).
Enduring satisfactory functional outcome defined as achievement (yes vs no) and maintenance (yes vs no) of the PROM-specific minimal clinically important difference (MCID)13 in numeric rating scales of PROMs from baseline at 2–5 years follow-up after repair. PROMs include the Constant-Murley score, American Shoulder and Elbow Surgeons score (ASES), University of California at Los Angeles shoulder score (UCLA), Oxford Shoulder Score (OSS), Western Ontario Rotator Cuff index (WORC) and Disabilities of the Arm, Shoulder and Hand score (DASH).
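As an illustration of how the two primary outcomes could be turned into binary labels, a minimal sketch follows. It is written in Python for brevity, although the protocol names R for its analyses. The function names are hypothetical, the MCID value is passed in by the caller because no PROM-specific values are given here, and the `higher_is_better` flag is an assumption added to handle scores in which lower values indicate better function.

```python
def is_retear(sugaya_grade: int) -> bool:
    """Sugaya grades 1-3 count as no retear, grades 4-5 as retear."""
    if sugaya_grade not in (1, 2, 3, 4, 5):
        raise ValueError(f"Sugaya grade must be 1-5, got {sugaya_grade}")
    return sugaya_grade >= 4


def achieved_mcid(baseline: float, follow_up: float, mcid: float,
                  higher_is_better: bool = True) -> bool:
    """True when the change from baseline reaches the MCID in the direction of improvement."""
    change = follow_up - baseline
    if not higher_is_better:
        # For scores where a lower value is better, a decrease is an improvement.
        change = -change
    return change >= mcid
```

For example, `achieved_mcid(50, 70, mcid=15)` evaluates a 20-point gain against a caller-supplied threshold of 15 points; the threshold itself would have to come from the PROM-specific literature.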
Secondary outcome measures
Adverse events graded as none/minor versus moderate/severe complications, defined in accordance with Felsch et al.14 Adverse events classify as moderate/severe from Felsch class III onwards, which means that another surgical or radiological intervention or an unexpected hospital admission was necessary. Adverse events will be differentiated into three groups: infection, revision surgery or other.
Model development
The development of the prediction model will be performed based on the steps described by Steyerberg and Vergouwe15:
Data collection.
Data inspection.
Coding of predictors.
Model specification.
Model estimation and performance.
Model validation.
Model presentation.
Data collection
Step one will involve contacting authors of previously published studies in order to collect and combine their (raw) individual patient data into a central database. All randomised controlled trials comparing any surgical technique, add-on biological intervention or rehabilitation protocol concerning rotator cuff surgery will be included. In addition, cohorts evaluating risk factors of surgical techniques after rotator cuff repair will be included. This retrospective review will therefore incorporate patients with all types of tears and concomitant procedures (eg, biceps tenodesis or tenotomy and acromioclavicular resection). Exclusion criteria for all studies will be the lack of postoperative evaluation by ultrasound, contrast-enhanced CT or MRI at a minimum of 6 months after surgery, or a publication date before 2005. Relevant studies will be identified using a systematic approach, primarily searching the online PubMed database according to the search terms found in online supplemental file 1. As there is no gold standard for sample size or power calculations for prediction models, and we are fully dependent on contributed data, we aim to include at least 1000 patients worldwide.15
Problem definition and data inspection
All contributed data sets will be formatted into one central database. As data are commonly collected in .csv (eg, exported from Microsoft Excel) or .sav (SPSS) files, formatting will be performed with the dplyr package for R software. All raw data for the different variables will be reviewed separately for inaccuracies and other defects. This process will focus on harmonising possible inconsistencies in the collected data, for example, converting follow-up times into a standardised format such as ‘days after surgery’. Categorical data will be translated into English and corrected for typographical errors. Continuous variables will be screened for outliers by visualisation with the ggplot2 package. Impossible values and uninterpretable syntax errors will be excluded from the central database.
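The harmonisation step can be sketched as follows. This is a hedged illustration in Python rather than the dplyr and ggplot workflow named above, and every name in it is hypothetical. The conversion factors for months and years are approximations chosen for the example, and the Tukey fence rule is a numeric stand-in for the visual outlier screening described in the protocol, not the authors' stated method.

```python
from datetime import date
from statistics import quantiles

# Assumed conversion factors; month and year lengths are averages, not exact.
DAYS_PER_UNIT = {"days": 1.0, "weeks": 7.0, "months": 30.44, "years": 365.25}


def follow_up_in_days(value: float, unit: str) -> int:
    """Convert a follow-up duration reported in any supported unit to whole days."""
    return round(value * DAYS_PER_UNIT[unit.strip().lower()])


def days_after_surgery(surgery: date, visit: date) -> int:
    """Days elapsed between the surgery date and a follow-up visit."""
    return (visit - surgery).days


def flag_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences (quartiles -/+ k times the IQR)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

Flagged values would still need manual review: an age of 240 is impossible and should be excluded, whereas an unusually long follow-up may simply be rare.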
Coding of predictors
For each primary outcome, a logistic regression will be performed including all available variables in the central database to identify the variables with the highest predictive value. The available data points include patient demographics (sex, age), patient-specific factors (body mass index, dominance, sport/activity level, workers’ compensation), pathology-specific factors (eg, tear size and location), surgical technique and add-on interventions. For a complete overview of all variables see online supplemental file 2. The variables with the highest predictive value will be used as the algorithms’ features.
Missing data
As the main database will comprise data from multiple studies, we expect many cases of missing data. The approach to missing data will differ depending on the type of variable. Missing values in variables with less than 5% missing data will be replaced by imputation.16 Missing data on a surgical technique or add-on intervention are to be expected, as interventions outside the scope of a study would not be mentioned (or only briefly mentioned in the exclusion criteria). Therefore, this kind of missing data will be recoded as ‘No’. Overall availability of variables will be presented according to current guidelines.17 Any variance between hospitals will be reported.
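The two rules for missing data can be illustrated with a short sketch. It is written in Python for illustration only and all names are hypothetical. The protocol does not state which imputation method will be used, so the median imputation below is an assumption made purely to keep the example self-contained.

```python
from statistics import median


def missing_fraction(values):
    """Share of entries recorded as None."""
    return sum(v is None for v in values) / len(values)


def impute_if_sparse(values, max_missing=0.05):
    """Median-impute a numeric variable only when less than max_missing of it is missing.

    Median imputation is an assumed stand-in; the protocol does not name a method.
    Variables with no missing data, or with too much, are returned unchanged.
    """
    if not 0 < missing_fraction(values) < max_missing:
        return list(values)
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def default_intervention_to_no(values):
    """Recode an unreported surgical technique or add-on intervention as 'No'."""
    return ["No" if v is None else v for v in values]
```

Keeping the two rules in separate functions makes the distinction explicit: the first estimates an unknown value, whereas the second encodes the assumption that an unreported intervention was not performed.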
Model specification
Algorithms to be trained
Based on previous studies,18 19 the following algorithms are likely to result in accurate prediction models for our primary outcomes: (1) Bayes Point Machine, (2) Boosted Decision Tree, (3) Penalised Logistic Regression, (4) Neural Network and (5) Support Vector Machine. In order to recognise patterns related to each outcome, the ML algorithms will have to be trained separately for each outcome.
Model estimation and performance
Assessing the performance of the algorithms
The performance of the ML-algorithms will be assessed and compared based on (1) model discrimination; (2) calibration and (3) overall model performance (Brier Score) according to Steyerberg’s structured ‘ABCD-methodology’ for clinical prediction rules.15 20
The model’s predicted probability will be plotted against the actual observed probability to assess the calibration of a model. A perfectly calibrated model has a calibration intercept of 0 and a calibration slope of 1. The overall performance of the model will be assessed with the Brier Score. A perfect Brier Score, indicating total accuracy, is 0; the worst possible Brier Score is 1. Accuracy, sensitivity, specificity and area under the receiver operating characteristic curve will be the measures of a model’s ability to distinguish patients with the primary outcome from those without.
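To make these metrics concrete, the sketch below computes the Brier Score, the area under the receiver operating characteristic curve and sensitivity/specificity from a list of observed outcomes (0 or 1) and predicted probabilities. It is a plain Python illustration with hypothetical function names, not the study's analysis code, and the 0.5 threshold is an arbitrary default chosen for the example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Chance that a random case with the outcome gets a higher predicted risk
    than a random case without it; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Classify as positive when the predicted probability reaches the threshold."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    tn = sum(1 for y, p in pairs if y == 0 and p < threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

The two extremes of the Brier Score follow directly from the definition: predictions that exactly match the outcomes give 0, and predictions that are confidently wrong for every patient give 1.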
Model validation
Internal validation
Internal validation of our algorithms will be performed by 10-fold cross validation. This means that instead of dividing the main data set once into one training set and one testing set, the data set is randomly divided into 10 parts (folds); the model is trained on nine folds and tested on the remaining fold, this process is repeated 10 times so that each fold serves once as the testing set, and the results are averaged. The main advantage is that every individual patient record is used for both training and testing, although never within the same iteration, which gives a more reliable estimate of predictive performance and a lower chance of bias than a single split. The cross validation will be performed using the trainControl function from the caret package for R.
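The fold logic can be sketched in a few lines. The example below is a Python illustration with hypothetical names and is not the caret implementation referred to above; `fit_and_score` stands for any user-supplied function that trains a model on the training indices and returns a performance score on the testing indices.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists; every record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_records, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score(train, test) over all k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_records, k, seed)]
    return sum(scores) / len(scores)
```

With 1000 records and 10 folds, each iteration trains on 900 records and tests on the remaining 100, and no record appears in both sets within the same iteration.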
External validation
Before incorporating the best performing algorithm, we aim to have the algorithm externally validated. The same performance metrics could be calculated as described above. However, this would require collaboration with partners that have adequate data and are willing to share them. As no such agreements have yet been made, external validation is outside the scope of this study.
Model presentation
The best performing algorithm will be deployed as an open-access probability calculator and used to design a clinical decision rule. To simulate the clinical scenario to which a decision rule would be most applicable, thresholds shall be selected based on patients with clinical symptoms of a retear or with an unsatisfactory functional outcome.
Patient and public involvement
None.
Ethics and dissemination
For safe multicentre data exchange and analysis, our Machine Learning Consortium adheres to the WHO regulation ‘Policy on Use and Sharing of Data Collected by WHO in Member States Outside the Context of Public Health Emergencies’.21 As Institutional Review Board (IRB) approval has been acquired for each of the included studies and data are anonymised as in conventional meta-analyses, additional IRB approval is not required for the current study protocol. The technical appendix, statistical code and final data set will be published with the study results.
Current status
The study has currently entered the data-collection phase, which is expected to last until March 2023. Re-evaluation of the data using ML algorithms to predict outcomes will start in April 2023, after which the algorithms can be externally validated. The expected time for study completion is by late 2023.
Discussion
Due to the wide variety of pathological factors at the origin of rotator cuff tears and the numerous surgical approaches to repair, optimal decision-making remains challenging. Smaller case series often provide heterogeneous data on this topic; however, the largest and most recent meta-analysis to date, including 2611 patients with a mean follow-up of 25 months, has somewhat demystified the matter. Patients with a full-thickness rotator cuff retear exhibited significantly lower functional outcome scores and strength compared with patients with an intact or partially torn rotator cuff.22 This is corroborated by the findings of rotator cuff repair with more than 10 years of follow-up, showing clinical superiority of structural tendon integrity in partial cuff tears.23–25 Progressive osteoarthritic changes are significantly more common in patients with repair failures.24 The most recent randomised controlled trial comparing surgical repair to conservative treatment for degenerative rotator cuff tears showed that only operated patients without retear had an improvement exceeding the MCID in functional outcome at 1-year follow-up.26 Findings from the latest meta-analysis on this comparative topic conclude that, as the success rate of conservative treatment may be high, judicious selection of patients who are most likely to benefit from surgery is key.27 It is extremely difficult to combine all these factors into a clinical decision related to one specific patient. Creating a clinical prediction tool that is freely available online and takes all these factors into account will assist physicians in selecting which patients with rotator cuff tears will benefit from a repair. In addition, the intended size (more than 1000 patients) of the database that will be used to design and train the prediction tool might provide new insights into which biological or biomechanical factors influence outcomes after rotator cuff repair the most.
Awareness of these factors would be the essential first step to incorporating them in future treatment strategies and eventually improving outcomes. The main limitation of this study is that it is a retrospective, multicentre study. This means this study is dependent on the quality of recordkeeping in the different participating hospitals. This may lead to variance in recorded variables and therefore missing data.
Ethics statements
Patient consent for publication
Acknowledgments
Olimpio Galasso, Vivek Pandey, Mats Ranebo, Martyn Snow and Riccardo d’Ambrosi have contributed by providing relevant feedback on the general design of the study.
References
Footnotes
Collaborators Laurens J H Allaart, Sanne H van Spanning, Laurent Lafosse, Thibault Lafosse, Alexandre Lädermann, George S Athwal, Laurent A M Hendrickx, Job N. Doornberg, Michel P J van den Bekerom, and Geert Alexander Buijze
Contributors LJHA, SvS, GAB and MPJvdB contributed to the conception, overall design and planning of the study. LAMH and JND contributed to the conception and design of the methods section, primarily focusing on the machine learning section and data analysis. AL, GSA, TL and LL contributed to the design of the methods section and primarily focused on how the data should be collected and interpreted. LJHA, SvS, GAB and MPJvdB contributed to writing the protocol. All authors revised this version of the protocol and gave final approval for it to be published. All authors ensure that questions related to the accuracy or integrity of any part of this protocol are appropriately investigated and resolved.
Funding This research has received funding from the SECEC/ESSSE 2020 Research Grant as part of the project 'The Effect of Risk Factors, Surgical Technique and Biomodulation on Tendon Healing after Rotator Cuff Repair'.
Competing interests AL is a paid consultant for Arthrex, Medacta and Stryker. He receives royalties from Stryker. He is the founder of BeeMed, Med4Cast and FORE. He owns stock options from Medacta. LL is a consultant for Depuy Stryker and received royalties from Depuy. TL is a consultant for Depuy Mitek and Stryker. GAB received consultancy fees from Depuy-Synthes and Research Funds from SECEC, Vivalto Santé. The remaining authors certify that they have no funding or commercial associations that might pose a conflict of interest in connection with the submitted article.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.