Protocol
Clinical characteristics and outcomes of patients with post-stroke epilepsy: protocol for an individual patient data meta-analysis from the International Post-stroke Epilepsy Research Repository (IPSERR)
  1. Nishant K Mishra1,
  2. Patrick Kwan2,
  3. Tomotaka Tanaka3,
  4. Katharina S Sunnerhagen4,
  5. Jesse Dawson5,
  6. Yize Zhao6,
  7. Shubham Misra1,
  8. Selena Wang6,
  9. Vijay K Sharma7,
  10. Rajarshi Mazumder8,
  11. Melissa C Funaro9,
  12. Masafumi Ihara3,
  13. John-Paul Nicolo2,10,
  14. David S Liebeskind8,
  15. Clarissa L Yasuda11,
  16. Fernando Cendes11,
  17. Terence J Quinn5,
  18. Zongyuan Ge2,
  19. Fabien Scalzo8,12,
  20. Johan Zelano4,13,
  21. Scott E Kasner14
  22. International Post-Stroke Epilepsy Research Consortium (IPSERC)
    1. Department of Neurology, Yale University School of Medicine, New Haven, Connecticut, USA
    2. Department of Neuroscience, Monash University, Clayton, Victoria, Australia
    3. Department of Neurology, National Cerebral and Cardiovascular Center, Suita, Japan
    4. Department of Clinical Neuroscience, University of Gothenburg, Goteborg, Västra Götaland, Sweden
    5. Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, UK
    6. Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
    7. Division of Neurology, Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore
    8. Department of Neurology, University of California Los Angeles, Los Angeles, California, USA
    9. Harvey Cushing/John Hay Whitney Medical Library, Yale University, New Haven, Connecticut, USA
    10. Department of Neurology and Medicine, The University of Melbourne, Royal Melbourne Hospital, Melbourne, Victoria, Australia
    11. Department of Neurology, School of Medical Sciences, University of Campinas – UNICAMP, Campinas, SP, Brazil
    12. Department of Computer Science, Pepperdine University, Seaver College, Malibu, California, USA
    13. Department of Neurology, Sahlgrenska University Hospital, Goteborg, Sweden
    14. Department of Neurology, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
    Correspondence to Professor Nishant K Mishra; nishant.mishra@yale.edu

    Abstract

    Introduction Despite significant advances in managing acute stroke and reducing stroke mortality, preventing complications like post-stroke epilepsy (PSE) has seen limited progress. PSE research has been scattered worldwide with varying methodologies and data reporting. To address this, we established the International Post-stroke Epilepsy Research Consortium (IPSERC) to integrate global PSE research efforts. This protocol outlines an individual patient data meta-analysis (IPD-MA) to determine outcomes in patients with post-stroke seizures (PSS) and to develop and validate PSE prediction models, comparing them with existing models. It also describes the creation of the International Post-stroke Epilepsy Research Repository (IPSERR) to support future collaborative research.

    Methods and analysis We used a comprehensive search strategy and searched MEDLINE, Embase, PsycInfo, Cochrane, and Web of Science databases until 30 January 2023. We extracted observational studies of stroke patients aged ≥18 years, presenting early or late PSS with data on patient outcome measures, and conducted the risk of bias assessment. We did not apply any restriction based on the date or language of publication. We will invite these study authors and the IPSERC collaborators to contribute IPD to IPSERR. We will review the IPD lodged within IPSERR to identify patients who developed epileptic seizures and those who did not. We will merge the IPD files from the individual studies and standardise the variables where possible for consistency. We will conduct an IPD-MA to estimate the prognostic value of clinical characteristics in predicting PSE.

    Ethics and dissemination Ethics approval is not required for this study. The results will be published in peer-reviewed journals. This study will contribute to IPSERR, which will be available to researchers for future PSE research projects. It will also serve as a platform to anchor future clinical trials.

    Trial registration number NCT06108102

    • Stroke
    • Epilepsy
    • Systematic Review
    • Patient Reported Outcome Measures
    • Prognosis

    This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

    STRENGTHS AND LIMITATIONS OF THIS STUDY

    • This individual patient data meta-analysis (IPD-MA) attempts to characterise the outcomes and predictors in post-stroke epilepsy (PSE).

    • Using an IPD-MA approach, we will conduct more accurate and comprehensive analyses and explore interactions between covariates and outcomes.

    • The International Post-stroke Epilepsy Research Repository (IPSERR) aims to standardise data collection, define common elements, and serve as a framework for future PSE research.

    • Inconsistencies across studies regarding which variables are reported and how they are reported may limit our ability to assess the impact of potential predictors and outcome measures.

    • There may be some inclusion bias if we cannot obtain IPD for all eligible studies.

    Introduction

    Cerebrovascular disease accounts for approximately 50% of epilepsies in older adults.1 After a stroke, some individuals’ brains undergo active epileptogenic processes that eventually lead to seizures.2–4 Post-stroke epilepsy (PSE) is associated with increased morbidity, including cognitive decline, dependence and poor quality of life, and is a critical determinant of stroke prognosis.5–10 Unfortunately, no proven antiepileptogenic drugs are available to inhibit the post-stroke epileptogenic process.11–15 Because the research related to PSE is scattered worldwide, there is a critical need to integrate these efforts. We, therefore, founded the International Post-stroke Epilepsy Research Consortium (IPSERC).16 IPSERC is a global community of researchers with varied (and complementary) expertise that aims to conduct adequately powered studies to understand and prevent PSE.16 We aim to collate and categorise data reported by PSE researchers and thus further the IPSERC’s mission, that is, to bring together, under one umbrella, large PSE patient datasets which would otherwise remain dormant within research groups scattered across the world. Therefore, we are building a repository of PSE patient data: the International Post-stroke Epilepsy Research Repository (IPSERR). IPSERR will consider all aspects of PSE research and record and house retrospective and prospective data.

    We recently reported a meta-analysis of 71 studies, including 20 110 patients with post-stroke seizures (PSS) and 1 166 085 patients without. We tested the association of outcomes, including mortality, poor functional outcomes, disability, recurrent stroke, and dementia risk in patients with PSS compared with patients without PSS. We found that patients with PSS suffer from greater mortality risk (OR 2.1; CI 1.8 to 2.4), poor functional outcomes (OR 2.2; CI 1.8 to 2.8), greater disability (SMD 0.6; CI 0.4 to 0.7), and increased dementia risk (OR 3.1; CI 1.3 to 7.7).17 We, however, observed disparate methods used by the individual investigators. We also noted variations in study reporting. We could not determine the response to antiseizure medications and the risk of drug resistance in PSS patients. Epilepsy outcomes were also not reported. We propose an individual patient data meta-analysis (IPD-MA) to tackle these limitations.

    We report the design of IPSERR in which we will lodge data on PSE patients to support collaborative research. Using the data lodged within the IPSERR, we will standardise data collection, define common data elements and outcome measures for PSE research and develop criteria for standardised reporting of PSE research. We will apply the IPD approach to conduct a meta-analysis by inviting the authors of the 71 studies we investigated in our meta-analysis to share their data.17 We will also invite IPSERC collaborators to share prospective and retrospective data from their centres. IPD analyses have significant merits compared with aggregate data meta-analysis: one can collect original, published or unpublished data from the eligible primary studies, use a consistent unit of analysis, and assess interactions between covariates and outcomes. Our goal is to build IPSERR and conduct an IPD-MA using the data lodged within IPSERR. The primary objectives of the IPD-MA are (1) to determine epilepsy, functional, and cognitive outcomes in patients who develop PSS and (2) to build and validate PSE prediction models and compare their performance against existing models. This protocol outlines the design of the IPD-MA.

    Methods and analysis

    Study design

    This IPD meta-analysis will adhere to the Preferred Reporting Items for Systematic Review and Meta-Analyses of individual participant data (PRISMA-IPD) guidelines.18 This protocol was written in accordance with the PRISMA-Protocol (PRISMA-P) statement.19 We have pre-registered this IPD meta-analysis on PROSPERO.20 We have registered the protocol for IPSERR on clinicaltrials.gov (NCT06108102).

    Systematic review to identify eligible papers

    Eligibility criteria

    Types of studies

    We will include observational studies of cohort and case-control design.

    Types of participants

    We will include studies of stroke patients aged ≥18 years, with ischaemic or haemorrhagic stroke, presenting early or late PSS with data on patient outcome measures. In addition, we will include only studies published on human subjects and will not apply any restriction based on the date or language of publication, gender or ethnicity. We will exclude studies of patients with a prior history of seizures before the index stroke, studies that did not report outcome data or studies that are not able to share IPD.

    Types of exposures

    Presence of stroke, either ischaemic or haemorrhagic (intracerebral and subarachnoid).

    Types of comparators

    Studies include patients without seizure/epilepsy post-stroke.

    Types of outcome measures

    We will collect the following patient outcome measures: (1) seizure frequency; seizure severity (e.g., impaired awareness (yes/no), bilateral tonic-clonic (yes/no), occurrence or frequency of status epilepticus and hospitalisation); (2) aspects of outcome beyond seizure frequency (core outcome set), for example, patient-reported quality of life, cognitive function, treatment side effects, response to antiseizure medication, patient independence, mood (depression/anxiety), felt stigma and economic cost; (3) modified Rankin Scale (mRS) score; causes of mortality; National Institutes of Health Stroke Scale (NIHSS), Glasgow Coma Scale (GCS); home time; and length of hospital stay. Although the existing and previous datasets may not have epilepsy outcome data, we will invite the collaborators to collect them in the prospectively enrolled patient sample.

    Types of predictor/moderator variables

    The investigators will provide deidentified patient data on valid predictors of PSE. We will collect the following predictor/moderator variables data from individual studies: stroke subtype (ischaemic, haemorrhagic); stroke mechanism; stroke location; date of the first stroke of the patient; date of subsequent stroke of the patient; demographic characteristics (age, sex, race, Body Mass Index); systolic and diastolic blood pressure; vascular risk factors (hypertension, diabetes mellitus, hyperlipidaemia, myocardial infarction, atrial fibrillation, peripheral arterial disease, heart failure with reduced ejection fraction, previous stroke, smoking, alcohol, other recreational drugs); history of depression, anxiety, and dementia; comorbidities (chronic liver disease, chronic kidney disease, malignancies); antithrombotic medications; antihypertensive medications; statin use; presence of seizure; seizure subtypes; recurrent seizures; time to first seizure onset post-stroke; antiseizure medications; follow-up duration; laboratory investigations (glucose, HbA1c, haemoglobin, WBC, platelets, INR, lipid profile, blood urea, creatinine, estimated glomerular filtration rate, sodium, potassium, C-reactive protein, procalcitonin); acute infection data; cognitive impairment data; imaging data; EEG data; tPA/endovascular eligibility; tPA data; endovascular therapy data; haemorrhagic transformation; any surgery done; superficial siderosis; microbleeds; ICH score; infarct volume; haematoma volume; intraventricular extension; midline shift; herniation; and hydrocephalus.

    Timing of outcome measures

    Studies will be stratified into those reporting seizures ≤1 week and those reporting seizures after 1 week to record the time of outcome assessment, as this may be recorded differently between the studies. The outcome measures will be recorded for all participants at the last patient follow-up, regardless of variation in follow-up duration across studies.
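
    The early/late stratification can be illustrated with a minimal sketch. It assumes that seizure timing is available as the number of days between the index stroke and the first seizure; the function and parameter names are hypothetical and are not IPSERR variable names.

```python
from typing import Optional

EARLY_WINDOW_DAYS = 7  # the "≤1 week" threshold stated in the protocol


def classify_seizure_timing(days_after_stroke: Optional[float]) -> str:
    """Label a first post-stroke seizure as 'early', 'late', or 'none'.

    days_after_stroke is a hypothetical field: days from the index stroke
    to the first seizure, or None when no seizure was recorded.
    """
    if days_after_stroke is None:
        return "none"
    if days_after_stroke < 0:
        # A negative value means the seizure preceded the index stroke.
        raise ValueError("seizure recorded before the index stroke")
    return "early" if days_after_stroke <= EARLY_WINDOW_DAYS else "late"
```

    A seizure on day 7 falls in the early stratum and a seizure on day 8 in the late stratum; how individual studies measured onset time may differ and would need checking with the contributors.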

    Search for study identification and selection

    We utilised a comprehensive search strategy and searched MEDLINE, Embase, PsycInfo, Cochrane, and Web of Science databases for eligible observational studies. In addition, we used the following MeSH or free text terms for searching the relevant articles: ‘seizure’, ‘epilepsy’, ‘convulsions’, ‘epileptogenesis’, ‘late-onset’, ‘early-onset’, ‘stroke’, ‘ischaemic stroke’, ‘cerebral ischaemia’, ‘haemorrhagic stroke’, ‘intracerebral haemorrhage’, ‘prognosis’, ‘outcomes’, ‘mortality’, and ‘cardiovascular’. The detailed search strategy is reported in online supplemental file 1. We reviewed the references in the studies and review articles dedicated to this subject. Finally, we pooled the results in EndNote, deduplicated them and uploaded them to the Covidence software for screening.

    We conducted the last search for this IPD-MA on 30 January 2023. Seven reviewers independently performed the title and abstract screening. Subsequently, we screened the full-text articles for inclusion. We resolved conflicts through discussion and consultation with the chief investigator (NKM).

    Quality assessment

    Two independent authors will assess the methodological quality of the studies included in this systematic review using the Joanna Briggs Institute (JBI) tool.21 We will use data from only the published papers to conduct the risk of bias assessment to be consistent across all studies that can share data and those that cannot.

    IPD data collection and aggregation

    Invitation to authors

    For this IPD-MA, the corresponding authors of all the eligible studies and the IPSERC members will be emailed and invited to share their data. The invitation letter includes details regarding the IPSERR, the study proposal, the objectives of the IPD-MA, variables, outcome measures of interest, and a request to participate in the IPSERC and share their IPD for the IPSERR. We will make a second attempt if the corresponding author fails to respond within 4 weeks. If this second attempt is unsuccessful, we will contact another senior author. We will repeat this process until we have reached at least three authors. If none of the authors respond or if they indicate that the data is unavailable or inaccessible due to restrictions, we will make a note that the study data is unavailable. We will send a maximum of four reminders before considering the study unavailable. We will provide co-authorship to each collaborator that shares their IPD for the IPSERR.

    Data checking and integrity

    After accepting the invitation to participate and share the IPD, the participating centres will obtain legal approvals for human subject research as per the requirements of their jurisdiction. The participating authors will share their data via a secure data transfer platform. We will use the Yale Centre for Research Computing (YCRC) resources to manage the IPD. We will request deidentified data from the authors and store it in password-protected files on Yale University’s REDCap, secured behind the university firewalls.

    We will cross-check the basic descriptions of the variables with the published reports to ascertain the reliability of the provided data. We will contact the study authors for clarification if any inconsistencies are identified, such as missing data or extreme values. To determine whether the data can be combined, we will conduct a clinical review of the data provided. For this, we will review the patient demographics, risk factors, outcome measures, and timing of outcome measures (length of follow-up). We will evaluate the histograms of variables of interest from each dataset to evaluate the measures of central tendency and data skewness. We will also test if these properties are consistent across the studies. More specifically, we will focus on the consistency of the basic demographics and disease spectrum across the participating studies.

    Database creation and data aggregation

    We will use REDCap to generate a spreadsheet template containing study characteristics, predictor variables, and outcome data. We will complete the coding for the database on receiving all data from the study authors. In cases where a study’s coding deviates significantly from the template, two researchers will work together to reach a consensus on recoding, seeking clarification from the authors and data contributors when necessary. If discrepancies arise, a third team member will be consulted. After the data have been verified and standardised, they will be combined into the final analysis file. All individual study datasets will be merged to create a comprehensive IPD dataset. Once the data are merged, a researcher from the study team will recheck their accuracy.
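
    The recoding and merging step might look like the following sketch. It assumes each contributing study comes with a codebook that maps its local column names and value codes to a shared schema; the study identifiers, column names, and codes shown here are invented for illustration and do not come from IPSERR.

```python
from typing import Any, Dict, List

# Hypothetical per-study codebooks: local column names and value codes
# mapped to a shared schema. None of these names come from IPSERR.
CODEBOOKS: Dict[str, Dict[str, Any]] = {
    "study_a": {
        "columns": {"AGE": "age_years", "SEX": "sex", "SZ": "seizure"},
        "values": {"sex": {"1": "male", "2": "female"},
                   "seizure": {"0": False, "1": True}},
    },
    "study_b": {
        "columns": {"age": "age_years", "gender": "sex", "pss": "seizure"},
        "values": {"sex": {"M": "male", "F": "female"},
                   "seizure": {"no": False, "yes": True}},
    },
}


def harmonise(study_id: str, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rename columns, recode values, and tag every row with its study."""
    book = CODEBOOKS[study_id]
    merged = []
    for row in rows:
        out: Dict[str, Any] = {"study": study_id}
        for local_name, value in row.items():
            name = book["columns"].get(local_name)
            if name is None:
                continue  # variable is not part of the common schema
            recode = book["values"].get(name)
            # Codes missing from the codebook become None so that they can
            # be queried with the contributing authors.
            out[name] = recode.get(str(value)) if recode else value
        merged.append(out)
    return merged


ipd = harmonise("study_a", [{"AGE": 71, "SEX": 2, "SZ": 1}]) + harmonise(
    "study_b", [{"age": 64, "gender": "M", "pss": "no"}]
)
```

    Keeping the study identifier on every row preserves the clustering needed later for the multilevel models.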

    Statistical analysis

    Individual patient data meta-analysis

    To determine the outcomes and predictors of PSE, we will conduct one-stage and two-stage random-effects IPD-MA using R. Our primary IPD-MA method will be the one-stage random-effects approach due to its ability to accommodate more advanced modelling of predictors/moderators. Furthermore, a one-stage approach is recommended for dichotomous outcomes with rare events such as PSE.22 Statistical heterogeneity will be quantified using I². An I² value of 0% indicates no heterogeneity, 25% indicates low heterogeneity, 50% indicates moderate heterogeneity, and 75% indicates high heterogeneity.23 Considering the clinical heterogeneity observed among the included studies, such as differences in the study population, variable length of follow-up, and variable seizure definitions, we selected a random-effects meta-analysis to account for potential statistical heterogeneity. We will use the DerSimonian and Laird estimator24 to combine the results and apply the Hartung-Knapp-Sidik-Jonkman correction to address the uncertainty.25 The results will be pooled as Odds Ratios (OR), Hazard Ratios (HR), or Standardised Mean Differences (SMD) with 95% Confidence Intervals (CI).
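
    For the two-stage analysis, the pooling step can be sketched as follows. This is a minimal pure-Python illustration (the protocol specifies R), assuming each study contributes one effect estimate, such as a log odds ratio, together with its variance. The function name and the keys of the returned dictionary are hypothetical.

```python
import math
from typing import Dict, List


def pool_random_effects(effects: List[float], variances: List[float]) -> Dict[str, float]:
    """DerSimonian-Laird pooling with I-squared and the HKSJ standard error."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # DL between-study variance
    i2 = 100.0 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    # HKSJ variance; pair it with a t quantile on k - 1 degrees of freedom
    # (not computed here) to form the confidence interval.
    var_hksj = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, effects)) / (
        (k - 1) * sum(w_re)
    )
    return {"pooled": pooled, "tau2": tau2, "i2": i2, "se_hksj": math.sqrt(var_hksj)}
```

    Effects pooled on the log scale (OR, HR) would be exponentiated after pooling. The one-stage model, which is the primary analysis, is fitted directly to patient-level data and is not represented by this sketch.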

    Aggregate data meta-analysis

    We will examine the potential for inclusion bias by reporting the characteristics of eligible studies for which IPD was requested but not obtained. If the desired results are available for these studies, we will conduct a two-stage meta-analysis using R and incorporate these results with those where IPD was obtained. Publication bias will be estimated using funnel plots and quantitatively assessed using Egger’s regression test.
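
    Egger’s test regresses each study’s standardised effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below, in plain Python, returns the intercept and its t statistic under that formulation. The function name is hypothetical, and the p-value, which needs a t distribution with n − 2 degrees of freedom, is left out.

```python
import math
from typing import List, Tuple


def egger_intercept(effects: List[float], std_errors: List[float]) -> Tuple[float, float]:
    """Return (intercept, t statistic) from Egger's regression."""
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching standard errors")

    x = [1.0 / se for se in std_errors]                   # precision
    y = [e / se for e, se in zip(effects, std_errors)]    # standardised effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)                            # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("inf")
    return intercept, t_stat
```

    The test has low power when few studies are available, so its result is usually read alongside the funnel plot.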

    Regression-based prediction models

    When building PSE and outcome prediction models, we will use a multilevel mixed-effects model, using ‘study’ as one of the levels. Prognostic covariates will be added to the model based on statistical consideration; however, if we encounter a situation where many possible covariates could be added, we will only incorporate the clinically relevant covariates. We will build prediction models using the conventional forward/backward logistic regression multivariable analysis. We will test the degree of multicollinearity between the clinical covariates using the Variance Inflation Factor (VIF). We will remove highly correlated variables with VIF value >2.5 from the final model.
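
    For reference, the VIF of covariate j is computed from the coefficient of determination obtained by regressing that covariate on all the other covariates, so the 2.5 cut-off corresponds to a fixed share of explained variance:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j > 2.5 \iff R_j^2 > 1 - \frac{1}{2.5} = 0.6
```

    In other words, a covariate would be dropped when more than 60% of its variance is explained by the remaining covariates.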

    Artificial intelligence-based prediction models

    Machine learning approaches readily integrate many features, in contrast to conventional predictive models that only employ a small number of variables for computation.26 A few studies have developed prediction models, including the SeLECT27 and CAVE28 scores, using conventional logistic regression to predict epilepsy after ischaemic and haemorrhagic strokes. However, machine learning could potentially improve on the limited sensitivity and specificity of SeLECT29 and CAVE. We will use four machine learning algorithms to build prediction models, namely support vector machine,30 random forest,31 deep neural network,26 and gradient tree boosting,32 alongside the conventional logistic regression models. We will include the clinical predictor variables in the machine learning prediction models. The machine learning models will be trained with all clinical variables as inputs, and the study population will be classified into (a) patients with PSE and patients without PSE for identifying the predictors associated with PSE and (b) patients with PSE having poor outcomes (epilepsy outcomes, functional outcomes, mortality, etc.) and patients without PSE having poor outcomes.

    Specifically, for the deep neural network model, we will construct a design that includes a dropout layer. This approach is intended to facilitate the modelling of the Permutation Feature Importance, a metric that quantifies the increase in the model’s prediction error after the feature’s values are permuted, thereby disrupting the relationship between the feature and the true outcome. This strategy can help identify and isolate the most impactful features, contributing to the overall performance and accuracy of the prediction model.
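
    Permutation Feature Importance can be computed for any fitted model, as the following model-agnostic sketch shows. It assumes a binary classifier exposed as a plain function and uses misclassification rate as the error measure; the toy threshold rule stands in for the trained network, and all names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(
    predict: Callable[[Sequence[float]], int],
    X: List[List[float]],
    y: List[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean increase in misclassification rate after shuffling each feature."""
    rng = random.Random(seed)

    def error(rows: List[List[float]]) -> float:
        return sum(predict(r) != t for r, t in zip(rows, y)) / len(y)

    baseline = error(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(error(permuted) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy stand-in model: uses only feature 0 and ignores feature 1.
def toy_model(row: Sequence[float]) -> int:
    return int(row[0] > 0.5)


X_demo = [[float(i % 2), float(i)] for i in range(8)]
y_demo = [i % 2 for i in range(8)]
scores = permutation_importance(toy_model, X_demo, y_demo)
```

    In this toy case the ignored feature receives an importance of exactly zero, while shuffling the feature the model relies on raises the error.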

    Model comparison

    We will assess the performance of the various prediction models built using machine learning algorithms or conventional logistic regression, using the receiver operating characteristic (ROC) curves. We will conduct a full model comparison between the machine learning and conventional regression models. We will calculate the area under the curve (AUC) for each model, and the best cut-off point will be determined using the maximum value of the Youden Index (sensitivity+specificity−1). We will calculate each prediction model’s sensitivity, specificity, positive predictive value, negative predictive value, and likelihood ratio. The AUC between various prediction models will be compared using the Hanley-McNeil test.33 The goodness of fit between the prediction models will be assessed using the Likelihood Ratio (LR) test.34 The selection of the best prediction model will be made using Akaike’s Information Criteria (AIC)35 and Bayesian Information Criteria (BIC).36 AIC and BIC are methods for scoring and selecting a prediction model that best fits the dataset after correcting/penalising the model complexity, that is, adding a penalty value for the number of added parameters in the complex model compared with the simpler model. The model with the lower AIC or BIC value will be considered the best prediction model for the dataset. A p-value <0.05 will be regarded as statistically significant in the multivariable analysis.
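
    The AUC and the Youden-based cut-off can be illustrated with a short sketch. It assumes predicted risk scores and binary labels (1 for PSE, 0 for no PSE) and classifies a patient as positive when the score is at or above the threshold; the function names are hypothetical.

```python
from typing import List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 1/2)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = sensitivity + specificity - 1."""
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_threshold, best_j = scores[0], -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        tn = sum(1 for s, l in zip(scores, labels) if s < t and l == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keeps the lowest threshold when several tie
            best_threshold, best_j = t, j
    return best_threshold, best_j
```

    Keeping the lowest threshold on ties favours sensitivity; a different tie-breaking rule would be equally defensible.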

    Sensitivity analysis

    We will test the robustness of our findings by reanalysing our main results using a two-stage random-effects IPD-MA approach. We will examine and compare the differences between one-stage and two-stage IPD-MA approaches.37 We will conduct a sensitivity analysis to assess the various sources of heterogeneity, including study-level characteristics such as follow-up duration, risk of bias, publication year, and study country. We will assess the quality of evidence using the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) methodology.38 39

    Validation

    We will validate the IPD datasets by conducting internal and external validation of the prognostic models. For this, we will perform bootstrapping, a method that is less susceptible to bias and results in more stable model development. For external validation, we will conduct data and attribute distribution analysis.

    For external validation, we will use one or two datasets reserved for this purpose and conduct fivefold or tenfold cross-validation.
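
    The fold construction behind k-fold cross-validation can be sketched as below. This assumes a simple patient-level split with shuffled indices; whether folds should instead follow study boundaries is a design decision the sketch does not settle, and the function name is hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k disjoint, roughly equal folds."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of records")
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # every record lands in one fold
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

    Each record is used for testing exactly once, and the model is refitted on the remaining folds at every iteration.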

    Missing data

    We will evaluate our datasets for the extent of missing data and whether the data are missing purely at random or whether there is an explanation. If the datasets contain missing data and no explanation is available, we will assume the data to be missing at random. We will use the ICE multiple imputation method to impute missing data.40
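
    Multiple imputation produces several completed datasets, and the estimates from each must then be combined. The protocol does not state the combination rule; the sketch below assumes Rubin’s rules, the usual choice, and the function name is hypothetical.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Combine one estimate per imputed dataset; return (estimate, standard error)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance across imputations (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)
```

    The between-imputation term widens the standard error to reflect the uncertainty introduced by the missing values.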

    Patient and public involvement

    Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

    Ethics and dissemination

    This article describes a study protocol for an individual participant data meta-analysis. Ethics approval is not required for this study. The results will be published in peer-reviewed journals. This study will contribute to IPSERR, which will be available to researchers for future PSE research projects. It will also serve as a platform to anchor future clinical trials.

    Discussion

    This IPD-MA will provide precise and reliable estimates for investigating the outcomes and predictors in PSE. The IPD lodged within the IPSERR will accelerate PSE research because it will give the PSE research community an extensive database for hypothesis testing and a framework to anchor future prospective studies. By integrating data from various sources, standardising data collection and developing prediction models, this study aims to provide valuable insights into PSE outcomes and facilitate the development of effective interventions and preventive strategies. The findings of this IPD-MA will contribute to advancing knowledge in PSE research and support evidence-based decision-making for improved patient care.

    This is the first IPD-MA that attempts to characterise the outcomes and predictors in PSE. The strengths of our study include more accurate and comprehensive analyses and the ability to explore interactions between covariates and outcomes. An IPD-MA enables us to analyse data at the individual level rather than relying on study-level information. Our study is conducted under the umbrella of IPSERC, which ensures the inclusion of large and diverse patient datasets that would otherwise remain scattered and underutilised. Through the IPSERR, we aim to standardise data collection, define common data elements, and establish criteria for reporting PSE research. This ensures consistency across studies and enhances the quality and reliability of our findings. Our study includes a wide range of outcome measures, providing a holistic understanding of the impact of PSE on patients. Our study will determine a reasonable duration of patient follow-up for a future antiepileptogenesis trial. Our study aims to build prediction models for PSE and validate and compare their performance against existing models. This ensures the reliability and generalisability of our predictions and helps identify the most accurate and practical models for clinical decision-making.

    However, the process and interpretation of an IPD-MA may be impacted by several factors. Inconsistencies across studies regarding which variables are reported and how they are reported can limit our ability to assess the impact of potential predictors and outcome measures. Additionally, there may be some inclusion bias if we cannot obtain IPD for all eligible studies. However, we will address this issue through sensitivity analyses that include published aggregate data whenever possible. The included studies may vary in terms of their design and population characteristics. This heterogeneity may impact the generalisability and comparability of our findings. We will conduct sensitivity and subgroup analyses to explore and account for potential sources of heterogeneity. Considering these limitations and employing appropriate analytical approaches, we aim to mitigate potential biases and provide comprehensive and reliable results. Once IPSERR and the common data elements are established, adding prospective data, collected with the standard data elements incorporated in the IPSERR, will create many more possibilities to study PSE, for example, anchoring future clinical trials to test drug safety and efficacy on this framework.

    Ethics statements

    Patient consent for publication

    Acknowledgments

    The IPSERC executive members include Nishant K Mishra, Patrick Kwan, Alon Friedman, Jerome Engel Jr, Vijay K Sharma and Jacqueline A French.

    References

    Footnotes

    • Twitter @nkmishramd

    • Collaborators International Post Stroke Epilepsy Research Consortium (IPSERC): Nishant K Mishra (Department of Neurology, Yale University School of Medicine, New Haven, CT, USA (nishant.mishra@yale.edu)), Patrick Kwan (The AIM for Health, Faculty of IT, Monash University, Australia (patrick.kwan@monash.edu)), Alon Friedman (Department of Brain and Cognitive Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel (alonf@bgu.ac.il), Department of Medical Neuroscience, Dalhousie University, Halifax, Canada (alon.friedman@dal.ca)), Jerome Engel (Department of Neurology, University of California Los Angeles, Los Angeles, USA (JEngel@mednet.ucla.edu)), Vijay K Sharma (Yong Loo Lin School of Medicine, National University of Singapore and Division of Neurology, National University Health System, Singapore (drvijay@singnet.com.sg)), and Jacqueline A French (Department of Neurology, NYU Grossman School of Medicine, New York City, USA (jacqueline.french@nyulangone.org))

    • Contributors NKM conceptualised the study. NKM and SM drafted the study protocol. NKM, SM, TT, MCF and JZ contributed to the original data acquisition. YZ, SW, FS and ZG provided inputs in drafting the machine learning analyses plan for the study. PK, TT, KSS, JD, YZ, SM, SW, VKS, RM, MCF, MI, JPN, DL, CLY, FC, TQ, ZG, FS, JZ and SEK critically reviewed the protocol for any methodological concerns or missing intellectual content. NKM is the guarantor of the review. All authors read and approved the final version of the manuscript.

    • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

    • Competing interests None declared.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.