Protocol
Integrated Module of Multidimensional Omics for Peripheral Biomarkers (iMORE) in patients with major depressive disorder: rationale and design of a prospective multicentre cohort study
  1. Yuzhen Zheng1,
  2. Linna Zhang1,
  3. Shen He2,
  4. Zuoquan Xie3,
  5. Jing Zhang4,
  6. Changrong Ge4,
  7. Guangqiang Sun4,
  8. Jingjing Huang2,5,
  9. Huafang Li2,5,6
  1. Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  2. Department of Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  3. State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  4. Shanghai Green Valley Pharmaceutical Co Ltd, Shanghai, China
  5. Clinical Research Center for Mental Health, Shanghai Mental Health Center, Shanghai, China
  6. Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Correspondence to Prof Huafang Li; lhlh_5{at}163.com; Dr Jingjing Huang; jjhuang_att{at}163.com

Abstract

Introduction Major depressive disorder (MDD) represents a worldwide burden on healthcare, and the response to antidepressants remains limited. Systems biology approaches have been used to explore precision therapy. However, no clinically reliable biomarker for prognostic prediction exists at present. The objectives of the Integrated Module of Multidimensional Omics for Peripheral Biomarkers (iMORE) study are to predict the efficacy of antidepressants by integrating multidimensional omics and to validate the predictions in a real-world setting. As secondary aims, a series of potential biomarkers are explored for biological subtypes.

Methods and analysis iMORE is an observational cohort study with a multistage design in patients with MDD in China. The study is performed by three mental health centres and comprises an observation phase and a validation phase. A total of 200 patients with MDD and 100 healthy controls are enrolled. The protocol-specified antidepressants are selective serotonin reuptake inhibitors and serotonin–norepinephrine reuptake inhibitors. Clinical visits (baseline, 4 and 8 weeks) include psychiatric rating scales for symptom assessment and biospecimen collection for multiomics analysis. Participants are divided into responders and non-responders based on treatment response (>50% reduction in Montgomery-Asberg Depression Rating Scale score). Antidepressant responses are predicted and biomarkers are explored using a supervised learning approach that integrates metabolites, cytokines, gut microbiomes and immunophenotypic cells. The accuracy of the prediction models constructed is verified in an independent validation phase.

Ethics and dissemination The study was approved by the ethics committee of Shanghai Mental Health Center (approval number 2020-87). All participants must provide written informed consent before study entry. Study findings will be published in peer-reviewed journals.

Trial registration number NCT04518592.

  • depression & mood disorders
  • biotechnology & bioinformatics
  • adult psychiatry

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • Due to the complexity and heterogeneity of depression, the multidimensional systems biology approach in this study may help early identification of antidepressants with potential response, reducing unnecessary drug exposure.

  • Based on the biomarkers discovered in this study, a network of dynamic treatment response is better understood and subsequent clinical trials will be performed for further developments.

  • The lack of randomised treatment assignment may introduce confounding effects that influence the results.

  • The short follow-up duration may limit our ability to observe long-term predictors of treatment response.

Introduction

Major depressive disorder (MDD) is a common and chronic mental disorder, affecting approximately 6% of the global population annually.1 MDD, characterised by low mood and anhedonia, remains a heavy societal burden2 3 that contributes to functional disability, decreased quality of life, occupational impairment and mortality risk.4–7 The 2019 Global Burden of Disease Study estimates that MDD accounts for 1.47% of global disability-adjusted life years, an increase of 15.5% since 2010.8

Antidepressants are currently the most common treatment for MDD, and their selection relies primarily on clinicians' practice experience and preferences. However, the overall efficacy of first-line medicines is far from satisfactory, with response rates of about 50% and even lower remission rates.9–11 As shown in the sequenced treatment trial STAR*D,12 36.8% of patients remitted after a first trial, and roughly 13% achieved ultimate remission after two sequential treatments. Patients who respond inadequately to initial treatment undergo medication adjustment, sometimes multiple times, until the 'optimal drug' is found. Since antidepressants take weeks or longer to show a therapeutic effect, such a trial-and-error approach is inefficient and prolongs the treatment period, which is associated with worse outcomes and an increased burden of adverse events (AEs), healthcare resource use and suicide risk.13–18 More effective strategies are urgently required for clinical therapeutics.

A previous network meta-analysis of efficacy19 indicated only small differences between antidepressants. Clinicians therefore need to choose the most suitable drug from numerous candidates, and predicting individual drug responses is essential to moving precision medicine forward. Some factors, including sociodemographic characteristics, course of treatment and clinical characteristics, have shown limited accuracy as single predictors and lack unified standards.9 20 21 Genetics, neuroimaging and electrophysiology are expanding fields of research on predictive biosignatures, but no reliable biomarker currently exists for clinical assessment.22–24

As a predictive tool in genomics,25 26 pharmacogenomics is widely used to evaluate drug–gene interactions and the impact of genetic polymorphisms on efficacy. Antidepressants are largely metabolised by the cytochrome P450 (CYP450) enzyme family (CYP2C9, CYP2C19, CYP2D6, etc), and related genetic variations substantially modify pharmacokinetics, leading to individual differences.27–29 Commercial pharmacogenomic kits have been used in some clinical trials to aid drug selection, with improved outcomes.30 31 More complicated factors, however, need to be considered because metabolic phenotypes are influenced not only by genotype but also by environmental factors, such as age, nutrition, comorbidity and intestinal microecology. Genetic-based approaches alone, including models using single nucleotide polymorphisms or genome-wide association studies, seem insufficient to guide individualised treatment.32–34 More data-driven systems biology approaches should be used.

A series of proteins have been suggested as candidate predictors, such as brain-derived neurotrophic factor, insulin-like growth factor-1, tumour necrosis factor-α and C reactive protein, but most of them lack sufficient power or consistent results.35–40 While exploration of proteomic markers in peripheral blood remains in its early stages, the technique may provide more information on psychopharmacological mechanisms.41 42 In a proteomics-based study, eotaxin-1 and interferon-γ were selected from numerous proteins as predictors of remission in depression.43 As a complement to genetic, protein and environmental approaches, metabolomic profiling helps to discriminate responders from non-responders within biological subtypes. Metabolites in lipid, purine, tryptophan and neurotransmitter pathways have been shown to be involved in the mechanism of action of antidepressants.44 45 An increased pretreatment ratio of hydroxylated sphingomyelins was associated with a greater reduction of symptoms, whereas increased phosphatidylcholine C38:1 suggested a poorer response; predictions were improved by incorporating metabolite factors.46 Our previous research also supported the potential of metabolomics for exploring biomarkers related to the diagnosis and treatment of mental disorders.47 48 In addition, the emerging fields of epigenomics and the microbiome have shown some degree of association with prognosis.49–52

Current studies of the gut microbiota–brain axis and neuroimmunology suggest a need for integrated analysis. Decreased faecal microbiota richness and diversity were observed in some MDD studies and were associated with altered serum metabolites and decreased immunoglobulin.53 54 A case–control study indicated that proinflammatory genera were enriched in the depression group, whereas anti-inflammatory genera were reduced, corresponding to altered bacterial functions (especially immunomodulatory metabolites) and host cytokine expression profiles.55 Given the high complexity and heterogeneity of MDD, both in psychopathology and prognosis, a combination of more dimensions should outperform predictions obtained from a single approach.32 56 Pivotal previous or ongoing international clinical trials, such as Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) and Texas Resilience Against Depression (T-RAD), have adopted approaches based on multimodal data.57 58 Integrating genomic and metabolomic markers yielded higher accuracy in predicting depression outcomes (area under the curve: 0.86).59

Objectives

Integration of multiomics data in MDD clinical studies remains scarce in the current literature. The Integrated Module of Multidimensional Omics for Peripheral Biomarkers (iMORE) study is designed to predict and assess response to antidepressants, including selective serotonin reuptake inhibitors (SSRIs) and serotonin–norepinephrine reuptake inhibitors (SNRIs), through a multistage cohort using multiomics integration and a machine learning strategy. We aim to construct models with high predictive power by integrating multiple sources of omics data (metabolomics, microbiomes, etc) and to validate their accuracy in an independent prospective stage. As secondary aims, a series of potential diagnostic biomarkers are to be explored for the biological subtypes of MDD.

Methods and analysis

Study design

The Integrated Module of Multidimensional Omics for Peripheral Biomarkers (iMORE) study is performed by three mental health centres in Shanghai, China. The study comprises two stages, the observation phase and the validation phase, each lasting 8 weeks (figure 1). The predictive accuracy of the antidepressant response model constructed in the observation phase is verified in the validation phase. iMORE features a practical design to better reflect real-world efficacy, in which the antidepressants (SSRIs, SNRIs) and dosages are adjusted based on clinical judgement and the patient’s willingness. Participant recruitment started in December 2020 and the study is estimated to be finished in 2023.

Figure 1

Overview of the iMORE study. iMORE, Integrated Module of Multidimensional Omics for Peripheral Biomarkers.

Stage 1: observation phase

A total of 150 participants with MDD and 50 healthy controls are initially recruited in a prospective, observational cohort. Participants receiving an SSRI or an SNRI during observation are enrolled in a 1:1 ratio before starting. The protocol-specified SSRIs are fluoxetine, paroxetine, sertraline, citalopram and escitalopram; the SNRIs are venlafaxine and duloxetine. Additional drugs for associated symptoms or side effects are allowed, including concurrent treatments for comorbidities. For depressed subjects, visits occur at baseline, week 4 and week 8; healthy controls attend only the baseline visit. Assessments at visits cover sociodemographic data, clinical features, drug exposure, biospecimen collection and so on (table 1). Collected samples are analysed mainly by multiomics technologies, including cytokines, metabolomics and gut microbiome. After 8 weeks, participants with MDD are divided into responder and non-responder groups based on treatment response (>50% reduction from baseline in Montgomery-Asberg Depression Rating Scale (MADRS) score); prediction models for SSRIs and SNRIs are then built separately by multidimensional data integration. Meanwhile, biomarkers of potential diagnostic value are mined from the data obtained to identify molecular subtypes in participants.

Table 1

Schedule of assessments

Stage 2: validation phase

The accuracy of the prediction models from the previous stage is validated in an independent cohort of 50 participants with MDD and 50 healthy controls. In this phase, the workflow is close to that of Stage 1 (table 1), with the same rules on antidepressants and other drugs. The baseline dataset of participants is used as input to predict the response to antidepressants, and the prediction is verified against the actual efficacy after 8 weeks of treatment. Potential biomarkers discovered in the previous phase for diagnosis or discrimination of subtypes are also tested for clinical value. Longitudinal multidimensional datasets collected during treatment in all stages are used for network analysis of the interactions among core biomarkers.

Patient and public involvement

Patients or members of the public were not involved in the design, conduct, reporting or dissemination plans of this study. Study findings can be disseminated to participants by email if they prefer.

Study sites and participants

A total of 200 participants with MDD are recruited from Shanghai Mental Health Center (affiliated to Shanghai Jiao Tong University School of Medicine), Shanghai Pudong New Area Mental Health Center (affiliated to Shanghai Tongji University School of Medicine) and Shanghai Huangpu Area Mental Health Center. All 100 healthy controls are recruited from communities, students, hospital staff and so on. Depressed subjects aged 18–65 are diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, fifth edition criteria for MDD. Participants should have moderate-to-severe symptoms at screening and receive antidepressant treatment during the study. Patients with high suicide risk, severe concomitant medical conditions or other mental disorders are excluded (for inclusion and exclusion criteria see box 1). Healthy controls are matched for age, sex and education, and their exclusion criteria are close to those of the MDD group.

Box 1

Inclusion and exclusion criteria for the study entry

Inclusion criteria

  • Age 18–65.

  • Inpatients or outpatients; gender not limited.

  • Meets DSM-V criteria for single or recurrent non-psychotic MDD and related specifiers.

  • Taking or about to take SSRI or SNRI antidepressants.

  • Total MADRS ≥24 at screening.

  • Total HAMD-17 ≥20 at screening.

  • Provide written informed consent.

Exclusion criteria

  • Concomitant other mental disorder (in addition to MDD).

  • Suicidal risk (defined by a suicide attempt within a year, or a score >3 on the suicidal thoughts item of the MADRS).

  • Substance dependence in the past 6 months (except for nicotine).

  • Major depressive episode due to organic mental disorders secondary to neurological diseases or systemic illnesses.

  • Severe or unstable general medical conditions.

  • Clinically significant laboratory abnormalities (including ECG).

  • Diagnosed gastrointestinal diseases (tumour, inflammatory bowel disease, diarrhoea, constipation, etc).

  • History of antibiotic, non-steroidal anti-inflammatory agents, probiotics, immunosuppressants or corticosteroid intake in the past 3 months.

  • Women planning to conceive during the study period, or current pregnancy or breast feeding.

  • DSM-V, Diagnostic and Statistical Manual of Mental Disorders, fifth edition; MADRS, Montgomery-Asberg Depression Rating Scale; MDD, major depressive disorder; HAMD-17, Hamilton Depression Rating Scale; SNRI, serotonin‐norepinephrine reuptake inhibitor; SSRI, selective serotonin reuptake inhibitor.

Clinical visits

Clinical data collection

Clinical information collected includes sociodemographic, course of treatment, family history, drug exposure and so on. The onset of MDD and related treatment descriptions are documented in detail. Drug information (eg, reasons, dosage, duration) during the study period is recorded, including concomitant medication. Depressive symptoms are assessed by the MADRS, Hamilton Depression Rating Scale (HAMD-17), Clinical Global Impression Scale (CGI-S, CGI-I), and anxious symptoms are assessed by the Hamilton Anxiety Rating Scale (HAMA). Participants' cognitive function is measured by the Montreal Cognitive Assessment Scale (MOCA). The Pittsburgh Sleep Quality Index (PSQI) and the Quick Inventory of Depressive Symptomatology (QIDS-SR16) are both self-rated scales to evaluate sleep quality and depressive symptoms, respectively. Except for the self-rated scales, all assessments are completed by assessors through a semistructured interview.

Biospecimen collection

Collections of venous blood samples are carried out in accordance with the standard operating procedures at each study site, then shipped on dry ice to the central laboratory within 24 hours. A total of 12.5 mL of venous blood is collected (from 08:00 to 17:00) from each subject for each visit, using whole blood RNA test tubes and EDTA tubes. Postprandial time needs to be recorded if subjects are not in fasting state. After isolation, samples of whole blood, plasma and blood cells are stored according to the corresponding conditions until further tests. General laboratory tests include liver function, renal function, lipids, glucose, and serum thyroid hormones.

Mass cytometry analysis of blood immune cells

Human blood samples collected in EDTA tubes are centrifuged at 500 g for 10 min. Blood cells are subjected to red blood cell lysis at room temperature, then washed with staining buffer, filtered through a 70 µm strainer and counted. Metal-labelled antibodies are used for staining according to the manufacturer’s instructions (Fluidigm Science, San Francisco, USA). Cells are fixed with 1.6% paraformaldehyde, processed in Ir-intercalator (Fluidigm) and then incubated at 2°C–8°C. Before acquisition, cells are resuspended in Cell Acquisition Solution (Fluidigm) containing diluted EQ Four Element Calibration beads (Fluidigm) and filtered through a 35 µm nylon mesh filter cap. Finally, cells are acquired on a Helios Mass Cytometer (Fluidigm), and the data are exported and analysed using the Cytobank analysis software.

Cytokines assessments

Targeted cytokines in plasma samples are measured by microarray technology with the Quantibody Human Cytokine Antibody Array 440 Kit (Ray Biotech, Norcross, USA). The multiplexed ELISA-based quantitative array platform determines the concentration of up to 440 human cytokines simultaneously. After signal visualisation by laser scanners, raw fluorescence data are acquired with the array-specific software (eg, GenePix).

Metabolomics assessment

Targeted metabolomics of plasma samples is performed using the MxP Quant 500 Kit, based on flow injection analysis and liquid chromatography-based triple quadrupole mass spectrometry. The whole workflow, including statistical tests, is processed on the MetIDQ platform. The platform allows for simultaneous detection and quantification of up to 630 metabolites in plasma, representing 26 analyte classes.

Gut microbiomes assessments

All faecal samples are collected by participants at home with a standardised collection device, then the samples are delivered to the central laboratory within 24 hours and subsequently stored at −80°C. Subjects first excrete faecal samples on a special faecal collection paper provided to avoid contamination with urine and other substances. A total of three test tubes of faecal samples (inner part of the middle and rear sections) are taken followed by mixing with the reagent solution in the tubes and setting them in a sealed bag with ice. Bacterial 16S ribosomal RNA gene sequencing assay is used to investigate the gut microbiome diversity of participants. After purification, PCR amplification and complementary DNA library construction, gene sequencing is performed on the Illumina MiSeq System (Illumina, San Diego, USA). Quality control and filtration of sequence quality are conducted to distinguish the sample reads, followed by cluster and taxonomy analyses.
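
The protocol does not say which diversity measures will be derived from the 16S data. As an illustration only, the sketch below computes two commonly reported within-sample measures, observed richness and the Shannon index, from a vector of taxon read counts. The function names and the example counts are hypothetical, and the choice of these two indices is an assumption rather than the study's stated method.

```python
from math import log


def observed_richness(counts):
    """Number of taxa with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with p_i > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    # Taxa with zero reads contribute nothing to the sum and are skipped.
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical read counts per taxon for a single faecal sample.
    sample = [120, 30, 0, 45, 5]
    print("richness:", observed_richness(sample))
    print("Shannon index:", shannon_index(sample))
```

An evenly distributed community gives the maximum Shannon value for its richness (the natural log of the number of taxa), whereas a sample dominated by a single taxon gives a value close to zero.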

Sample size estimation

Few methods are available for sample size calculation in multiomics studies, in which the number of participants has varied from dozens to hundreds within each group.49 53 60 Therefore, our sample size is selected according to previous studies and the setting of this study: a total of 200 participants with MDD and 100 healthy controls. According to the area under the curve (AUC) range in previous prediction models, we estimate the AUC of the model developed to be at least 0.70.61 The sample size has been selected to provide statistical power of at least 80% to detect a difference of 0.10 in AUC (assuming a 50% response rate among patients with MDD in the cohort), using a two-sided z-test at a significance level (alpha) of 0.05. For the main parameter of interest, an effect size criterion is applied to the models: ORs should exceed 1.2.
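
The protocol does not state the formula behind this power statement. As an illustration only, the sketch below uses the Hanley and McNeil (1982) approximation for the standard error of an AUC together with a normal-approximation z-test of a single AUC against a null value. The function names, the null value and the group sizes passed in are assumptions, so the output need not reproduce the protocol's figures.

```python
from math import sqrt
from statistics import NormalDist


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(variance)


def power_single_auc(auc_alt, auc_null, n_pos, n_neg, alpha=0.05):
    """Approximate power of a two-sided z-test of H0: AUC = auc_null.

    The critical value uses the standard error under the null, and the
    power is evaluated under the alternative. The negligible probability
    of rejecting in the opposite tail is ignored.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se_null = hanley_mcneil_se(auc_null, n_pos, n_neg)
    se_alt = hanley_mcneil_se(auc_alt, n_pos, n_neg)
    z = (abs(auc_alt - auc_null) - z_crit * se_null) / se_alt
    return NormalDist().cdf(z)


if __name__ == "__main__":
    # Illustrative inputs only: responders and non-responders in equal
    # numbers (the 50% response rate assumed in the protocol) and an
    # assumed null AUC that is 0.10 below the expected 0.70.
    print(power_single_auc(auc_alt=0.70, auc_null=0.60, n_pos=75, n_neg=75))
```

The result is sensitive to the assumed null value and to how many participants contribute to each treatment-specific model, which is why those inputs are left as explicit parameters.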

Outcome

Primary outcome

The primary outcome measure is the change from baseline in MADRS score at week 8 in participants with MDD. A larger reduction in MADRS score indicates greater improvement in depressive symptoms, that is, a better therapeutic effect.

Secondary outcome

The secondary outcomes include response and remission rates at week 8, calculated from MADRS and HAMD-17 scores. Response rate is defined as the proportion of participants with a decrease of more than 50% in score on the depression scale (MADRS, HAMD-17). Remission rate is defined as the proportion of participants with remission: HAMD-17 score ≤7 or MADRS score ≤10. Change in scale scores from baseline is compared between visits as a secondary outcome. A larger reduction in depression and anxiety scale scores indicates greater improvement in symptoms. A MOCA score <26 suggests the presence of mild cognitive impairment, and a decrease in PSQI score indicates improved sleep quality. The Clinical Global Impression Scale (CGI-S, CGI-I) is used to measure the overall severity and degree of improvement of MDD; the lower the participant's score, the greater the treatment efficacy.
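
The response and remission definitions above can be written down as a minimal sketch. The thresholds follow the text (a reduction of more than 50% from baseline for response; HAMD-17 ≤7 or MADRS ≤10 for remission); the function names and the example scores are hypothetical.

```python
def percent_reduction(baseline, follow_up):
    """Percentage reduction in a scale score from baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def is_responder(baseline, week8):
    """Response: a decrease of more than 50% from baseline on the scale."""
    return percent_reduction(baseline, week8) > 50.0


def is_remitter(madrs, hamd17):
    """Remission at week 8: HAMD-17 score <= 7 or MADRS score <= 10."""
    return hamd17 <= 7 or madrs <= 10


def rate(flags):
    """Proportion of participants for whom the flag is true."""
    flags = list(flags)
    if not flags:
        raise ValueError("no participants")
    return sum(flags) / len(flags)


if __name__ == "__main__":
    # Hypothetical (baseline MADRS, week 8 MADRS, week 8 HAMD-17) triples.
    cohort = [(30, 14, 9), (28, 20, 15), (32, 8, 5)]
    print("response rate:", rate(is_responder(b, w) for b, w, _ in cohort))
    print("remission rate:", rate(is_remitter(w, h) for _, w, h in cohort))
```

Note that a reduction of exactly 50% does not count as a response under the strict inequality used in the protocol.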

Adverse events

Any AEs that occur during the study are recorded and handled promptly. Serious AEs are reported to the institutional review board within 24 hours. Only adverse drug reactions (ADRs) with a definite, probable or possible causality are included in the safety analysis. General laboratory tests during visits help to identify ADRs.

Data collection and management

The original data are recorded first in case report form (CRF), including sociodemographic, clinical information, AEs and so on. Onsite checks for quality control are conducted periodically to ensure that researchers strictly adhere to the standard operating procedures and fill in the information correctly. Electronic CRF is employed simultaneously to collect data based on an electronic data management system, built and run by the computer department of Shanghai Mental Health Center. Double data entry and proofreading are conducted on the system by two independent data entry personnel at each study centre, within 5 days of raw data generation. After all participants complete the study and the integrity of the data is systematically checked, the database is locked until later analysis by statistical analysts. The biospecimens and test results are managed by the central laboratory independently and accessible by authorised personnel in the study.

Quality control

iMORE is initiated by the Clinical Research Centers of Shanghai Mental Health Center, which is also responsible for coordinating the other study centres. All assessors undertake good clinical practice (GCP) training and training on the use of scales before the start of the study, and are required to pass a concordance test for eligibility. The Hospital Development Center (Shanghai, China) supervises the whole study process and performs periodic reviews of research staff qualifications, informed consent, data quality and so on.

Statistical analysis strategy

Data analysis is performed according to the intention-to-treat principle on the full-analysis set. Missing data are handled by deletion or imputation, depending on the reason for missingness. Trend analyses for response rate, remission rate and scale scores are performed using generalised linear mixed models or mixed models for repeated measurements.

Construction and validation of models

Prediction models for SSRIs and SNRIs are constructed separately in Stage 1, based on multidimensional data integration that includes clinical characteristics. Subjects with MDD are categorised into two groups, responders and non-responders, based on the response at week 8. Each group is randomly split, with two-thirds as the training set and the rest as the testing set; the same process is repeated with fivefold cross-validation for the optimal result. Before the multidimensional data are concatenated into an input matrix, deep learning approaches (eg, autoencoder) are applied to select feature subsets associated with the outcome phenotype. An autoencoder is a non-linear factorisation technique that uses multilayer neural network structures to learn a data representation by reducing dimensionality.62–64 Models are constructed mainly with supervised machine learning approaches, using several algorithms (linear and non-linear) simultaneously for comparison across models. Predictors are ranked by their importance in prediction according to the value of coefficients. Algorithms for reference include elastic net regression, support vector machine and random forest. To conservatively evaluate the clinical value of multiomic data, a model based on clinical characteristics alone is built for comparison. Validation of all models is conducted in the participants with MDD in Stage 2 with data at baseline and endpoint, and the AUC is the main index to evaluate the accuracy.
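
Two of the steps described above, the two-thirds/one-third split within each response group and the AUC used as the main accuracy index, can be sketched without any modelling library. This is a minimal illustration, not the study's pipeline: the function names and example values are hypothetical, the AUC is computed in its rank-based (Mann-Whitney) form, and the feature selection, cross-validation and model fitting steps are not shown.

```python
import random


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Split indices into training and testing sets within each class.

    labels: sequence of 1 (responder) and 0 (non-responder).
    Returns two sorted lists of indices.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def roc_auc(labels, scores):
    """AUC as the probability that a responder outscores a non-responder.

    Ties between a responder and a non-responder count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute the AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical week 8 outcomes and predicted response probabilities.
    y = [1, 1, 1, 0, 0, 0]
    predicted = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
    train_idx, test_idx = stratified_split(y, seed=42)
    print("training indices:", train_idx, "testing indices:", test_idx)
    print("AUC on all subjects:", roc_auc(y, predicted))
```

Splitting within each group keeps the proportion of responders similar in the training and testing sets, which matters when the groups are small.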

Network analysis

Network analysis for treatment response is conducted on subjects with MDD in all stages using longitudinal data. A highly interconnected network of biomarkers across all omic features is built to demonstrate their interactions and regulatory directions. The network analysis is performed with the tool xMWAS,65 which is based on sparse partial least squares (sPLS) regression. sPLS performs variable selection and data integration at the same time across a number of highly correlated variables.

Biologic subtyping

Diagnostic biomarkers for MDD are explored using data from 100 healthy controls including initial assessment in Stage 2. Biological phenotyping of MDD is based on the variables of 150 subjects with MDD in all stages by integration analysis, for example, biological signatures associated with severe sleep disturbances or anxiety symptoms. As in the modelling process, multidimensional features are input as a matrix, followed by dimensionality reduction. Cox proportional hazards analysis and least absolute shrinkage and selection operator (LASSO) regression are applied for further feature selection, and molecular subtypes are then identified by K-means clustering.
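
The final clustering step can be illustrated with a minimal K-means sketch in plain Python. It covers only the clustering of feature vectors that have already been reduced and selected; the function name, the number of clusters and the toy data are hypothetical, and the Cox and LASSO feature selection steps are not shown.

```python
import random
from math import dist


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm for K-means on a list of equal-length tuples.

    Returns (assignment, centroids), where assignment[i] is the cluster
    index of points[i].
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct observations.
    centroids = list(rng.sample(points, k))
    assignment = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assignment, centroids


if __name__ == "__main__":
    # Toy two-dimensional feature vectors for six hypothetical subjects.
    features = [
        (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
        (5.0, 5.0), (5.2, 4.9), (4.9, 5.3),
    ]
    labels, centres = kmeans(features, k=2)
    print(labels)
    print(centres)
```

K-means depends on the initial centroids and on the scale of each feature, so in practice features are usually standardised and the algorithm is run from several starting points.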

Ethics and dissemination

This study has been approved by the ethics committee of Shanghai Mental Health Center (approval number 2020-87). The study is conducted in accordance with the principles of the Declaration of Helsinki and the GCP guidelines, and written informed consent is obtained from all participants before enrolment. Study findings will be published in peer-reviewed journals.

Discussion

The need for precision medicine in antidepressant treatment remains unmet. SSRIs and SNRIs are the most common first-line antidepressants, yet their limited response rates pose a treatment dilemma.66 Easily accessible predictive biomarkers would help identify early which patients are likely to benefit from a given drug, reducing the cost of treatment and unnecessary drug exposure. Studies have shown the richness of the neurobiological mechanisms of MDD: increased activation of the inflammatory system,67 correlations between epigenetic regulation and stressful environments,68 multiple influences from the intestinal microbiota,69 aberrant neural circuits70 and so on. Against such a complicated mechanistic background, data-driven approaches based on multiomics integration, rather than hypothesis-driven methods, might be the solution. The systems biology method is expected to improve the predictability of outcomes and to better elucidate relationships between biomolecules.

Biomarkers obtained from peripheral blood and stool samples are currently the most practical, whereas other potential markers, for example, neuroimaging, are far from large-scale application. Multidimensional integration of peripheral indicators allows a comprehensive reflection of dynamic alterations in central nervous system functions. Machine learning (especially deep learning) has theoretical advantages in multiomics analysis, although there are no current standards.71 In addition to the algorithms in the analysis plan, successful cases include penalised regression, extreme gradient-boosting (XGBoost) and multiomics late integration.59 72

As in other omics studies, high dimensionality and a relatively small sample size are issues to tackle, and they are more challenging here because more types of omics are involved. To avoid overfitting during model training, a leave-one-out approach can be used and training can be terminated early when overfitting occurs.73 The lack of randomised treatment assignment may introduce confounding effects that influence the results. Direct validation in a real-world setting, rather than the cross-trial replication used previously, could help alleviate this limitation.74 75 To date, no established pathophysiological mechanism of MDD can be applied to biological subtyping before treatment. In conclusion, this is a large, ongoing, multistage study of precision medicine in depression that provides rich biological information for mechanistic studies. More targeted clinical trials will subsequently be planned for the potential subtypes discovered in this study.

Ethics statements

Patient consent for publication

Acknowledgments

iMORE is supported by Shanghai Hospital Development Center and Shanghai Mental Health Center. We gratefully acknowledge the iMORE investigators group and the Clinical Research Centers of Shanghai Mental Health Center for their contributions to the study.

References

Footnotes

  • Contributors HL is the principal investigator and designed the study. YZ wrote the protocol. ZX, JZ, CG and GS are responsible for laboratory analyses and interpretation. LZ and SH are responsible for participant recruitment, data collection and data management. JH offered many constructive comments on this protocol. The manuscript has been read and approved by all authors.

  • Funding This study is supported by a grant from the Shanghai Hospital Development Center (SHDC, Grant Number: SHDC2020CR2053B). The SHDC has approved this study and had no role in study design, nor in the collection, analysis and interpretation of the data or in the writing of the report.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.