
Efficacy and acceptability of next step treatment strategies in adults with treatment-resistant major depressive disorder: protocol for systematic review and network meta-analysis
  1. Jan Jacobus Muit1,
  2. Philip F P van Eijndhoven1,
  3. Andrea Cipriani2,3,
  4. Iris Dalhuisen1,
  5. Suzanne van Bronswijk4,
  6. Toshi A Furukawa5,
  7. Henricus G Ruhe1
  1. 1Department of Psychiatry, Radboud University Nijmegen, Nijmegen, The Netherlands
  2. 2Department of Psychiatry, University of Oxford, Oxford, UK
  3. 3Oxford Health NHS Foundation Trust, Warneford Hospital, Oxford, UK
  4. 4Department of Psychiatry and Psychology, Maastricht University, Maastricht, The Netherlands
  5. 5Departments of Health Promotion and Human Behavior and of Clinical Epidemiology, School of Public Health, Kyoto, Japan
  1. Correspondence to Jan Jacobus Muit; bob.muit{at}


Introduction For major depression, a one-size-fits-all treatment does not exist. Patients enter a ‘trial-and-change’ algorithm in which effective therapies are applied successively. Unfortunately, an empirically based order of treatments has not yet been determined. A multitude of treatment strategies exists, while clinical trials compare only a small number of these. Network meta-analyses (NMA) might offer a solution, but so far have been limited in scope and have not accounted for possible differences in population characteristics that arise with increasing levels of treatment-resistance, potentially violating the transitivity assumption. We therefore present a protocol for a systematic review and NMA aiming at summarising and ranking treatments for treatment-resistant depression (TRD) while covering a broad range of therapeutic options and accounting for possible differences in population characteristics at increasing levels of treatment-resistance.

Methods and analysis Randomised controlled trials will be included that compared next-step pharmacological, neuromodulation or psychological treatments for treatment-resistant depression (TRD; ie, failure to respond to ≥1 adequate antidepressant drug trial(s) in the current episode) to each other or to a control condition. Primary outcomes will be the proportion of patients who responded to (efficacy) and dropped out of (acceptability) the allocated treatment. A random effects NMA will be conducted, synthesising the evidence for each outcome and determining the differential efficacy of treatments. Heterogeneity in treatment nodes will be reduced by considering alternative geometries of the network structure and by conducting a meta-regression examining different levels of TRD. Local and global methods will be applied to evaluate consistency. The Cochrane Risk of Bias 2 tool, Confidence in Network Meta-Analysis and the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework will be used to assess risk of bias and certainty.

Ethics and dissemination This review does not require ethical approval.

  • adult psychiatry
  • depression & mood disorders
  • clinical trials
  • adverse events

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • The systematic review and meta-analysis will follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.

  • We will address the potential heterogeneity arising from different levels of treatment-resistant depression.

  • Heterogeneity within treatment nodes will be limited by considering alternative geometries of the network structure.

  • This study does not address quality of life.

  • Limitations of primary studies will be assessed using the Cochrane Risk of Bias 2 tool, Confidence in Network Meta-Analysis and the Grading of Recommendations Assessment, Development and Evaluation framework.


Depression has been one of the leading causes of non-fatal health loss for nearly three decades, with major depressive disorder (MDD) affecting 163 million people worldwide in 2017.1 No one-size-fits-all treatment exists.2 3 Patients enter a ‘trial-and-change’ algorithm in which evidence-based treatments are applied successively.4 Unfortunately, an empirically based optimal treatment sequence has not yet been determined.

For a depression to be considered treatment-resistant, several adequate treatment trials of sufficient dosage and length must have been applied previously. Definitions of ‘treatment-resistance’ range from non-response to one antidepressant medication (ADM) (after ≥4 weeks of treatment) to a failure to respond to more than 10 adequate trials of different classes of ADM and augmentation strategies, electroconvulsive therapy and psychological treatments, taking into account factors such as disease severity, comorbidity, functional impairment and intensity of treatment.5 6 However, recent insights suggest using a dimensional approach to define levels of treatment-resistant depression (TRD).6–12 In addition, TRD is often confused with ‘pseudoresistant’ depression, a term used to describe non-response to antidepressant trials of inadequate dosage or duration.13

Common strategies for treatment-resistance to ADM include dose-escalation and switching.14–17 Dose-escalation of the first ADM has been addressed extensively in previous research. It was found that beyond 20–40 mg fluoxetine equivalents for selective serotonin reuptake inhibitors (SSRI) and above 30 mg mirtazapine, efficacy does not increase, leaving limited room for dose-escalation in non-responders to these dosages.18–20 However, it was found that adding or switching to mirtazapine was superior to continuing sertraline among previously untreated patients.21 The Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial aimed to ascertain whether certain treatments were more effective after one or more failed trials.2 22 No differences were found between any of the next-step treatment strategies. However, patients with higher levels of treatment-resistance showed lower rates of remission, as remission rates dropped after two failed trials (36.8% and 30.6% after steps 1 and 2 vs 13.7% and 13.0% after steps 3 and 4). The authors hypothesised that the steep reduction in remission rates after step 2 occurred due to differences in population characteristics (eg, presence of comorbid medical or psychiatric disorders, or degree of chronicity) and general heterogeneity of MDD. Alternatively, poor monitoring of nortriptyline or lithium levels and inadequate dosing of monoamine oxidase (MAO)-inhibitors might explain the poor responses in steps 3 and 4 in STAR*D. Nevertheless, the decreases in response and remission rates after the second ADM might be related to a selection process of patients who are non-responsive to all types of mono-aminergic ADM.23 This could explain the slight advantage of between-class over within-class switches after a first ADM,17 24 25 but it remains to be shown empirically whether this selection effect is indeed applicable to increasing levels of TRD. Hypothetically, treatments targeting different pathways might provide better efficacy in these cases.

Several efforts have been undertaken to perform network meta-analysis (NMA) for TRD;26–30 however, overall conclusions are impeded by various factors. First, these NMAs employed various definitions of TRD: for example, two of them also included patients with only one failed adequate trial in the current episode.26 27 Second, these NMAs studied various types of interventions: from only a few augmentation strategies26 or only neuromodulation strategies30 to several augmentation, pharmacotherapy switch and neuromodulation strategies.28 Third, only one study accounted for differences in dosages.26 Fourth, one study accounted for outcome measures at different points in time, ranging from 2 to 8 weeks, limiting the number of possible comparisons.28 Fifth, the most recent study investigating multiple modalities grouped treatments based on the presumed mechanisms of action, without clear description of considerations regarding the treatment network.29 Although Wang et al31 stratified for number of failed ADM in a pairwise meta-analysis, none of the NMAs26–29 were able to account for levels of TRD and possible differences in population characteristics that might arise with increasing levels of TRD,2 which might violate the transitivity assumption for NMA. Violation of the transitivity assumption would make estimating indirect comparisons from unobserved head-to-head comparisons invalid.32 Neither were these studies able to evaluate whether higher levels of treatment-resistance respond to more aggressive or invasive treatments.7 29

In summary, current research is affected by several complicating factors. No consensus on the definition of TRD exists. A multitude of treatment strategies is available, while clinical trials usually compare only a small number of these. NMAs performed so far are limited in scope and do not account for possible differences in population characteristics that might arise with increasing levels of TRD, potentially violating the transitivity assumption. Therefore, a more comprehensive approach to summarise and determine relative efficacy of treatments for TRD is needed.


Objectives

The aim of this systematic review and NMA is to evaluate (1) the differential efficacy and acceptability of treatment strategies when administered after a failed ADM trial in adults with MDD; (2) whether differential efficacy and acceptability depend on the study level of treatment-resistance as defined by inclusion criteria used in the trials. These aims translate into the following clinical questions: (1) what are next-step treatment strategies in adult patients with TRD that are beneficial and/or safe? (2) how do the various treatment strategies compare to each other? (3) does the level of treatment-resistance affect the differential efficacy of next-step treatment strategies?

In order to answer the first clinical question, absolute and relative efficacy and acceptability of next-step antidepressant treatments for TRD will be examined using head-to-head and treatment-control comparisons in pairwise meta-analyses. To answer the second clinical question, relative efficacy and acceptability of the various next-step treatment strategies will be estimated in an NMA, while ranking their probabilities of highest efficacy and acceptability to inform the treatment algorithm for MDD. In order to answer the third clinical question, we will investigate the transitivity assumption by examining the impact of the study’s level of treatment resistance (ie, the number of failed antidepressant trials that studies required as an inclusion criterion) in an NMA with a meta-regression.


We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols guidelines33 (see online supplemental appendix 1). In case of protocol amendments, we will describe the date of each amendment together with a description of the change and the rationale. We performed a preliminary search in May 2021, and aim to submit the results in 2024.

Eligibility criteria

Types of studies

We include randomised controlled trials (RCTs), in which next-step pharmacological, neuromodulation or psychological treatment strategies are compared with each other or a control condition.

Quasi-randomised trials will be excluded, while cluster RCTs will be included when the clustering effect can be taken into account. For cross-over trials, the results from the first randomised treatment period will be included. We will exclude studies where there was a high risk of bias arising from the randomisation process.

Types of participants

We include studies with patients aged ≥18 years with unipolar MDD diagnosed by using any standard operationalised criteria, such as Feighner criteria, Research Diagnostic Criteria, Diagnostic and Statistical Manual of Mental Disorders Third Edition (DSM-III), DSM-III-R, DSM-IV, DSM-5 and International Classification of Diseases 10th Revision (ICD-10).

We require studies where patients failed to respond to ≥1 ADM trial(s) prescribed at least at a minimally effective dose for ≥4 weeks in the current episode.34 We will not exclude studies that considered intolerance to a previous treatment trial as a failure in their definition of TRD. Although intolerance to treatment could be considered pseudoresistance, in clinical practice, it might not always be possible to distinguish between failure and intolerance as information on previous failed trials is often based on historical information. We will include studies with both prospectively and historically assessed treatment failure.

Studies in which 20% or more of the participants are suffering from bipolar disorder, peripartum depression or psychotic depression will be excluded. We exclude RCTs that have included patients with a concurrent primary diagnosis of another psychiatric or personality disorder. A secondary diagnosis of another psychiatric disorder will not be considered an exclusion criterion. RCTs focusing on patients with a concomitant medical illness will be excluded.35 We include studies that allow use of rescue medications, if these medications were made equally available to all treatment groups.

Types of interventions

We distinguish eight types of next-step treatments covering different modalities: (1) Switching to a different ADM, (2) Combining continued ADM with another ADM, (3) Augmenting ADM with another psychopharmacological agent, (4) Switching to psychedelic or psychedelic-assisted therapy, (5) Switching treatment to neuromodulation treatment, (6) Augmenting ADM with neuromodulation treatment, (7) Switching treatment to psychological therapy and (8) Augmenting ADM with psychological therapy. For a more detailed overview, see online supplemental appendix 2.

We will obtain information about interventions of interest either from head-to-head or controlled trials. We exclude studies if the intervention is not targeted at the depressive disorder. Studies that coinitiated multiple interventions of interest will not be excluded but will be treated as a combined treatment.

  • Comparator interventions (switching or augmenting)

    • Alternative intervention (head-to-head).

    • Pill placebo.

    • Psychological placebo.

    • Sham neuromodulation.

    • Continuation of antidepressant treatment.

    • Treatment as usual (TAU; defined as standard non-protocolised treatment in primary or secondary care, typically with pharmacotherapy).

    • No treatment (NT; applies in case TAU involved virtually no intervention, defined as <50% of patients receiving any antidepressant treatment (including pharmacotherapy, psychological therapy and/or neuromodulation treatment); patients know they will not receive active treatment after the trial).

    • Waiting list control (WL; similar to NT, except patients know they will receive active treatment after the waiting phase).36

Outcome measures

Primary outcomes

  • Response (efficacy as a dichotomous outcome), for patients who did not respond to first-step treatment strategies but achieved response with next-step treatment strategies.

  • All-cause dropout (acceptability as a dichotomous outcome) for patients who left the trial or stopped the treatment early due to any reason up to the end of study duration.

Secondary outcomes

  • Change in severity of symptoms measured on the Hamilton or Montgomery-Asberg Depression Rating Scales or other depression rating scales. Extraction of continuous efficacy outcome data will be prioritised as proposed by Furukawa et al.37 Change scores will be used when end point scores are not reported.38

  • Remission, for patients who did not respond or did not achieve remission with first-step treatment strategies but achieved remission with next-step treatment strategies.

  • Drop-out due to adverse events (tolerability) measured as the proportion of patients who left the trial early due to any adverse events.

We will use the original author’s definition of ‘response’ and ‘remission’.

Trial duration

There is no consensus on the appropriate duration of an acute phase trial.39 40 Some newer treatments might show effects within one session.41 Nevertheless, treatment effects should be evaluated after at least 4 weeks in order to determine stability of antidepressant effects. We will use the original authors’ primary endpoint, provided that it falls at 4 weeks or later but before 6 months, for analysis of the acute phase outcome data. We address long-term outcomes by additionally analysing the primary outcomes at a treatment duration of 6 months or longer, if these data are available.42 43 We will exclude studies from the statistical synthesis if no primary endpoint data for the 4+ weeks period can be provided.37
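The endpoint rule above can be sketched as a small classifier. This is an illustration only: the function name and the conversion of ‘6 months’ to 26 weeks are assumptions, not part of the protocol.

```python
# Illustrative sketch of the trial-duration rule.
# ASSUMPTION: "6 months" is approximated as 26 weeks; the protocol does not
# specify a conversion.
ACUTE_MIN_WEEKS = 4
LONG_TERM_MIN_WEEKS = 26


def classify_endpoint(weeks: float) -> str:
    """Classify a primary endpoint by its time since randomisation (in weeks)."""
    if weeks < 0:
        raise ValueError("weeks must be non-negative")
    if weeks < ACUTE_MIN_WEEKS:
        # Too early to judge stability of the antidepressant effect.
        return "excluded"
    if weeks < LONG_TERM_MIN_WEEKS:
        return "acute"
    return "long-term"


if __name__ == "__main__":
    for w in (2, 4, 12, 26, 52):
        print(w, classify_endpoint(w))
```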

Comparability of dosages

We include fixed-dose and flexible-dose designs, and only include arms randomising patients to pharmacological, neuromodulation and psychological therapies within licensed doses and ranges of approved treatments, and any dosage or range of unapproved treatments. In case of psychotherapy, we require a minimum of 4 sessions, as this has been proposed as a minimally effective dose.44


We will not apply restrictions by type of setting.


We will apply no language restrictions.

Search strategy and data management

Search strategy

We will identify published, unpublished and ongoing RCTs that compared the efficacy and/or acceptability of one treatment strategy to another treatment or to a control condition in the treatment of TRD. The following sources will be searched: MEDLINE (Ovid), Cochrane Central Register of Controlled Trials (CENTRAL), Embase (Ovid), LILACS database and PsycINFO (Ovid). MEDLINE and Embase will be searched from 2019 onwards, as these are also indexed by CENTRAL. CENTRAL, LILACS and PsycINFO will be searched without date restrictions. Keywords for TRD and the RCT filter are based on the strategy used by Davies et al.45 See online supplemental appendix 3 for the MEDLINE search strategy; this strategy will be adapted to the syntax and subject headings of the other databases. We will search international trial registries ( and WHO International Clinical Trials Registry Platform). To obtain unpublished information, we will contact the National Institute for Clinical Excellence (UK) and the Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen (Germany), check the websites of pharmaceutical companies and contact their representatives. In addition, we will search reference lists of included studies and recent systematic reviews.26–30 45–52

Relevant authors will be contacted to supplement published/unpublished studies or incomplete reporting, and reminded twice.

Study selection

Two investigators will independently review retrieved references and abstracts. Abstracts will be screened using the Rayyan web-application.53 A pilot will be conducted to refine screening policy of both reviewers. If both reviewers agree about a trial not meeting eligibility criteria, it will be excluded. We will obtain the full text of all remaining articles and use the same eligibility criteria to determine the final selection. Two independent reviewers will perform the selection and resolve disagreements via discussion with a third member of the review team.

Data extraction

Two reviewers will independently extract data and evaluate risk of bias for each selected trial. We will use a structured data extraction sheet, the use of which will be refined in a pilot period. Reliability of the data extraction will be checked. Information extracted will include trial characteristics (such as lead author, journal, publication year, design, inclusion criteria, sponsorship, number of recruitment centres, whether non-response was prospectively or retrospectively assessed, type and definition of non-response at time of enrolment (non-responder or non-remitter), whether non-response to psychological therapy was included in the TRD definition (a failed psychotherapy trial is classified as a failure to respond to an adequate course of 8 attended sessions of a form of psychotherapy with demonstrated effectiveness for MDD),7 definitions of response and remission), participant characteristics (such as diagnostic criteria for depression, depression severity threshold, participant age, gender distribution, setting, number of previously failed treatment trials in the current episode, length of current depressive episode, number of previous episodes, length of depressive disorder since age of onset, length of the previous treatment trial(s), depression severity at baseline, physical or psychiatric comorbidity), outcome measures and intervention details including cointerventions or continuation treatment. In case of pharmacological strategies, we extract dosing schedule, dose ranges and mean doses of study drugs. For antidepressant switching, we distinguish within-class from between-class switches. In case of neuromodulation strategies, we extract data on treatment protocols, mean number of treatment sessions, targeted sites and stimulation parameters. In case of psychological treatment strategies, we extract type of psychotherapy, mean number of treatment sessions, whether it concerned individual or group therapy, whether therapy was offered in a blended format or as partially self-guided therapy, and assessment of treatment integrity.

Level of TRD as inclusion criterion will be rated by two independent assessors. Reliability of this assessment will be quantified. Disagreements in any of the extracted data will be resolved through discussion with a third member of the review team. We will contact corresponding authors, if necessary, to obtain missing information.
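The protocol does not state which reliability statistic will be used. As one possibility, agreement between the two assessors on the TRD level could be summarised with Cohen's kappa; the choice of kappa and the function name below are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same studies.

    ASSUMPTION: kappa is used here as an example statistic; the protocol only
    says that reliability "will be quantified".
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same, non-empty set of studies")
    n = len(rater_a)
    # Observed proportion of studies on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical TRD levels (number of failed trials) rated for four studies.
    print(cohen_kappa([1, 1, 2, 2], [1, 1, 2, 1]))
```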

Risk of bias assessment

Risk of bias of included studies will be assessed at outcome level for the two primary outcomes, using the Risk of Bias 2 tool described in the Cochrane Handbook for Systematic Reviews of Interventions.54 We will assess the following domains: bias arising from the randomisation process, bias due to deviations from intended interventions, bias due to missing outcome data, bias in measurement of the outcome and bias in selection of the reported result. Two independent raters will perform the assessment. If the raters disagree, the final rating will be made by consensus with the involvement of another member of the review group. We will contact corresponding authors if necessary, to obtain missing information. Overall risk of bias of each study will be categorised as follows: studies will be classified as having low risk of bias if all domains were rated at low risk of bias; some concerns if none were rated as high risk of bias but at least one domain raised some concerns; high risk of bias if at least one domain was rated at high risk of bias or multiple domains raise some concerns in a way that substantially lowers confidence in the results.
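The overall classification rule described above can be written as a short function. The part of the rule that depends on reviewer judgement (multiple domains with some concerns that substantially lower confidence) is represented here by a hypothetical boolean flag; the names are illustrative and not taken from the Risk of Bias 2 tool itself.

```python
LOW, SOME, HIGH = "low", "some concerns", "high"


def overall_risk_of_bias(domains, concerns_lower_confidence=False):
    """Derive an overall rating from per-domain ratings.

    domains: mapping of domain name -> one of LOW, SOME, HIGH.
    concerns_lower_confidence: reviewer judgement (hypothetical flag) that
        several 'some concerns' ratings together substantially lower
        confidence in the result.
    """
    ratings = list(domains.values())
    if any(r not in (LOW, SOME, HIGH) for r in ratings):
        raise ValueError("unknown rating")
    if HIGH in ratings:
        return HIGH
    n_some = ratings.count(SOME)
    if n_some > 1 and concerns_lower_confidence:
        return HIGH
    if n_some >= 1:
        return SOME
    return LOW


if __name__ == "__main__":
    example = {
        "randomisation process": LOW,
        "deviations from intended interventions": SOME,
        "missing outcome data": LOW,
        "measurement of the outcome": LOW,
        "selection of the reported result": LOW,
    }
    print(overall_risk_of_bias(example))
```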

Statistical analysis

Synthesis of results

We will analyse the data using the meta55 and netmeta56 packages in R.57 Characteristics and findings of included studies will be presented in text and tables. We will analyse dichotomous outcomes on an intention-to-treat basis: all drop-outs from treatment will be assumed to have had negative outcomes (ie, non-response).
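To illustrate the intention-to-treat rule, the sketch below contrasts the response rate with drop-outs counted as non-responders against a completers-only rate. It is written in Python purely for illustration (the planned analysis uses R), and the class and field names are hypothetical. It assumes that responders are counted only among participants with outcome data.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    randomised: int   # all participants randomised to the arm
    dropouts: int     # participants who left the trial early for any reason
    responders: int   # responders among participants with outcome data

    def itt_response_rate(self) -> float:
        # Drop-outs are assumed non-responders, so the denominator is
        # everyone randomised.
        return self.responders / self.randomised

    def completer_response_rate(self) -> float:
        # Shown for contrast only; not the planned analysis.
        return self.responders / (self.randomised - self.dropouts)

    def dropout_rate(self) -> float:
        # All-cause drop-out, the acceptability outcome.
        return self.dropouts / self.randomised


if __name__ == "__main__":
    arm = Arm(randomised=100, dropouts=20, responders=40)
    print(arm.itt_response_rate(), arm.completer_response_rate(), arm.dropout_rate())
```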

Pairwise meta-analysis

In order to answer our three clinical questions (see the Objectives section), we conduct three main analyses. The first clinical question relates to whether treating TRD with next-step treatment strategies is beneficial and/or safe. Via pairwise meta-analysis, we will obtain estimates of efficacy and acceptability of different treatment strategies, compared with both each other and control conditions. We will perform a random-effects meta-analysis on the eight types of next-step treatments as described in online supplemental appendix 2. For each pairwise comparison, we will synthesise data to obtain summary standardised mean differences (SMD, Hedges’ g) for continuous outcomes or ORs for dichotomous outcomes, both with 95% CIs.58 59
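As a rough illustration of the quantities involved, the following sketch computes a log OR and Hedges' g for a single study and pools study effects in a random-effects model. The DerSimonian-Laird estimator of the heterogeneity variance and the 0.5 continuity correction are assumptions made for this example; the protocol does not specify either, and the planned analysis relies on the R packages named above rather than on code like this.

```python
import math


def log_odds_ratio(events_1, n_1, events_2, n_2):
    """Log OR and its variance for arm 1 vs arm 2."""
    a, b, c, d = events_1, n_1 - events_1, events_2, n_2 - events_2
    if 0 in (a, b, c, d):
        # ASSUMPTION: 0.5 continuity correction when any cell is empty.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Hedges' g (small-sample corrected SMD) and its approximate variance."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / df)
    d = (mean_1 - mean_2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_1 + n_2) / (n_1 * n_2) + d**2 / (2 * (n_1 + n_2))
    return j * d, j**2 * var_d


def random_effects_pool(effects, variances):
    """Pooled effect, its SE, tau^2 and Cochran's Q (DerSimonian-Laird)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re)), tau2, q


if __name__ == "__main__":
    # Hypothetical responder counts: (treatment events, n, control events, n).
    studies = [(20, 40, 10, 40), (30, 60, 22, 60), (15, 50, 12, 50)]
    ys, vs = zip(*(log_odds_ratio(*s) for s in studies))
    pooled, se, tau2, q = random_effects_pool(ys, vs)
    lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"OR {math.exp(pooled):.2f} (95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```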

Network meta-analysis

The second clinical question we aim to answer is how various next-step treatment strategies compare with each other. We will conduct an NMA to examine comparative efficacy and acceptability of the next-step treatment strategies. In line with a previous protocol,37 we assume that patients who fulfil the inclusion criteria are equally likely to be randomised to any of the treatments that we plan to compare. If the collected studies appear to be sufficiently homogeneous with respect to distribution of effect modifiers (see Assessment of transitivity assumption section below), we will conduct a random effects NMA to synthesise all evidence for each outcome, and obtain a comprehensive ranking of all treatments. We will use arm-level data and the binomial likelihood for dichotomous outcomes. We will account for correlations induced by possible multiarmed studies by employing multivariate distributions. We will assume a single heterogeneity parameter for each network. We will present summary ORs or SMD for all pairwise comparisons in a league table. To rank the various treatments for each outcome, we will use the surface under the cumulative ranking curve and the mean ranks.
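For readers unfamiliar with the ranking metrics, the sketch below shows how the surface under the cumulative ranking curve (SUCRA) and the mean rank follow from a treatment's rank probabilities. The rank probabilities are hypothetical inputs; how they are obtained from the fitted network is outside the scope of this example.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    rank_probs[k] is the probability that the treatment has rank k + 1
    (rank 1 = best) among a = len(rank_probs) treatments.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("need at least two treatments")
    cumulative, total = 0.0, 0.0
    # Sum the cumulative probabilities of ranks 1 .. a-1.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)


def mean_rank(rank_probs):
    """Expected rank of the treatment (1 = best)."""
    return sum(k * p for k, p in enumerate(rank_probs, start=1))


if __name__ == "__main__":
    # Hypothetical rank probabilities for three treatments.
    probs = {"A": [1.0, 0.0, 0.0], "B": [0.0, 0.5, 0.5], "C": [0.0, 0.5, 0.5]}
    for name, p in probs.items():
        print(name, sucra(p), mean_rank(p))
```

The two metrics carry the same information: with a treatments, SUCRA equals (a − mean rank)/(a − 1).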

Meta-regression analysis of treatment resistance

In order to answer our third clinical question, we will perform a meta-regression that evaluates the impact of different levels of TRD on the primary outcomes. The level of TRD will be defined either as (1) the number of failed (antidepressant) treatment trials (including augmentation and psychotherapy) that were required as inclusion criterion for the study8 or (2) a dichotomy, slightly adapted from Conway et al:7 TRD level I (failure of 1 or 2 adequate dose-duration antidepressants or psychotherapy from different classes (either in combination or succession)) or level II (failure of ≥3 adequate antidepressant or psychotherapy trials from different classes (either in combination or succession)). If sufficient data are available, we aim to use the first, more detailed, grouping of TRD. If this proves unfeasible, we will employ the second definition.
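The dichotomised definition maps the number of failed trials required for inclusion onto two levels, which can be expressed as follows (the function name is illustrative):

```python
def trd_level(failed_trials: int) -> str:
    """Map the number of failed adequate trials required for study inclusion
    onto the dichotomised TRD level (I: 1 or 2 failures, II: 3 or more)."""
    if failed_trials < 1:
        raise ValueError("TRD requires at least one failed adequate trial")
    return "I" if failed_trials <= 2 else "II"


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, trd_level(n))
```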

Alternative geometry of treatment network structure

As described in online supplemental appendix 2, we aim to group treatments by presumed mechanism of action (eg, SSRI), and by whether treatment was given as addition (augmentation) or replacement (switching) of the previous treatment. First, similar to Carter et al,29 we will analyse the eight types of treatment so defined. Second, we aim to make detailed comparisons between individual treatments. We aim to reduce heterogeneity in treatment nodes as much as possible, depending on how much data will be available for analysis.60 We will not analyse the antidepressants in the ‘other’ subgroup at the subgroup level, because grouping these heterogeneous antidepressants together is expected to introduce heterogeneity and lacks clinical relevance (ie, we will include them either in the general antidepressant group or as individual antidepressants). In case of atypical antipsychotics, we account for differences in low or high doses, if possible.26 28 In case of neuromodulation treatment and psychological therapy, if the data do not allow for separate analysis of switch strategies and augmentation strategies, these strategies will be (partially) clustered within a ‘mixed’ strategy. We will consider clustering the comparator interventions in ‘placebo’ (ie, pill placebo, psychological placebo and sham neuromodulation), ‘pharmacological control’ (ie, continuation of treatment and TAU) and ‘no treatment’ (ie, NT and WL) groups, if the groups are sufficiently homogeneous and consistent.
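The proposed clustering of comparator interventions amounts to a lookup from the individual comparator to its node in the alternative network geometry. A minimal sketch, with hypothetical label strings:

```python
# Comparator -> clustered node, following the grouping described above.
# The label strings themselves are hypothetical.
COMPARATOR_CLUSTERS = {
    "pill placebo": "placebo",
    "psychological placebo": "placebo",
    "sham neuromodulation": "placebo",
    "continuation of antidepressant treatment": "pharmacological control",
    "treatment as usual": "pharmacological control",
    "no treatment": "no treatment",
    "waiting list": "no treatment",
}


def comparator_node(label: str, clustered: bool = True) -> str:
    """Return the network node for a comparator arm.

    With clustered=False each comparator keeps its own node, which is the
    fallback if the clusters turn out to be heterogeneous or inconsistent.
    """
    key = label.strip().lower()
    if key not in COMPARATOR_CLUSTERS:
        raise KeyError(f"unknown comparator: {label}")
    return COMPARATOR_CLUSTERS[key] if clustered else key


if __name__ == "__main__":
    print(comparator_node("Sham neuromodulation"))
    print(comparator_node("Waiting list", clustered=False))
```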

Assessment of heterogeneity (pairwise meta-analysis)

As in Furukawa et al,37 in the pairwise meta-analysis we will assess heterogeneity by visually inspecting the forest plots and by comparing the estimated heterogeneity variance with the corresponding empirical distribution.61 Moreover, we report the I² statistic with 95% CI,62 63 using the proposed thresholds in the Cochrane Handbook for interpretation (eg, 0%–40% might not be important, 30%–60% might represent moderate heterogeneity, 50%–90% might represent substantial heterogeneity, 75%–100% might represent considerable heterogeneity).54 In the NMA, we estimate the heterogeneity variance and compare it with the empirical distribution.
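The statistic and its interpretation bands can be sketched as follows. Because the bands overlap, a value may fall into two of them, so the helper returns every band that applies. Treating the band limits as inclusive is an assumption, and the CI calculation is omitted.

```python
def i_squared(q: float, df: int) -> float:
    """I-squared (in %) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Overlapping interpretation bands as quoted in the text.
# ASSUMPTION: band limits are treated as inclusive.
BANDS = [
    (0, 40, "might not be important"),
    (30, 60, "moderate"),
    (50, 90, "substantial"),
    (75, 100, "considerable"),
]


def interpret(i2: float):
    """Return all bands that contain the given I-squared value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]


if __name__ == "__main__":
    value = i_squared(q=20.0, df=10)
    print(value, interpret(value))
```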

Assessment of the transitivity assumption (NMA)

We will investigate the distribution of clinical and methodological variables that can act as effect modifiers across treatment comparisons. We will examine levels of TRD as a possible violation of the transitivity assumption, as higher levels of TRD might be accompanied by differences in population characteristics.2 Clinical features which moderate efficacy of antidepressants include bipolarity64 and psychotic features.65 We ensure transitivity regarding these variables by limiting our samples to participants with non-psychotic, unipolar depression. Other variables that may influence our primary outcomes include: age, depressive severity at baseline,66 67 dosing schedule68 and whether inclusion criteria of studies concerned non-response or non-remission. We will investigate whether these variables are similarly distributed across studies grouped by comparison. In order to account for the potential of placebo to violate the transitivity assumption, the comparability of placebo-controlled studies with those providing head-to-head evidence will be examined carefully.69 70

Assessment of inconsistency

We employ local and global methods to evaluate consistency of the network,71 using the node splitting approach72 and design-by-treatment interaction test,73 respectively. We evaluate consistency in the entire network by calculating the I² for network heterogeneity, inconsistency and for both.73 74 Because tests for inconsistency are known to have low power,75 and 10% of evidence loops published in medical literature are expected to be inconsistent,76 we interpret statistical inference about inconsistency with caution; possible sources of inconsistency will be explored even in the absence of evidence for inconsistency.
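The idea behind local inconsistency checks is to compare direct evidence on a comparison with the indirect evidence for it. The sketch below does this for a single triangular loop with independent two-arm evidence; it is a deliberately simplified stand-in for node splitting, which handles whole networks and multi-arm trials. Function names and the example numbers are hypothetical.

```python
import math


def indirect_estimate(d_ab, var_ab, d_cb, var_cb):
    """Indirect effect of A vs C through the common comparator B.

    d_ab and d_cb are effects (eg, log ORs) of A vs B and C vs B.
    Valid only if the transitivity assumption holds.
    """
    return d_ab - d_cb, var_ab + var_cb


def inconsistency_test(direct, var_direct, indirect, var_indirect):
    """Difference between direct and indirect estimates with a z test."""
    diff = direct - indirect
    z = diff / math.sqrt(var_direct + var_indirect)
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p


if __name__ == "__main__":
    # Hypothetical log ORs and variances.
    ind, var_ind = indirect_estimate(d_ab=0.8, var_ab=0.04, d_cb=0.3, var_cb=0.05)
    print(inconsistency_test(direct=0.9, var_direct=0.06, indirect=ind, var_indirect=var_ind))
```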

Assessment of publication bias and small study effects

We use comparison-adjusted77 and contour-enhanced78 funnel plots to investigate whether results in imprecise trials differ from those in more precise trials. We will run network meta-regression models to detect associations between study size and effect size.79
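In a comparison-adjusted funnel plot, each study effect is centred on the summary effect of its own comparison before being plotted against its standard error. The sketch below computes these centred effects; using an inverse-variance fixed-effect mean per comparison is an assumption made to keep the example short, and the data are hypothetical.

```python
from collections import defaultdict


def comparison_adjusted_effects(studies):
    """Centre each study effect on its comparison-specific summary effect.

    studies: list of (comparison, effect, standard_error) tuples.
    Returns a list of (centred_effect, standard_error) pairs, ie, the
    x and y coordinates of a comparison-adjusted funnel plot.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # comparison -> [sum(w*y), sum(w)]
    for comparison, effect, se in studies:
        weight = 1.0 / se**2
        sums[comparison][0] += weight * effect
        sums[comparison][1] += weight
    means = {c: wy / w for c, (wy, w) in sums.items()}
    return [(effect - means[c], se) for c, effect, se in studies]


if __name__ == "__main__":
    data = [
        ("A vs placebo", 0.2, 0.1),
        ("A vs placebo", 0.6, 0.1),
        ("B vs placebo", 1.0, 0.2),
    ]
    for point in comparison_adjusted_effects(data):
        print(point)
```

For the plot to be interpretable, all comparisons need to be expressed in a consistent direction (for example, active treatment vs control).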

Exploring heterogeneity and sensitivity analyses

We will explore whether treatment effects for the two primary outcomes are robust in subgroup analyses and network meta-regression using the following characteristics:80 81

  1. Level of treatment resistance (see Meta-regression analysis of treatment resistance).

  2. Study year.

  3. Depression severity at baseline.

  4. Proportion of participants to be allocated to placebo.

  5. Number of recruiting centres (single centre vs multicentric studies).
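A meta-regression on one of the characteristics listed above relates study effects to a study-level covariate. The sketch below is a much simplified version: a fixed-effect, inverse-variance weighted regression with a single covariate for one pairwise comparison, rather than the network meta-regression planned here. Names and data are hypothetical.

```python
import math


def weighted_meta_regression(effects, variances, covariate):
    """Inverse-variance weighted regression of effect on one covariate.

    Returns (intercept, slope, se_slope). SIMPLIFICATION: fixed-effect
    weights and a single pairwise comparison; no between-study variance.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    if sxx == 0:
        raise ValueError("covariate does not vary across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, math.sqrt(1.0 / sxx)


if __name__ == "__main__":
    # Hypothetical log ORs by number of failed trials required for inclusion.
    intercept, slope, se = weighted_meta_regression(
        effects=[0.9, 0.7, 0.4, 0.3],
        variances=[0.05, 0.04, 0.06, 0.08],
        covariate=[1, 2, 3, 4],
    )
    print(intercept, slope, se)
```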

Sensitivity of our conclusions for the two primary outcomes will be evaluated by analysing:

  1. Only studies with reported SD rather than imputed.

  2. Only studies that required at least two treatment trial failures in their definition of TRD.

  3. Only studies with a low risk of bias.

  4. Only studies with a prospective ascertainment of at least one treatment trial failure.

Grading of Recommendations Assessment, Development and Evaluation quality assessment

We will assess certainty of evidence contributing to network estimates of the primary outcomes by using Confidence in Network Meta-Analysis,82 and according to the Grading of Recommendations Assessment, Development and Evaluation framework.71

Patient and public involvement

No patients or members of the public will be involved in conducting this study.

Ethics and dissemination

This review does not require ethical approval. Findings will be submitted for publication in a peer-reviewed scientific journal. The data set will be made available.

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Twitter @And_Cipriani, @Toshi_FRKW

  • Contributors HGR is guarantor. JJM, PFPvE and HGR devised the study and drafted the protocol. JJM designed the search strategy. All authors contributed to the development of the selection criteria. ID, SvB, TAF and AC assisted in drafting the protocol. TAF and AC assisted in designing the study and provided statistical expertise. All authors read, provided feedback and approved the final manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Disclaimer The views expressed are those of the authors and not necessarily those of the UK National Health Service, the NIHR, or the UK Department of Health.

  • Competing interests JJM: None to declare. PFPvE reports grants from ZonMW and speaking fees from Janssen outside of the submitted work. AC is supported by the National Institute for Health Research (NIHR) Oxford Cognitive Health Clinical Research Facility, by an NIHR Research Professorship (grant RP-2017-08-ST2-006), by the NIHR Oxford and Thames Valley Applied Research Collaboration and by the NIHR Oxford Health Biomedical Research Centre (grant BRC-1215-20005). He has received research and consultancy fees from INCiPiT (Italian Network for Paediatric Trials), CARIPLO Foundation and Angelini Pharma. ID: None to declare. SvB: None to declare. TAF reports grants and personal fees from Mitsubishi-Tanabe, personal fees from MSD, grants and personal fees from Shionogi, outside the submitted work. In addition, TAF has a patent 2020-548587 concerning smartphone CBT apps pending, and intellectual properties for Kokoro-app licensed to Mitsubishi-Tanabe. HGR reports grants from ZonMW, Hersenstichting, EU Horizon 2020 and speaking fees from Lundbeck and Janssen outside of the submitted work.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.