Network meta-analysis for comparative effectiveness of treatments for chronic low back pain disorders: systematic review protocol
  1. Daniel L Belavy1,
  2. Ashish D Diwan2,
  3. Jon Ford3,4,
  4. Clint T Miller5,
  5. Andrew J Hahne4,
  6. Niamh Mundell6,
  7. Scott Tagliaferri6,
  8. Steven Bowe7,
  9. Hugo Pedder8,
  10. Tobias Saueressig9,
  11. Xiaohui Zhao10,
  12. Xiaolong Chen11,
  13. Arun Prasad Balasundaram4,
  14. Nitin Kumar Arora1,12,
  15. Patrick J Owen5
  1. 1Physiotherapy, Hochschule fur Gesundheit, Bochum, Nordrhein-Westfalen, Germany
  2. 2Department of Orthopaedic Surgery, Spine Service, St. George Hospital, Kogarah, New South Wales, Australia
  3. 3Advance Healthcare, Melbourne, Victoria, Australia
  4. 4Low Back Research Team, La Trobe University, Bundoora, Victoria, Australia
  5. 5Institute for Physical Activity and Nutrition (IPAN), Deakin University, Geelong, Victoria, Australia
  6. 6Deakin University, Geelong, Victoria, Australia
  7. 7Biostatistics Unit, Faculty of Health, Deakin University, Geelong, Victoria, Australia
  8. 8Bristol Medical School, University of Bristol, Bristol, UK
  9. 9Physio Meets Science GmbH, Leimen, Germany
  10. 10Xi’an University of Architecture and Technology, Xi'an, China
  11. 11Department of Orthopaedic Surgery, Spine Service, University of New South Wales, Sydney, New South Wales, Australia
  12. 12Centre for Physiotherapy and Rehabilitation Sciences, Jamia Millia Islamia, New Delhi, India
  1. Correspondence to Professor Daniel L Belavy; belavy{at}


Introduction Chronic low back pain disorders (CLBDs) present a substantial societal burden; however, optimal treatment remains debated. To date, pairwise and network meta-analyses have evaluated individual treatment modes, yet a comparison of a wide range of common treatments is required to evaluate their relative effectiveness. Using network meta-analysis, we aim to evaluate the effectiveness of treatments (acupuncture, education or advice, electrophysical agents, exercise, manual therapies/manipulation, massage, the McKenzie method, pharmacotherapy, psychological therapies, surgery, epidural injections, percutaneous treatments, traction, physical therapy, multidisciplinary pain management, placebo, ‘usual care’ and/or no treatment) on pain intensity, disability and/or mental health in patients with CLBDs.

Methods and analysis Six electronic databases and reference lists of 285 prior systematic reviews were searched. Eligible studies will be randomised controlled/clinical trials (including cross-over and cluster designs) that examine individual treatments or treatment combinations in adult patients with CLBDs. Studies must be published in English, German or Chinese as a full-journal publication in a peer-reviewed journal. A narrative approach will be used to synthesise and report qualitative and quantitative data, and, where feasible, network meta-analyses will be performed. Reporting of the review will be informed by Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) guidance, including the network meta-analysis extension (PRISMA-NMA). The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach for network meta-analysis will be implemented for assessing the quality of the findings.

Ethics and dissemination Ethical approval is not required for this systematic review of the published data. Findings will be disseminated via peer-reviewed publication.

PROSPERO registration number CRD42020182039.

  • spine
  • back pain
  • rehabilitation medicine
  • pain management

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • This study will enable comparison of a wide variety of treatments for chronic low back disorders via network meta-analysis.

  • Our study will provide evidence that can be applied in clinical practice and in low back pain management guidelines.

  • The quality of evidence will be assessed via the Grading of Recommendations Assessment, Development and Evaluation (GRADE).

  • We will address the potential limitation of heterogeneous pathologies being combined into one population by performing subgroup analyses.

  • Baseline pain and disability are known to be predictive of outcome and we will account for this in the analysis.


Low back pain is the greatest cause of disability and lost productivity worldwide.1 In developed regions, such as the USA, Japan, Europe and Australia, the disease generates substantial financial costs.2 For example, healthcare expenditure is in excess of $A5 billion per year in Australia3 and US$100 billion per year in the USA.3 The majority of acute cases of back pain resolve without specific intervention,4 yet chronic low back pain disorders (CLBDs; ie, >12 weeks duration) generate the greatest proportion of economic burden5 and affect 20.1%±9.8% of the population worldwide.6 To reduce the global burden of disease of CLBDs, identifying and implementing the most effective treatment is an urgent priority.7

To date, pairwise meta-analyses have typically been used to evaluate individual treatment modes for CLBDs.8 Current recommendations include education, exercise, manual therapy, psychotherapy and multidisciplinary interventions.8 9 A comparison of a wide range of common treatments and their relative effectiveness for CLBDs is yet to be performed. This evidence would inform management guidelines and clinical decision making. These data would also increase the likelihood that patients receive the most efficacious treatment and/or avoid therapies with similar effectiveness but greater harms. Collectively, this would reduce financial burden at the societal level, as well as improve patient outcomes at the individual level.

Network meta-analysis (NMA) permits the ranking of a series of interventions as comparably more or less effective.10 11 NMA can incorporate data on multiple treatments simultaneously from randomised controlled trials (RCTs) that do not have similar comparator groups by synthesising direct and indirect evidence from a ‘network’ of studies.11–13 This overcomes a key limitation for pairwise meta-analysis and allows RCTs that do not have a non-treatment or minimal treatment control group to be included in the analysis.14 NMA has been used to examine the relative effectiveness of exercise training modalities in non-specific chronic low back pain,15 exercise and education for back pain prevention,16 treatments for lumbar disc herniation17 and medication for sciatica.18 However, this approach has not been considered simultaneously for a wide range of common treatments of CLBDs. In this study, we will examine CLBDs, encompassing radicular syndromes and non-specific low back pain.19 Our primary aim is to determine the relative effectiveness of a variety of common treatments for CLBDs via NMA.


This systematic review will be conducted in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)20 and the PRISMA extension for network meta-analyses (PRISMA-NMA).21 Covidence will be used for article screening and data extraction. This systematic review was prospectively registered on PROSPERO (CRD42020182039, submitted 24 April 2020) prior to initiating data extraction. We will use the PRISMA-P checklist when writing our report.22

Eligibility criteria

For inclusion, studies will be required to be full peer-reviewed publications (ie, grey literature including theses and conference abstracts will be excluded) in English, German or Chinese. A meta-epidemiological study by Nussbaumer-Streit et al23 found that when non-English studies were excluded from systematic reviews of clinical interventions, this had little impact on study conclusions. Furthermore, Cochrane guidelines24 are ambivalent on the inclusion of non-English language articles and the potential for introduction of bias in reviews. Prior work has suggested that inclusion or exclusion of non-English articles does not influence the effect estimates, but may narrow CIs.25 We pragmatically chose to include articles in languages in which the author team were fluent. All other inclusion criteria followed the Participants, Interventions, Comparators, Outcomes and Study design (PICOS) framework.21


Adults (≥18 years) with CLBDs. Chronic is defined as pain lasting 12 weeks or more.26 Since not all studies are consistent in their reporting of pain duration, we will use the following approach: if a study describes its population collectively as ‘chronic’, then it will be included. Failing this, if the inclusion criteria of the study specify a minimum of 12 weeks’ pain duration or if the median or mean reported duration of pain at baseline in participants is 12 weeks or more, then the study will be included. Recurrent pain (ie, <12 weeks’ duration of symptoms and pain-free period of at least 6 months27) is excluded. Low back disorder is defined as back pain with or without leg pain where there are no specific spinal pathologies (ie, vertebral fracture, malignancy, spinal infection, axial spondyloarthritis, cauda equina syndrome19). Spondylolisthesis, spondylosis, disc herniation, disc degeneration, scoliosis, deformity (eg, hemivertebrae) and radicular syndromes (eg, radicular pain (leg pain or sciatica), radiculopathy, spinal stenosis) are included.19 ‘Failed back surgery syndrome’ is included as this is not a specific disease.28 If a study only examines post-surgical pain (eg, a comparison of management for immediate post-surgical pain as an RCT), we will consider this iatrogenic pain and the study will be excluded.

Interventions and comparators

The treatment types to be included were determined by the current clinical practice guideline from the American College of Physicians29 and by the review areas of the Cochrane Back and Neck Group.30 A detailed list is included in online supplemental data A; however, in brief, we will examine acupuncture, education or advice, electrotherapy (including heat and ice electrotherapeutic modalities applied non-invasively), epidural injections, exercise training, manual therapies/manipulation, massage, the McKenzie method, pharmacotherapy, psychological therapies, percutaneous procedures, surgery, traction, physical therapy (otherwise not falling into specific treatment combination), placebo, multidisciplinary pain management, usual care (eg, general practitioner management) and no treatment (true control). Treatment combinations will be considered pending data availability and defined according to their component parts (see online supplemental data A for details) for primary and secondary treatment components. Pending articles included in the review, further subgroup classifications will be considered.


Pain intensity (eg, VAS, NRS, McGill Pain Questionnaire, or Box scale, other quantitative pain measures), disability (eg, ODI, RMDQ), mental health (eg, SF-36 MH subscale, depression, anxiety). Adverse events, participant drop-outs and funding sources will be extracted from the included articles.

Study design

RCTs, randomised clinical trials, randomised controlled cluster trials or randomised cross-over trials will be included.

Search strategy

Six databases (MEDLINE, SPORTDiscus, CINAHL, PsycINFO, EMBASE, CENTRAL) were searched with no restriction on publication dates. The search was initially performed from inception to 14 November 2019 and then was updated on 24 July 2020. Search terms were to find articles on (1) low back disorders and (2) RCTs (online supplemental data B). Low back disorder terms included those recommended by the Cochrane Back and Neck review group31 for non-specific back pain and radicular syndromes.19 The search terms for identifying RCTs were modelled on Cochrane sensitivity-maximising and precision-maximising search terms to be consistent across databases. Prior systematic reviews in English of any kind of treatment for chronic low back disorders in the last 10 years were screened via a search (January 1990 to July 2019) of MEDLINE, SPORTDiscus, CINAHL, PsycINFO, EMBASE and CENTRAL. Collectively, 285 such systematic reviews were identified. The complete reference lists of these reviews were collated and then screened to remove non-RCTs. Subsequently, 1783 additional references were identified, and after uploading to Covidence, 1008 duplicates were removed, leaving 775 new titles/abstracts. Furthermore, the reference lists of 17 relevant Cochrane reviews not published between January 1990 and July 2019 were screened: 269 additional references were added after discarding 394 duplicates. Following removal of duplicates, a total of 19 522 articles remained for screening.

Study selection

For each record, two independent assessors will screen the studies against the predetermined inclusion/exclusion criteria. Disagreements that cannot be resolved among the assessors will be addressed by an adjudicator. If unsure, the adjudicator will discuss with the broader study team. If still unsure, the study authors will be contacted for clarity. The process for determining study inclusion/exclusion is shown in figure 1.

Figure 1

The process for determining study inclusion/exclusion.

Data extraction

For each record, two independent assessors will extract the data. Disagreements that cannot be resolved among the assessors will be addressed by an adjudicator. The extracted information will cover publication metadata (ie, author, title, year, journal), study design (eg, two-arm or multi-arm parallel trial), number of participants, participant characteristics (eg, age and sex), interventions considered and outcome measures (pain, disability, mental health, adverse events and funding sources). Extracted outcome data (pain, disability, mental health) will be pre-intervention and post-intervention mean and SD. When available, data will be extracted for the following time-points: immediate (<1 day) effect of treatment, short-term (≥1 day but <3 months), intermediate-term (≥3 but <12 months) and long-term (≥12 months). Primary and any secondary intervention components will be labelled as per the protocol described in online supplemental data A.
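
The follow-up windows above amount to a simple classification rule. The sketch below is illustrative only: the function name `classify_timepoint` is hypothetical, and the conversion of 3 and 12 months to 91 and 365 days is an assumption made here, since the protocol states the boundaries in months.

```python
def classify_timepoint(follow_up_days: float) -> str:
    """Map a follow-up time in days to one of the four protocol windows.

    Assumption (not from the protocol): 3 months = 91 days, 12 months = 365 days.
    """
    if follow_up_days < 0:
        raise ValueError("follow-up time cannot be negative")
    if follow_up_days < 1:
        return "immediate"          # <1 day
    if follow_up_days < 91:
        return "short-term"         # >=1 day but <3 months
    if follow_up_days < 365:
        return "intermediate-term"  # >=3 but <12 months
    return "long-term"              # >=12 months


if __name__ == "__main__":
    for days in (0.5, 30, 91, 400):
        print(days, classify_timepoint(days))
```

Boundary values fall into the later window (for example, exactly 91 days is treated as intermediate-term), which matches the "≥" signs in the definitions.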

Data presented as medians or alternate measures of spread will be converted to mean and SD using established formulae.32 When only figures are presented (rather than numerical data within text), data will be extracted using ImageJ to measure the length (in pixels) of the axes to calibrate, and then the length in pixels of the data points of interest.33 When it is not possible to extract the required data, this information will be requested from the authors a minimum of three times over a 4-week period. Prior to commencing data extraction, this method will be piloted on 30 studies chosen at random. All discrepancies will be referred to an adjudicator.
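
The two conversions described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the figure axis is assumed to be linear, and the quartile-based formulae are one commonly used approximation for roughly normal data in reasonably large samples, which may differ from the formulae cited in reference 32.

```python
def pixels_to_value(point_px: float, axis_px: float,
                    axis_min: float, axis_max: float) -> float:
    """Convert a pixel distance measured from the axis origin to a data value.

    Assumes a linear axis whose full length (axis_px pixels) spans
    axis_min to axis_max.
    """
    if axis_px <= 0:
        raise ValueError("axis length in pixels must be positive")
    return axis_min + (point_px / axis_px) * (axis_max - axis_min)


def mean_sd_from_quartiles(q1: float, median: float, q3: float):
    """Approximate mean and SD from the median and quartiles.

    Assumption: approximately normal data; mean ~ (q1 + median + q3) / 3
    and SD ~ IQR / 1.35.
    """
    mean = (q1 + median + q3) / 3
    sd = (q3 - q1) / 1.35
    return mean, sd


if __name__ == "__main__":
    # A 0-10 pain axis drawn 400 px tall; the data point sits 180 px above 0.
    print(pixels_to_value(180, 400, 0, 10))
    print(mean_sd_from_quartiles(2, 4, 6))
```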

Due to the volume of potentially included articles, for each study, information on the population (type of low back pain (non-specific or radicular pain), and subpopulation (eg, ‘non-specific’, ‘low back pain not otherwise stated’, ‘disc degeneration’, ‘spondylolisthesis’, ‘spinal stenosis’, ‘radiculopathy’, ‘radicular pain’)) and intervention/comparator (intervention duration, free text entry of description of interventions, study-arm labels, primary and secondary intervention classifications (if relevant); see online supplemental data A) will be extracted first. Then, studies that examine different treatment classes (eg, exercise vs control, psychological therapies vs exercise, or surgery vs percutaneous therapies; see online supplemental data A) will be included in subsequent extraction and the remaining studies excluded. This approach will be undertaken because our primary research question concerns different classes of treatments; hence, studies that compare the same class of treatment (eg, exercise vs exercise, or surgery vs surgery) are less informative for this question.

Risk of bias

Two independent assessors will use the Cochrane Collaboration Risk of Bias34 to examine potential selection bias (random sequence generation and allocation concealment), performance bias (blinding of patients and personnel), detection bias (blinding of outcome assessment), attrition bias (incomplete outcome data), reporting bias (selective outcome reporting) and other biases. Cluster randomised trials will be assessed as recommended by the Cochrane Collaboration.35 The revised version of the risk of bias tool36 will not be used as it was, at initiation of the project, not yet recommended by the Cochrane Collaboration. For each source of bias, studies will be classified as having a low, high or unclear (if reporting was not sufficient to assess a particular domain) risk. All discrepancies will be referred to an adjudicator.

Two independent assessors will use the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach for NMA for assessing the quality of the evidence (online supplemental data C). We will use a range of equivalence of standardised mean difference (SMD), from −0.5 to 0.5, to evaluate imprecision and inconsistency.37 Publication bias will be assessed via statistical and non-statistical methods.38 Indirectness will be judged using Schünemann’s approach.39 Risk of bias will be downgraded by one level if >50% of participants were from studies with selection bias and performance bias. This criterion was selected because inadequate randomisation and lack of blinding may lead to an exaggeration of the intervention effect estimates.40–42 For the categories ‘imprecision’ and ‘inconsistency’, we will downgrade by one level if there are some concerns and two levels if there are major concerns. Indirectness will be downgraded by one level if deemed serious and two levels if deemed very serious. We will downgrade one level if publication bias is suspected. The GRADE approach43 44 will also be used to assess the quality of the evidence of pair-wise comparisons. All discrepancies will be referred to an adjudicator.

Statistical analyses

When studies are reverse scaled (ie, higher values indicate better outcomes rather than lower values), the mean in each group will be multiplied by −1 as recommended in the Cochrane Handbook. As all of the outcomes of interest will be continuous or ordinal, but could be measured on different scales, the SMD will be used as the effect estimate.45 A minimum of 50 participants will be required per class of treatment for it to be included in meta-analysis. We set this minimum to limit the impact of small study effects on the results of any particular class.45 Furthermore, because we are conducting an analysis of SMDs, small study effects are likely to be exacerbated as both the mean and the SD are likely to be estimated with greater variability in small studies, and for SMD both of these contribute to the treatment effect. To further investigate our choice of SMD as an effect measure, we will conduct sensitivity analyses with internal reference baseline SDs for each scale.46
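
As an illustration of the sign reversal and the SMD calculation, the sketch below computes a small-sample-corrected SMD (Hedges' g) from group means, SDs and sample sizes. The choice of Hedges' g, the approximate variance formula and the function name `hedges_g` are assumptions made for this example; the protocol does not state which SMD estimator will be used.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c, higher_is_better=False):
    """Return (g, var_g) for treatment versus comparator.

    If the scale is reversed (higher = better), both means are multiplied
    by -1 so that negative values always favour the treatment.
    """
    if higher_is_better:
        mean_t, mean_c = -mean_t, -mean_c
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    j = 1 - 3 / (4 * df - 1)          # small-sample correction factor
    g = j * d
    var_g = (n_t + n_c) / (n_t * n_c) + g ** 2 / (2 * (n_t + n_c))
    return g, var_g


if __name__ == "__main__":
    # Pain on a 0-10 scale (lower = better): treatment 3.0, comparator 4.0.
    print(hedges_g(3.0, 2.0, 30, 4.0, 2.0, 30))
    # A quality-of-life style scale (higher = better): sign is reversed.
    print(hedges_g(70.0, 10.0, 30, 60.0, 10.0, 30, higher_is_better=True))
```

Because both the pooled SD and the mean difference enter `g`, imprecise estimates of either in small studies carry through to the effect size, which is the concern raised in the paragraph above.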

Where a study does not report data in a form where the SD can be extracted or calculated,32 and authors are not able to fulfil data requests, SDs will be imputed and their impact evaluated in sensitivity analyses. To impute missing SDs, we will perform a regression of log(SD) on log(mean) in studies reporting SD following the approach of Marinho et al47, adjusting for measurement scale and follow-up time. We will then use this regression model to predict SDs that are missing.
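
The imputation idea can be sketched with a simple least-squares fit of log(SD) on log(mean). This is a deliberately reduced illustration: it omits the adjustment for measurement scale and follow-up time described above, it requires strictly positive means and SDs, and the function names are hypothetical.

```python
import math


def fit_log_sd_model(means, sds):
    """Ordinary least squares fit of log(SD) = intercept + slope * log(mean)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("means must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def impute_sd(mean, intercept, slope):
    """Predict a missing SD from a reported mean using the fitted model."""
    return math.exp(intercept + slope * math.log(mean))


if __name__ == "__main__":
    # Toy data in which the SD is exactly half the mean.
    intercept, slope = fit_log_sd_model([2.0, 4.0, 8.0], [1.0, 2.0, 4.0])
    print(impute_sd(6.0, intercept, slope))
```

In a sensitivity analysis, results with and without the imputed values would be compared, as the protocol describes.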

Cluster randomised trials will be included in the analysis as per Cochrane guidance. Sensitivity analysis will be conducted in pairwise analyses with a range of different intraclass correlation co-efficients (ICCs) to check the robustness of the results.48 For crossover trial designs, we will include the estimated relative treatment effect from the study where possible, where the authors have tested for carryover effects and found no evidence of this. Where this is not the case, we will only include the first period of the crossover trial. In time-course Model-Based Network Meta Analyses (MBNMA), only the inclusion of the first time-period will be possible.
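
For cluster randomised trials, one standard adjustment described in Cochrane guidance is to shrink the sample size by the design effect, 1 + (M − 1) × ICC, where M is the average cluster size. The sketch below applies this over a range of ICC values, as a sensitivity analysis might. The function name and the example numbers are hypothetical.

```python
def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size after dividing by the design effect 1 + (M - 1) * ICC."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC must lie between 0 and 1")
    design_effect = 1 + (mean_cluster_size - 1) * icc
    return n / design_effect


if __name__ == "__main__":
    # Hypothetical trial arm: 200 participants in clusters of 21 on average.
    for icc in (0.01, 0.05, 0.10):
        print(icc, round(effective_sample_size(200, 21, icc), 1))
```

With an ICC of 0.05 and clusters of 21, the design effect is 2, so the arm contributes the information of roughly 100 independently randomised participants.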

Network meta-analysis

Bayesian NMA will be performed at discrete time-points (immediate (<1 day) effect of treatment, short-term (≥1 day but <3 months), intermediate-term (≥3 but <12 months) and long-term (≥12 months)) using the R package multinma.49 Time-course MBNMA will be conducted using the R package MBNMAtime.50 51 This package enables the incorporation of multiple time-points per study in Bayesian NMA to inform estimates of effect size over time. Network connectivity will be explored via network plots. Network plots help to visualise how the evidence in the network is connected and allow identification of which studies compare which treatments. This aids in understanding which treatment effects can be estimated. The time-course relationship will be examined by a time plot, which is a plot of the raw study responses over time. Time plots help to elucidate the underlying time course of the treatment effects and help to identify which statistical time model is appropriate.

Where data allow and where there is a plausible clinical reason for doing so, treatment effects will be assumed to be common or exchangeable within a class. This allows for treatments to be nested within a class, which relaxes assumptions regarding the similarity of interventions while improving network connectivity.13 We will use the deviance information criterion to compare the different models (common/exchangeable class effect models, time-course models) to assess their parsimony.52

For standard NMA models we will rank the relative effects of each treatment/class, and for time-course MBNMA models we will rank the relative effects of each treatment/class for each time-course parameter. We will also rank the full area under the time-course function for each treatment/class at 0–3 months, 0–6 months and 0–12 months. Cumulative rankograms will be plotted; these show the range of rankings of different treatments/classes for each ranked parameter. Sensitivity of model results to the choice of prior distributions will be investigated.
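
To show what a cumulative rankogram summarises, the sketch below derives rank probabilities from posterior draws of treatment effects and accumulates them. It is a generic illustration, not the output of multinma or MBNMAtime; the function names and toy draws are hypothetical. It also reports the SUCRA, a common single-number summary of the area under the cumulative rankogram that the protocol itself does not mention.

```python
def rank_probabilities(samples, lower_is_better=True):
    """Return {treatment: [P(rank 1), P(rank 2), ...]} from posterior draws.

    `samples` maps each treatment to a list of draws; all lists must be
    the same length, with draw i taken jointly across treatments.
    """
    names = list(samples)
    n_draws = len(samples[names[0]])
    counts = {t: [0] * len(names) for t in names}
    for i in range(n_draws):
        ordered = sorted(names, key=lambda t: samples[t][i],
                         reverse=not lower_is_better)
        for rank, t in enumerate(ordered):
            counts[t][rank] += 1
    return {t: [c / n_draws for c in counts[t]] for t in names}


def cumulative(rank_probs):
    """Cumulative rank probabilities, ie, the points of a cumulative rankogram."""
    out, running = [], 0.0
    for p in rank_probs:
        running += p
        out.append(running)
    return out


def sucra(rank_probs):
    """Mean of the cumulative probabilities over ranks 1..(a - 1)."""
    cum = cumulative(rank_probs)
    return sum(cum[:-1]) / (len(rank_probs) - 1)


if __name__ == "__main__":
    draws = {  # toy SMD draws versus a common reference; lower = less pain
        "A": [-1.0, -0.8, -1.2, -0.2],
        "B": [-0.5, -0.9, -0.4, -0.6],
        "C": [0.0, 0.1, -0.1, 0.0],
    }
    probs = rank_probabilities(draws)
    for name, p in probs.items():
        print(name, p, cumulative(p), sucra(p))
```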

Assessing key assumptions of pairwise and NMA

The authors recommended a strong and rigorous focus on the evaluation of the similarity and homogeneity assumptions.

Assessment of similarity and homogeneity assumptions

A qualitative assessment of the clinical similarity of the different populations and treatments will be performed using important variables such as baseline pain intensity, baseline disability and pain duration. Between-study SD will be estimated and reported from random effects models, and the impacts of subgrouping or meta-regression on this will be examined. In pairwise meta-analyses, data will be synthesised as SMDs with accompanying 95% CIs using a frequentist random effects model with a restricted maximum likelihood estimator for the between-study variance Tau². These analyses will be carried out with the R package ‘metafor’.53 Visual inspection of the forest plots, statistical estimates of heterogeneity (I², Tau) and 95% prediction intervals will be used to assess the validity of homogeneity assumptions. Small study effects and publication bias will be assessed for each pairwise comparison by visual inspection of the contour-enhanced funnel plot. Outlier and influential study analysis will be performed with metafor for pairwise meta-analyses to further detect potential heterogeneity.54 Meta-regression with potential effect modifiers (pre-intervention pain severity and disability, baseline psychological conditions, presence of co-interventions and type of low back pain)55–57 will be used to further check for potential heterogeneity among the pairwise comparisons.58
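
The heterogeneity statistics named above can be illustrated with a small random effects calculation. Note the substitution: this sketch uses the closed-form DerSimonian-Laird estimator of Tau² rather than the restricted maximum likelihood estimator specified in the protocol, because the former needs no iterative fitting. It is not a replacement for metafor, the function name is hypothetical, and the prediction interval is omitted.

```python
import math


def random_effects_dl(effects, variances):
    """Random effects pooling with the DerSimonian-Laird Tau^2 estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
        "q": q,
    }


if __name__ == "__main__":
    # Three toy SMDs with equal sampling variances.
    print(random_effects_dl([0.2, 0.5, 0.8], [0.04, 0.04, 0.04]))
```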

In the presence of effect modification in pairwise comparisons (identified using meta-regression), we will also fit network meta-regression with these potential effect modifiers for NMAs conducted at each time-point using the package multinma.49

Consistency assumptions

For the Bayesian approach, consistency assumptions will be first checked via an unrelated mean effects (UME) model, which does not assume consistency.59 The UME model only synthesises direct relative effects between each arm in a study and the study reference treatment. If the consistency assumption holds, then the results from the UME and NMA models will be similar. Changes in between-study SD or residual deviance are also suggestive of inconsistency. If comparison between UME and NMA models is suggestive of inconsistency, node-splitting will be performed.60 In node-splitting, network contrasts are split into direct and indirect evidence contributions, which can then be compared to examine their similarity.
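
The logic of comparing direct and indirect evidence can be shown with a much simpler frequentist calculation on a single three-treatment loop (often called the Bucher method). This is a stand-in chosen for illustration, not the Bayesian node-splitting model the protocol will use; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def indirect_estimate(d_ac, var_ac, d_bc, var_bc):
    """Indirect A-vs-B effect through common comparator C: d_AC - d_BC."""
    return d_ac - d_bc, var_ac + var_bc


def inconsistency_test(d_direct, var_direct, d_indirect, var_indirect):
    """z statistic and two-sided p value for direct minus indirect evidence."""
    diff = d_direct - d_indirect
    se = math.sqrt(var_direct + var_indirect)
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p


if __name__ == "__main__":
    # Toy SMDs: A vs C = -0.6, B vs C = -0.2, direct A vs B = -0.1.
    d_ind, var_ind = indirect_estimate(-0.6, 0.02, -0.2, 0.02)
    print(d_ind, var_ind)
    print(inconsistency_test(-0.1, 0.05, d_ind, var_ind))
```

A large difference between the direct and indirect estimates relative to its standard error would point to inconsistency in that part of the network.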

Additional assumptions required for analysis of time-course data

Given that data will be reported at different follow-up times in different studies, information is unlikely to be available for all treatments at all time-points of interest. For this reason, additional assumptions regarding specific parameters for treatments/classes may be required. For example, in the case of a treatment for which information is only available at shorter follow-up times, explicit assumptions regarding its long-term efficacy will be required. The treatment’s long-term efficacy could be assumed to be the same as (or similar to) that of another treatment in the network that might have a similar mechanism of action (eg, within the same class), for which long-term data is available. Alternatively, it could be assigned a specific value or an informative prior as determined by clinical expertise. In such an example, long-term results for this treatment will therefore be sensitive to these assumptions, and results will be interpreted accordingly.51 Assumptions made in this way will be clearly stated and justified.

Subgroup and sensitivity analyses

Pending data availability, we will perform subgroup analyses to explore whether inconsistency/heterogeneity and group differences in the outcomes are influenced by type of low back disorder (eg, non-specific chronic low back pain, radicular syndrome), type of treatment (eg, surgical, pharmacological) or by exclusion of the multidisciplinary node and the physical therapy (otherwise not falling into specific treatment combination) node from analyses. The treatment node may be a source of significant heterogeneity/inconsistency for the overall NMA due to the variability of this treatment definition compared with other interventions. Subgroup analysis focussing on key participant or study characteristics can produce smaller, more homogeneous networks and can be a good strategy to analyse inconsistency/heterogeneity with fewer assumptions and pitfalls than NMA meta-regression.61 If we are unable to identify the source of inconsistency, we will highlight that this limits the usefulness of the analysis for drawing meaningful conclusions in such a heterogeneous population.

Further, pending data availability, we will consider the following sensitivity analyses:

  • Excluding studies with imputed missing SD and imputed medians.

  • Study sample size: impact of studies including fewer than 20 participants in all study-arms.

  • Dropout numbers and handling of dropouts within studies: the impact of the proportion of dropouts (if reported) and the kind of analysis individual studies performed (eg, analysing all participants using imputation of missing data vs analysing complete cases only).

  • Comparison of class effect models to a model with fully independent treatment effects that assume no within-class similarity, to assess the statistical validity of class assumptions.

  • Secondary treatment components (see online supplemental data A): the impact of treatment combinations where secondary classes of treatment are present in all arms will be considered by fitting models that incorporate combinations as different nodes in the network. This can be used to assess the assumption of additivity of combined treatments. We will also investigate the impact of ordering of primary/secondary treatment components by fitting a model in which the order is ignored (eg, ‘Physical therapy + massage’ assumed to be equivalent to ‘Massage + physical therapy’).

  • Secondary treatment components (see online supplemental data A): the impact on effect estimates of when secondary treatments are included will be assessed via a sensitivity analysis excluding those interventions with a secondary treatment component.

  • As some osteopathic interventions may include visceral techniques not declared in the original methods of the study, the impact of removing this from the manual therapy node will be examined.

  • Excluding unclear generic nodes (eg, physical therapy otherwise not falling into specific treatment combination).

  • Risk of bias: To examine the influence of specific studies/comparisons on the treatment rankings we will conduct a threshold analysis where possible51 using the R package nmathresh.

  • Choice of SMD as an effect measure by using internal reference baseline SDs for analysis.46


This NMA will determine the relative effectiveness of a variety of common treatments for CLBDs. Conducting NMA on this topic constitutes a shift towards the highest level of medical evidence.62 Our NMA has a much broader scope than prior work, such as that concerned solely with pharmacotherapy,63–66 exercise training,15 67 68 traditional Chinese medicine69 or psychotherapy.70 Moreover, the broad inclusion criteria and number of interventions considered in our NMA will result in a greater number of included interventions than previous broad NMAs that examined non-pharmacotherapy71 and surgery-based interventions,72 which included 31 and 12 interventions, respectively. The breadth of our NMA is important given that CLBDs are inherently heterogeneous, yet progenitors do not influence decision making regarding treatment sought.8 For this reason, CLBDs (excluding specific causes) are commonly treated in line with generic clinical guidelines.73 This underpins the importance of our NMA, as these guidelines do not distinguish whether one treatment is superior to another for this collective of patients with chronic pain. Given the lack of evidence that treatment efficacy differs by underlying pain progenitor, we believe it is reasonable to assume exchangeability of these studies and transitivity within the network in terms of population. Other than recent suggestions that machine learning74 may one day identify evidence-based subgroups that respond ‘better’ to specific treatments, we surmise that our NMA will markedly contribute to overcoming current limitations in the management of CLBDs pertaining to treatment decision making.

To our knowledge, there is only one other NMA currently being conducted with a similar scope to our protocol.75 Our NMA overcomes several cardinal limitations of this protocol: (1) we consider CLBD, rather than solely non-specific low back pain; (2) we consider additional languages for article inclusion, rather than English only; and (3) our treatment classification is more nuanced, rather than simplistic (eg, the other protocol typically considers two types of treatment within a particular class). Of note, we registered our systematic review prior to publication of this other protocol, and it is unclear when their work is due to be published.

Despite the many strengths of our proposed NMA, we would be remiss not to acknowledge potential limitations. First, because radicular syndromes are included in the patient population, it might be necessary to analyse this population in different networks/subsets, as the presence of radicular symptoms may be an effect modifier76 and lead to intransitivity. Second, we do not consider multicomponent interventions in our statistical model, which might have an impact on the estimates.77 78 By ignoring additional treatment components given in both arms of included studies, we assume additivity of different treatment components. While we will investigate the effects of this assumption (see Sensitivity Analyses), fully accounting for it by modelling all combinations of treatments as separate interventions is likely to lead to disconnected networks of evidence, which poses its own problem for evidence synthesis and decision making.79 Third, while we propose a variety of subgroup analyses to investigate the impact of effect modification, potential effect modifiers may be poorly reported in many studies. However, there is no clear evidence of important effect modification in CLBDs to date. As pointed out in the recent Lancet Low Back Pain Series,8 relative treatment efficacy for different kinds of interventions appears (to date) to be surprisingly similar. Fourth, usual care may vary between included studies (eg, authors’ stance on whether or not usual analgesic pharmacotherapy was permitted), yet given that few studies in the CLBD field employ methods of strict observation, we surmise that the majority, if not all, of existing studies are inherently at risk of this form of bias.

Finally, as with all meta-analyses, dealing with co-interventions has inherent complexities. Our decision to consider interventions that combine multiple treatments of interest may impede our capacity to differentiate the effects of any one treatment. However, we contend that this approach allows more trials to be included and, compared with a strict approach that excludes any intervention with a co-intervention, more realistically reflects clinical practice. This, in our view, leads to less potential bias (eg, that arising from the inclusion of studies that simply failed to report co-interventions) and greater confidence in our effect estimates.
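The additivity assumption noted under the second limitation can be illustrated with a small sketch. If the effect of an arm is the sum of the effects of its components, a co-intervention given in both arms cancels out of the between-arm contrast. The component names and effect values below are invented for illustration and do not come from any included study.

```python
import math

# Hypothetical component effects versus no treatment (e.g. change in pain score).
COMPONENT_EFFECTS = {"exercise": -1.0, "education": -0.4, "analgesic": -0.6}

def arm_effect(components, effects=COMPONENT_EFFECTS):
    """Effect of an arm under additivity: the sum of its component effects."""
    return sum(effects[c] for c in components)

def contrast(arm_1, arm_2, effects=COMPONENT_EFFECTS):
    """Relative effect of arm_1 versus arm_2."""
    return arm_effect(arm_1, effects) - arm_effect(arm_2, effects)

# Exercise versus education, without and with an analgesic given in both arms.
without_co = contrast(["exercise"], ["education"])
with_co = contrast(["exercise", "analgesic"], ["education", "analgesic"])

# Under additivity the shared analgesic term cancels, so the contrasts agree
# (up to floating-point rounding).
same_under_additivity = math.isclose(without_co, with_co)
```

If a co-intervention interacted with one of the treatments, an extra interaction term would remain in the contrast and the two values would differ, which is the situation the sensitivity analyses are intended to probe.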

In conclusion, the current project will enable a significant advance in synthesising knowledge on the comparative effectiveness of a wide variety of treatments for chronic low back pain disorders. Such a synthesis has not been performed to date, and it will inform patient management and clinical practice guidelines.

  • Twitter @ScottTags, @nitintarika

  • Contributors Study conception: DB, ADD, JF, PJO, CTM, AJH, SB; steering committee: DB, ADD, JF, PJO, CTM, AJH; statistical planning, implementation, advice: DB, SB, HP, TS, ST, AJH; adjudication in screening: CTM, NM, ST, DB; screening/data extraction: XZ, XC, APB, NKA; drafting manuscript: DB, PJO; approving final version of manuscript: all.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
