Objectives To review the reporting of monitoring and implementation of interventions in a selection of trials that assessed the effectiveness of manual therapy and exercise in the management of shoulder subacromial pain.
Design A review of trials assessing the effectiveness of manual therapy and exercise in the management of patients with shoulder subacromial pain.
Methods We reviewed a selection of 10 trials that were included in a Cochrane review and compared a manual therapy and exercise intervention with another intervention. Trials were assessed independently by two reviewers using two checklists: the Template for Intervention Description and Replication (TIDieR) and the National Institutes of Health Behaviour Change Consortium (NIHBCC) treatment fidelity checklist.
Results TIDieR overall scores for individual trials ranged from 11.1% to 45% and fidelity scores ranged from 7% to 50%. On average, trials scored the following within each domain of NIHBCC: study design 51%; training of providers 8%; treatment delivery 15%; treatment receipt 14% and treatment enactment 2.5%.
Conclusions Trials provided little information about the monitoring, implementation and reporting of interventions, which is a barrier to implementing or replicating these interventions. The lack of information regarding the implementation of interventions needs to be taken into account when assessing whether the effectiveness of interventions was affected by their design or by deviations from the protocol within trials.
- musculoskeletal disorders
- clinical trials
- rehabilitation medicine
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
The Template for Intervention Description and Replication and National Institutes of Health Behaviour Change Consortium checklists were used to gather information about monitoring and implementation of interventions within trials.
Some items from these checklists were considered as ‘not applicable’ for certain trials and all items received equal weighting when calculating the overall fidelity score.
The active elements of an intervention should have a larger weight on the fidelity score. Our analysis did not take that into consideration.
Our study thoroughly screened how interventions were monitored and implemented within a selection of trials.
Shoulder pain is a very common musculoskeletal complaint. It has a 1-year prevalence of 18.1%,1 and a high socioeconomic burden.2 In New Zealand, the Accident Compensation Corporation spent approximately $14 million per year on rehabilitation for shoulder injuries.3 Shoulder subacromial pain is defined as pain at the top and lateral part of the shoulder joint that may spread to the neck and elbow and is worsened by overhead activities.4 Shoulder subacromial pain can be difficult to manage: recovery is slow,5 with only 50% of new episodes fully recovering within 6 months.6
Physiotherapy interventions are considered complex interventions and there are challenges to test, implement, report and evaluate their effectiveness.7 8 One current limitation within musculoskeletal rehabilitation is that few trials conduct process evaluation studies alongside the outcome evaluation trial or report sufficient information regarding the monitoring and implementation of intervention within trials.9 That has implications for the way we interpret findings from trials and also limits the translation of those tested interventions into healthcare services.
Process evaluation of trials allows researchers to assess the implementation of an intervention, the mechanisms of impact of an intervention and the context within which an intervention is delivered.10 Data from process evaluation analyses inform what works for whom, why, how and under which circumstances an intervention works.7 11 This information is essential to allow not only replication of a trial by researchers but also its implementation by clinicians and policymakers.11 Implementation-based process evaluation assesses the monitoring and implementation of interventions within trials, providing information about what was implemented in a trial and how.10 The key elements of implementation-based process evaluation are treatment fidelity, reach and dose.10 11 Treatment fidelity refers to the extent to which an intervention is delivered as planned and the extent to which it is different from other intervention arms (eg, control, usual care).12 Reach refers to the extent to which, and how, the intended audience took part in the study. Reach depends on the context in which an intervention is delivered and can be assessed at an individual or environmental level.11 For individually focused interventions, reach can be interpreted as the proportion of individuals within the possible population who received the intervention or were exposed to elements of the intervention; at an environmental level, reach can be interpreted at the organisational level assuming that individuals spend most of their time in that particular setting.11 Finally, dose refers to the amount of intervention provided in a trial11 and can be assessed as dose delivered and dose received. Dose delivered refers to the number or amount of intended units of an intervention, while dose received refers to the extent to which participants engage or interact with the intervention (eg, materials, resources).
Implementation findings can inform whether an intervention failed to achieve its clinical outcomes due to flaws on its design, or due to clinicians and participants not adhering to the protocol as planned.13 14 Without information about monitoring and implementation of interventions, there is a risk of underestimating or overestimating the effect of an intervention as per the protocol.
Reporting guidelines such as Consolidated Standards of Reporting Trials (CONSORT),15 Template for Intervention Description and Replication (TIDieR)16 and National Institutes of Health Behaviour Change Consortium (NIHBCC)17 were developed to improve clarity and quality of trial reporting of interventions. The CONSORT checklist was designed to improve reporting of trials and gained a number of extensions, including the TIDieR checklist,16 which includes a number of items that are focused on the implementation of complex interventions within a trial, covering information about context, fidelity and dose.16 The NIHBCC checklist was developed for enhancing the reporting of fidelity of behavioural change interventions.17 There is some overlap between the TIDieR and the NIHBCC checklists, and both are applicable for assessing the reporting of monitoring and implementation of physical therapy complex interventions in trials.
A recent umbrella review18 identified six systematic reviews assessing the effectiveness of exercise and manual therapy for the management of subacromial shoulder pain.19–24 Those six systematic reviews19–23 presented slightly different conclusions. One problem is that those previous reviews did not comprehensively assess how implementation of interventions was assessed or reported in included trials. The aim of this study was to review the reporting of implementation of interventions in a selection of trials that assessed the effectiveness of manual therapy and exercise in the management of shoulder subacromial pain.
Patient and public involvement
Patients or the public were not involved in the design, conduct, reporting or dissemination plans of this study.
There have been a number of reviews summarising the effect of exercise therapy, manual therapy or both on clinical outcomes in patients with shoulder pain.18–24 Those reviews have not always presented the same conclusions or recommendations. For example, four reviews suggested that manual therapy and exercise reduced pain in the short term, while one review24 suggested there was limited evidence that manual therapy and exercise were more effective than placebo for the management of patients with shoulder subacromial pain. One umbrella review18 concluded there is evidence supporting exercise therapy and manual therapy (particularly at early stages) when managing patients with shoulder subacromial pain. None of those reviews analysed or discussed the implementation of interventions tested within the trials. The present study assessed trials included in previous systematic19–24 and umbrella18 reviews that compared exercise and manual therapy for the management of patients with shoulder subacromial pain. Given the different methods used by these previous reviews, we followed the method adopted by the Cochrane Review,24 which is arguably the gold standard, for estimating the treatment effect of manual therapy and exercise when compared with another form of intervention (ie, control, placebo or another active intervention).
Identification and selection of articles
We included trials that were reported by previous reviews18–24 that compared the effect of manual therapy and exercise with another form of intervention (ie, control, placebo or another active intervention) in patients with shoulder subacromial pain.
To obtain information about implementation of interventions within trials, we focused on two outcomes: reporting of interventions, as assessed through the TIDieR checklist,16 and reporting of treatment fidelity, as assessed through the modified NIHBCC checklist.17 25 The TIDieR checklist provides some insight into how implementation of interventions was reported within trials. The NIHBCC checklist provides insight into how treatment fidelity was reported within trials. Each item from these checklists was assessed using the following criteria: reported, partially reported, not reported.
The TIDieR checklist was designed to improve reporting of interventions in clinical trials.16 26 It consists of 12 items covering the following domains: (1) brief name, (2) why, (3) what materials, (4) what procedures, (5) who provided, (6) how, (7) where, (8) when and how much, (9) tailoring, (10) modifications, (11) how well (planned) and (12) how well (actual).16
The NIHBCC checklist covers five domains: (1) study design; (2) training of providers; (3) treatment delivery; (4) treatment receipt and (5) treatment enactment. The checklist has a total of 40 items.17 The NIHBCC checklist was designed for assessing fidelity of two-arm trials. In our study, we analysed some trials with more than two treatment arms and, for that reason, we adapted the NIHBCC checklist by duplicating item 2, which covers information about treatment dose within an arm of the trial. Hence, trials with three arms had a total of 44 items. A similar approach was used in a previous study assessing fidelity of treatment within physical therapy interventions.25
Two reviewers extracted data independently from those 10 trials. Data extraction was based on a content analysis using predefined categories according to the TIDieR and NIHBCC checklists.16 17 25 The content analysis is subjective, and to minimise bias, two reviewers analysed the reporting independently. This approach has been used in previous studies.27 28 Disparities between reviewers were resolved by consensus. All data extracted were cross-checked by a second reviewer.
We used descriptive statistics to summarise findings regarding the reporting of trials, based on the TIDieR and NIHBCC checklists. We calculated a summary score for each checklist (NIHBCC and TIDieR) for individual studies. Items were scored using the following criteria: reported (2 points), partially reported (1 point) and not reported (0 points). This scoring system was recommended and used by previous studies for the TIDieR checklist.27 28 For the TIDieR checklist, we summarised the number of studies that presented a full, partial or no report for each domain. For the NIHBCC checklist, we also calculated a percentage score for each domain in each individual study, defined as the score allocated divided by the total applicable score for that domain.
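The scoring rules above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `checklist_score`, the use of `None` to mark a 'not applicable' item and the example ratings are all assumptions made for illustration only.

```python
# Points per rating, as described in the text.
POINTS = {"reported": 2, "partially reported": 1, "not reported": 0}


def checklist_score(ratings):
    """Summarise one trial's ratings for a checklist or a single domain.

    `ratings` holds one entry per item; None marks an item judged
    'not applicable' (an assumed convention), which is excluded from
    the denominator.
    Returns (points scored, maximum applicable points, percentage or None).
    """
    applicable = [r for r in ratings if r is not None]
    score = sum(POINTS[r] for r in applicable)
    max_score = 2 * len(applicable)
    percentage = 100.0 * score / max_score if max_score else None
    return score, max_score, percentage


# Hypothetical TIDieR ratings for one trial (12 items, maximum 24 points).
tidier = ["reported"] * 3 + ["partially reported"] * 6 + ["not reported"] * 3
print(checklist_score(tidier))  # (12, 24, 50.0)

# Hypothetical NIHBCC domain containing one 'not applicable' item.
domain = ["reported", "not reported", None, "partially reported"]
print(checklist_score(domain))  # (3, 6, 50.0)
```

Excluding 'not applicable' items from the denominator means a trial is not penalised for items that could not apply to its design, which is how we read 'total applicable score' here.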
The characteristics of included studies are displayed in table 1.
The TIDieR overall score for each study is presented in table 2, with scores ranging from 8 to 17 out of 24.
The percentage of studies reporting information regarding items from TIDieR checklist is presented in table 3. All trials provided information regarding item 1 (name or phrase describing an intervention). Most items were partially reported by studies. Fifty per cent of trials provided no information about item 9 (ie, tailoring of intervention) (table 3).
The overall fidelity score for each study is presented in table 4. Overall fidelity scores ranged from 9% to 56%. Considering the five domains (ie, study design, training of providers, treatment delivery, treatment receipt and treatment enactment), most studies reported some information regarding items from the ‘study design’ domain. Very limited information was provided about the ‘training of providers’, ‘treatment delivery’, ‘treatment receipt’ and ‘treatment enactment’ domains.
This review assessed the reporting of monitoring and implementation of interventions in a selection of trials that assessed the effectiveness of manual therapy and exercise in the management of shoulder subacromial pain and were included in recent systematic reviews19–24 and an umbrella review.18 Our findings revealed that most trials did not provide sufficient information about how interventions were implemented or about the implementation fidelity achieved within those trials. Information about monitoring and implementation of interventions is important for assessing whether or not an intervention improves clinical outcomes. It can also help to identify contextual factors that may influence the outcomes of an intervention and whether an intervention needs to be adapted for a different setting.10 11 17 Without detailed information about implementation of interventions, it is not possible for clinicians, researchers and policymakers to assess whether interventions were ineffective due to poor design or poor implementation within the trial.7 11
The overall TIDieR scores ranged from 8 to 17 out of 24. Items 1 (the name or a phrase that describes the intervention) and 2 (rationale, theory or goal of the elements essential to the intervention) were the items most often reported by those 10 trials. Item 8 (ie, number of times the intervention was delivered and over what period including the number of sessions, their schedule, and their duration, intensity or dose) was partially reported by nine studies, and only two trials presented partial information regarding whether interventions were modified. These results have significant implications for how we interpret recommendations from those previous reviews, including the Cochrane Review. Given the limited reporting in those trials, clinicians and researchers should interpret recommendations from previous reviews with caution, as we do not know which interventions were delivered within those 10 trials or how they were delivered.
Our findings demonstrated that the overall fidelity score ranged from 9% to 56%, with study design being the domain with highest score. Previous reviews assessing fidelity of trials in behavioural change also found study design domain to receive the highest fidelity score.17 25 29 In our review, training of providers and treatment enactment were the two domains with the lowest fidelity scores. Previous reviews also found training of providers to receive the lowest fidelity scores.17 25 29 The lack of reporting of monitoring and implementation of interventions needs to be taken into account when assessing the effectiveness of these interventions.
A number of multimodal interventions were tested within this selection of trials, and they included many potential active elements, for example: muscle strengthening (shoulder, thoracic and cervical muscles), active and passive range of motion, stretching, manual therapy interventions (eg, soft tissue mobilisation, joint mobilisation), scapular retraining exercises and corticosteroid injections. Among a large number of elements within an intervention, it is reasonable to expect some elements to have a larger effect on clinical outcomes. It is unclear which elements are considered key within those multimodal interventions and whether those key elements were delivered or modified during the trial. In addition, other elements may not have been explicitly reported or captured during the trial but are possible active ingredients of an intervention, for example, advice, reassurance, education about the condition and interpersonal manners.30 31 These are highly valued by patients, influence their perception of the quality of care received31 32 and potentially affect clinical outcomes (eg, pain).33
It is difficult to define, develop, document and reproduce complex interventions.34 Any intervention is, to some degree, complex and the complexities may arise due to different factors: (1) the intervention itself can be complex (ie, numerous elements that interact with each other impacting on the effect of that intervention); (2) the implementation may be complex (ie, the way the intervention is implemented may impact on the effect of that intervention); (3) the context may be complex (ie, the characteristics of the context in which an intervention is delivered may impact on the effect of that intervention); and (4) the participants may be complex (ie, individual characteristics of participants may impact on the effect of that intervention).7 8 Given all these challenges, it is accepted that it is difficult to maintain high treatment fidelity when delivering complex interventions.17
This study has limitations. The TIDieR and NIHBCC checklists were used to gather information about monitoring and implementation of interventions within a selection of trials, but there are limitations to their use in this review. Some items from the NIHBCC checklist were considered as ‘not applicable’ for certain trials and all items received equal weighting when calculating the overall fidelity score. Depending on the conceptual framework used to develop the interventions tested, some elements of the intervention should be more relevant than others in promoting changes in clinical outcomes. To gain a deeper understanding of whether and how interventions were implemented as planned, the active elements of an intervention need to be explicitly stated and should have a larger weight on the fidelity score. Our analysis did not take that into consideration. We analysed studies published between 1997 and 2014, and the TIDieR and NIHBCC checklists were published in 2014 and 2011, respectively.16 17 Hence, it is expected that some trials may not provide sufficient information for some items or domains within those checklists. Despite that, a strength of our findings is that they show how limited the available information is regarding the interventions tested. Without detailed information about how interventions were monitored and implemented within trials, it is difficult to determine whether interventions did not achieve the expected outcome due to: (1) poor adherence by participants; (2) inadequate delivery by the clinicians; or (3) an ineffective intervention by design.
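To make the weighting limitation concrete, the following Python sketch shows one possible way a weighted fidelity score could be computed. This review did not use such a scheme: the function `weighted_fidelity`, the weights and the example ratings are hypothetical and serve only to show how giving more weight to a presumed active element changes the percentage.

```python
# Points per rating, matching the scoring used in this review.
POINTS = {"reported": 2, "partially reported": 1, "not reported": 0}


def weighted_fidelity(items):
    """Weighted percentage score for a list of (rating, weight) pairs.

    A rating of None marks a 'not applicable' item, which is skipped.
    Weights are hypothetical; equal weights reproduce the unweighted score.
    Returns None when no item is applicable.
    """
    applicable = [(r, w) for r, w in items if r is not None]
    total_weight = sum(w for _, w in applicable)
    if total_weight == 0:
        return None
    earned = sum(POINTS[r] * w for r, w in applicable)
    return 100.0 * earned / (2 * total_weight)


# Hypothetical trial: the unreported item is a presumed active element.
equal = [("reported", 1), ("not reported", 1), ("partially reported", 1)]
weighted = [("reported", 1), ("not reported", 3), ("partially reported", 1)]

print(weighted_fidelity(equal))     # 50.0
print(weighted_fidelity(weighted))  # 30.0
```

In this illustrative case, tripling the weight of the unreported element lowers the score from 50% to 30%, which shows why equal weighting may overstate fidelity when key elements are poorly reported.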
Findings from this study revealed that most trials did not report sufficient information about how interventions were implemented. This makes it difficult for researchers and clinicians to assess whether the effect of interventions on clinical outcomes was biased due to poor adherence by participants or poor treatment fidelity, or whether the interventions are conceptually ineffective. Those trials were included in previous systematic or umbrella reviews. When analysing the recommendations from those reviews, one should take into account the limited information regarding how those interventions were delivered within the trials. Findings from our study highlight the need to interpret findings from previous systematic reviews with caution.
The authors thank the librarians at the Canterbury Medical Library, University of Otago for their valuable advice.
Contributors DCR conceived the research question. DCR was responsible for the design of the review and is the guarantor. KS, LT, KL, MW, SN and DCR collected data. KS, LT, KL, MW, SN, SEL and DCR contributed to data analysis and data interpretation, and revised the manuscript for important content and approved the final version.
Funding The research was conducted during tenure of the Sir Charles Hercus Health Research Fellowship of the Health Research Council of New Zealand (18/111).
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement The dataset used and analysed for this review will be available from the corresponding author upon a reasonable request.