Introduction Scientific progress and translation of evidence into practice are impeded by poorly described interventions. The Template for Intervention Description and Replication (TIDieR) was developed to specify the minimal intervention elements that should be reported.
Objectives (1) To assess the extent to which outpatient pharmacy interventions were adequately reported. (2) To examine the dimension(s) across which reporting quality varies. (3) To examine trial characteristics that predict better reporting.
Methods The sample comprised 86 randomised controlled trials identified in a Cochrane review of the effectiveness of pharmacist interventions on patient health outcomes. Duplicate, independent application of a modified 15-item TIDieR checklist was undertaken to assess the intervention reporting. The reporting/non-reporting of TIDieR items was analysed with principal component analysis to evaluate the dimensionality of reporting quality and regression analyses to assess predictors of reporting quality.
Results In total, 422 (40%) TIDieR items were fully reported, 395 (38%) were partially reported and 231 (22%) were not reported. A further 242 items were deemed not applicable to the specific trials. Reporting quality loaded on one component which accounted for 26% of the variance in TIDieR scores. More recent trials reported a slightly greater number of TIDieR items (0.07 (95% CI 0.02 to 0.13) additional TIDieR items per year of publication). Trials reported 0.09 (95% CI 0.04 to 0.14) additional TIDieR items per unit increase in impact factor (IF) of the journal in which the main report was published.
Conclusions Most trials lacked adequate intervention reporting. This diminished the applied and scientific value of their research. The standard of intervention reporting is, however, gradually increasing and appears somewhat better in journals with higher IFs. The use of the TIDieR checklist to improve reporting could enhance the utility and replicability of trials, and reduce research waste.
- clinical trials as topic/standards
- reproducibility of results
- research report/standards
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
We examined reporting quality in 86 trials of pharmacy interventions using the Template for Intervention Description and Replication (TIDieR).
Multiple regression on TIDieR scores illuminated the predictors of good intervention reporting.
A principal component analysis was used to explore the dimensionality of the TIDieR.
Suboptimal inter-rater reliability of TIDieR assessments suggests some subjectivity in our assessments.
We made various assumptions (eg, journal impact factors are stable over time) to create predictor variables for the regression analyses.
Effective and efficient healthcare depends on trials in which the benefits and harms of interventions are experimentally assessed.1 Underspecification of interventions hampers both implementation of evidence-based practice in healthcare settings and scientific progress.2–5 While additional detail and clarity could be derived, for example, by contacting study authors, this places additional and unnecessary burden on reviewers, researchers and other users of this information.2 6
If interventions are underspecified, methodological decisions and results can be difficult to understand, evaluate and synthesise. Likewise, replication of interventions will be impossible if basic intervention characteristics such as the frequency of interaction between healthcare worker and patient are not presented. Without clear descriptions of interventions, the similarities and differences between interventions will be obscured and this will hinder research synthesis in systematic reviews and meta-analyses. Thus, inadequate description of interventions is an important potential source of waste within biomedical research.7
What is adequate reporting of an intervention? Hoffmann et al3 sought to answer this question using recommended consensus procedures8 including literature searches, a Delphi procedure and a face-to-face consensus meeting. The outcome was a checklist of 12 items to be included in the description of interventions. This Template for Intervention Description and Replication (TIDieR) is included in the Equator portfolio of checklists and guides, which are intended to enhance the reporting of trials and research more broadly.9
TIDieR also provides a basis for evaluating reporting in the published literature. To our knowledge, three studies have evaluated interventions reported in randomised controlled trials (RCTs) using an adapted version of TIDieR.4–6 Abell et al6 found that just 8% (6/74) of exercise-based cardiac rehabilitation intervention reports described the core TIDieR elements deemed to be essential for replication, though this increased to 43% once additional sources were examined and the trial authors were contacted. Jones et al5 reported that 1% (1/100) of perioperative care interventions included all TIDieR items and that, on average, 43% of TIDieR items were omitted. In physiotherapy interventions evaluated by Yamato et al,4 23% (46/200) omitted more than half of TIDieR items.
Other studies of pharmacy interventions have used the Descriptive Elements of Pharmacist Intervention Characterisation Tool (DEPICT).10 DEPICT is a checklist developed to identify key components of pharmacy interventions and is intended as both a writing guide for pharmacists seeking to describe their interventions in a clear and replicable manner, and as a tool for retrospective analysis of interventions in the published literature. Thus, the rationale and utility of TIDieR and DEPICT overlap substantially. Studies using DEPICT to evaluate reporting quality found 59% of chronic kidney disease interventions11 and ‘most’ asthma trials12 were not implementable based on the available intervention descriptions.
The focus of our study was on interventions implemented by pharmacists in outpatient settings. In recent years, the pharmacist’s role has changed substantially in many countries, with a move away from the traditional function of medicine supply to more behavioural/clinical roles. Pharmacists contribute to the safe and effective use of medicines through the delivery of services such as medication review,13 14 adherence support and advice to prescribers as well as enhanced roles in public health.15 To maximise the efficient use of resources, service development should be informed by research evidence and this is reflected in the growing number of RCTs of pharmacist interventions.16 17 However, the value of these RCTs for policy and practice is dependent on the quality of reporting of the trial and complete descriptions of the interventions tested.
The first aim of our study was to evaluate the reporting of intervention descriptions in RCTs of pharmacy interventions using a modified TIDieR checklist. Our use of TIDieR rather than DEPICT enabled comparison with reporting quality in other health domains. A second aim was to examine how TIDieR items covaried, examining the dimensionality of TIDieR using a principal component analysis (PCA). This enabled us to analyse underlying patterns of TIDieR inclusion and exclusion by examining what items tend to co-occur or cluster in intervention descriptions. Such a pattern of covariation between TIDieR items would suggest that groups of individual items reflect underlying dimension(s) of reporting quality. In other words, the PCA enabled us to investigate whether there was a subset of items which trials tend to report generally well or generally poorly. Identifying these dimensions could be useful since different dimensions of reporting quality may have unique and potentially modifiable causes.
The final aim was to explore whether other article and journal characteristics predicted the completeness of intervention description. We aimed to examine whether reporting improved over time or was associated with other measures of quality, namely risk of bias (RoB). We explored the relationship between reporting quality and journal prestige, measured by the impact factor (IF). Although higher IF journals are often expected to publish ‘better’ science, it is unclear if the reporting quality is superior. We examined if trial size (ie, number of participants) was associated with completeness of intervention reporting. We also examined if reporting space predicts clearer reporting by examining (1) if trials described in multiple manuscripts reported interventions more clearly and (2) if reports published in journals with higher word limits reported interventions more clearly. The space-related predictor variables were added following reviewer recommendations. The other predictors were agreed by the author team before the results were known.
A protocol for the study has not been published elsewhere.
Trial report selection
Eighty-six published trial reports (online additional file 1A) were identified in an interim update of a Cochrane review of non-dispensing outpatient pharmacy services17 and provided the data source for our study. Non-dispensing interventions aim to improve patients’ medication use (through, eg, education) or practitioner prescribing (through, eg, medication reviews). These trials were published between 1979 and 2015, inclusive, and the median year of publication was 2010. Sixty-six of the trials precede 2014, the publication year of the TIDieR checklist.3 The Cochrane review included RCTs which evaluated interventions to improve patient health in non-hospitalised patients through the use or cessation of medication and which were led or primarily delivered by a pharmacist. The search terms used to identify these trials are included in online additional file 1B (see also de Barra et al17) and a Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist is available in online additional file 2.
The 12-item TIDieR checklist was adapted by subdividing several items prior to its application in this study. Note that these modifications are intended to facilitate TIDieR’s use for the evaluation of published intervention descriptions and should not be construed as an attempt to modify TIDieR more generally. One item, ‘why: rationale of the intervention’, was split into two separate items. First, a behavioural rationale item was used to assess whether the authors presented any rationale or theory to justify behavioural components. For example, the introduction of a daily pill box might be justified by referring to studies which show that forgetfulness lowers adherence. Second, a clinical rationale item was used to assess if the pharmacological component of the intervention had been justified. The checklist item ‘who: provider training/experience’ was specified using three items: intervention-specific training described; qualification of provider; and experience of provider. Definitions were created to specify when all items should be coded as ‘included’, ‘partially included’ or ‘not included’. Thus, the 12-item checklist was developed into a 15-item evaluation tool (box 1; scoring criteria provided in online additional file 1C). This tool was applied by two independent coders to assess the reporting of the interventions for the 86 trials.
Box 1 TIDieR checklist adapted for report evaluation
Adapted TIDieR items
1. Brief name. Provides the name or a phrase that describes the intervention.
2a. Why (clinical). Describes clinical rationale, theory or goal of the elements essential to the intervention.
2b. Why (behavioural). Describe behavioural rationale, theory or goal of the elements essential to the intervention.
3. What (materials). Describes any physical or informational materials used in the intervention.
4. What (procedures). Describe each of the procedures, activities and/or processes used in the intervention, including any enabling or support activities.
5a. Who (expertise). For each category of intervention provider, describes their expertise.
5b. Who (qualifications). For each category of intervention provider, describe their background.
5c. Who (training). For each category of intervention provider, describe specific training given.
6. How (delivery mode). Describe the modes of delivery (such as face-to-face or by some other mechanism, such as internet or telephone) of the intervention and whether it was provided individually or in a group.
7. Where (locations). Describe the type(s) of location(s) where the intervention occurred, including any necessary infrastructure or relevant features.
8. When and how much. Describe the number of times the intervention was delivered and over what period of time including the number of sessions, their schedule, and their duration, intensity or dose.
9. Tailoring. If the intervention was planned to be personalised, titrated or adapted, then describe what, why, when and how.
10. Modifications. If the intervention was modified during the course of the trial, describe the changes (what, why, when and how).
11. How well (planned). If intervention adherence or fidelity was assessed, describe how and by whom, and if any strategies were used to maintain or improve fidelity, describe them.
12. How well (actual). If intervention adherence or fidelity was assessed, describe the extent to which the intervention was delivered as planned.
See online additional file 1C for full scoring criteria.
TIDieR, Template for Intervention Description and Replication.
If reports made no mention of modification or fidelity/adherence assessment (items 10, 11 and 12 in box 1), we assumed that these reports described trials without modification or adherence assessment and coded the relevant item as ‘non-applicable’. These items were also excluded from composite scores (see Predicting TIDieR reporting rate below). It should be noted that the TIDieR checklist does not require authors to report on such modifications/adherence assessment if none occur.
The 15-item evaluation tool was iteratively developed by three authors and piloted on five papers. Once the tool was finalised, these papers were re-evaluated. Disagreements were resolved through discussion between the two coders and, where necessary, consultation with a third coder. The inter-rater reliability of TIDieR coding was examined using Cohen’s kappa with squared weights.18 TIDieR item presence and absence rates are presented descriptively.
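As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen’s kappa with squared (quadratic) weights for two coders. The function name and the integer coding (0=not included, 1=partially included, 2=included) are assumptions made for this example; it is not the code used in the study.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with squared (quadratic) disagreement weights.

    Ratings are integer codes 0..n_categories-1 on an ordinal scale,
    one entry per rated item, in the same order for both coders.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    k = n_categories

    # Observed joint proportions of (coder A code, coder B code).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1 / n

    # Marginal proportions for each coder.
    marg_a = [sum(observed[i]) for i in range(k)]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Squared distance between categories, scaled to the range 0..1.
        return (i - j) ** 2 / (k - 1) ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    disagree_obs = sum(weight(i, j) * observed[i][j] for i, j in cells)
    disagree_exp = sum(weight(i, j) * marg_a[i] * marg_b[j] for i, j in cells)
    if disagree_exp == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - disagree_obs / disagree_exp
```

With squared weights, a disagreement between ‘included’ and ‘not included’ is penalised four times as heavily as a disagreement between adjacent categories, which suits an ordinal coding scheme.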
Dimensionality of TIDieR
To investigate the dimensions of variability in TIDieR, a PCA19 was conducted on a Spearman’s correlation matrix of associations between TIDieR items. PCAs assess if different linear combinations of TIDieR variables summarise reporting quality using fewer variables. The number of components was determined using a parallel analysis.20 No rotation was performed.
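The procedure described above can be sketched in standard-library Python as follows. This is an illustrative approximation rather than the analysis that was run: the comparison data are drawn from a normal distribution, the 95th centile is used as the retention threshold, and all function names are invented for this example. Columns with no variation would cause a division by zero and are assumed not to occur.

```python
import math
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for idx in order[start:end + 1]:
            ranks[idx] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_matrix(columns):
    """Spearman correlations: Pearson correlations of the ranked columns."""
    ranked = [average_ranks(col) for col in columns]
    p = len(ranked)
    return [[1.0 if i == j else pearson(ranked[i], ranked[j])
             for j in range(p)] for i in range(p)]


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes a[p][q], kept within +/- pi/4.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def parallel_analysis(columns, n_sims=200, centile=0.95, seed=0):
    """Count leading components whose eigenvalue exceeds random data."""
    rng = random.Random(seed)
    n_obs, p = len(columns[0]), len(columns)
    observed = symmetric_eigenvalues(spearman_matrix(columns))
    simulated = []
    for _ in range(n_sims):
        noise = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(p)]
        simulated.append(symmetric_eigenvalues(spearman_matrix(noise)))
    retained = 0
    for k in range(p):
        kth = sorted(sim[k] for sim in simulated)
        threshold = kth[min(n_sims - 1, int(centile * n_sims))]
        if observed[k] <= threshold:
            break
        retained += 1
    return retained, observed
```

Here `columns` holds one list per TIDieR item, each with one coded value per trial. A component is retained only while its eigenvalue exceeds what uncorrelated data of the same size would produce, so the rule is more conservative than retaining every eigenvalue above 1.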
Predicting TIDieR reporting rate
To generate composite scores for the multivariate analysis, we summed the number of items reported and, separately, the number of items not reported. This avoided the assumption of equidistance between included, partially included and not included. Another composite score was generated from the PCA results. Items that loaded on only one component at 0.6 or above (a threshold of ‘practical significance’21) were identified. We then created a weighted summed score by adding the weights of TIDieR items which met this criterion. Items were weighted as follows: not included=0, partly included=1 and included=2. While this does assume equidistance, the psychometric/variable-reduction approach outlined here enables the identification and measurement of different dimensions of reporting quality.
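The three composite scores can be illustrated with a short sketch. The item labels follow box 1, but the function name, the `None` marker for non-applicable items and the particular set of component items passed in the usage example are assumptions made for illustration only.

```python
NOT_INCLUDED, PARTLY_INCLUDED, INCLUDED = 0, 1, 2
NOT_APPLICABLE = None  # hypothetical marker for items coded non-applicable


def composite_scores(item_codes, component_items, excluded=("10", "11", "12")):
    """Return (items reported, items not reported, weighted component score).

    item_codes maps a box 1 item label to its code for one trial.
    component_items lists the labels that met the 0.6 loading threshold.
    Items 10-12 are left out of the two counts, as described in the text.
    """
    scored = {item: code for item, code in item_codes.items()
              if item not in excluded and code is not NOT_APPLICABLE}
    n_reported = sum(1 for code in scored.values() if code == INCLUDED)
    n_not_reported = sum(1 for code in scored.values() if code == NOT_INCLUDED)
    # Weighted sum: not included=0, partly included=1, included=2.
    component_score = sum(item_codes[item] for item in component_items)
    return n_reported, n_not_reported, component_score


# Usage with invented codes for a single trial; the component items
# named here are a guess for illustration, not the published loadings.
trial = {"1": 2, "2a": 2, "2b": 0, "3": 1, "4": 1, "5a": 0, "5b": 1,
         "5c": 0, "6": 2, "7": 2, "8": 1, "9": 1,
         "10": None, "11": None, "12": None}
scores = composite_scores(trial, component_items=("2b", "3", "4", "6"))
```

Counting fully reported and not reported items separately means that partially reported items contribute to neither count, which is how the two counts avoid assuming equal spacing between the three codes.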
Journal IF was derived from Journal Citation Reports via the Web of Science citation indexing service.22 As in Dechartres et al,23 estimates were taken from the year 2016 rather than the year of publication. Ten IFs were unavailable because the journal had been discontinued. We therefore assumed that these journals were likely to have low IF and hence we imputed these 10 missing values with the minimum IF of the extant journals included. The analyses were repeated excluding these 10 to test the robustness of findings.
Similarly, contemporary journal word limits rather than word limits at publication year were used as these could not feasibly be accessed. Where journals had no word limit, we used the maximum word limit found in the journals with a limit (10 000 words). Where journal word limit was impossible to determine, we used the median limit (3500 words). Where journals had multiple word limits (eg, short article vs main article), we used the limit for the main empirical articles.
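The imputation rules for IF and word limit can be written down compactly. The sketch below is one reading of those rules: the markers for ‘no limit’ and ‘unknown’ are hypothetical, and the median is assumed to be taken over journals with a stated limit.

```python
from statistics import median

NO_LIMIT = "none"  # hypothetical marker: journal states it has no word limit
UNKNOWN = None     # hypothetical marker: value could not be determined


def impute_impact_factors(impact_factors):
    """Replace missing IFs with the minimum IF among the known journals."""
    known = [value for value in impact_factors if value is not UNKNOWN]
    floor = min(known)
    return [floor if value is UNKNOWN else value for value in impact_factors]


def impute_word_limits(word_limits):
    """No limit -> largest stated limit; unknown -> median stated limit."""
    known = [value for value in word_limits if isinstance(value, int)]
    largest, middle = max(known), median(known)
    imputed = []
    for value in word_limits:
        if value == NO_LIMIT:
            imputed.append(largest)
        elif value is UNKNOWN:
            imputed.append(middle)
        else:
            imputed.append(value)
    return imputed
```

Because every missing IF receives the same low value, the imputed journals cluster at the bottom of the IF range, which is why repeating the analysis without them is a useful robustness check.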
RoB was estimated using the Cochrane RoB tool which evaluates bias on seven criteria.24 Two authors independently applied the measure and resolved discrepancies through discussion with a third author. Unclear RoB and high RoB scores are the number of criteria which were graded as unclear RoB or high RoB, respectively (possible range for both variables: 0–7). For further information on RoB in these trials, see the associated Cochrane Review.17
A multiple regression was performed on each of the dependent variables (number of TIDieR items reported, number of TIDieR items not reported, PCA score(s)). Given that items 10–12 in box 1 were typically non-applicable, these were excluded from calculation of the ‘TIDieR items reported’ and ‘TIDieR items not reported’ dependent variables. The following independent variables were included: (1) year of publication, to test for trend over time; (2) total sample size at baseline, to examine whether more complete reporting occurred with larger sample size; (3) IF, to examine whether trials published in more prestigious journals were better reported; (4) high RoB, to assess whether trials with other deficits were more likely to omit TIDieR items; (5) number of manuscripts, to assess if trials described in multiple manuscripts were better reported; and (6) word limit in publication journal, to assess if journal limits were associated with poorer reporting.
Correlation between different types of poor reporting
Additional correlation analyses focused on the relationship between TIDieR items included/excluded and unclear RoB. These were analysed in a correlation analysis rather than the regression equations because unclear RoB can also be considered to be an additional measure of reporting quality rather than a possible cause of poor reporting.
Patient and public involvement
We assumed that patients and public favour complete reporting of interventions since such reporting patterns may both increase the quality of healthcare and decrease research waste and costs. Thus, there was no patient or public involvement in this study.
Inter-rater reliability of TIDieR scoring
Cohen’s kappa with squared weights,18 which takes into account the ordinal nature of the data, was 0.5, which represents a fair/moderate level of agreement.25 When items scored as non-applicable were included as an additional category, kappa increased to 0.73. The differences between coders typically lay in deciding if items were reported versus partially reported, or between not reported and partially reported.
There were 1290 items that could have been reported within this review (86 trials and 15 items/trial). Of these, 422 (40%) were reported, 395 (38%) were partially reported and 231 (22%) items were not reported. The remaining 242 items were scored as non-applicable. No trials fully reported all 15 items.
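The counts above can be checked with a few lines of arithmetic. The check also shows that the reported percentages are consistent with a denominator of applicable items only (that is, excluding the 242 non-applicable items); that reading is an inference from the numbers rather than something the text states.

```python
trials, items_per_trial = 86, 15
fully, partially, not_reported, not_applicable = 422, 395, 231, 242

total_slots = trials * items_per_trial         # every item in every trial
applicable = fully + partially + not_reported  # items that could be scored


def percent(count, denominator):
    return round(100 * count / denominator)


percentages = [percent(c, applicable) for c in (fully, partially, not_reported)]
```

Using all 1290 item slots as the denominator would instead give about 33% fully reported, so the choice of denominator matters when comparing these figures with other TIDieR evaluations.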
The mean number of TIDieR items included in each trial was 4.83 (SD: 1.92; possible range 0 to 12). This mean score excludes items 10–12, which were typically rated as non-applicable. The mean number of TIDieR items not reported was 2.65 (SD: 1.55). As figure 1 illustrates, substantial differences in reporting frequency occurred between items as well as between trials.
Dimensionality of TIDieR
A Spearman’s correlation plot (online additional file 1D) indicated some covariation in TIDieR items. Bartlett’s test of sphericity indicated that the data were suitable for PCA (χ2 (66)=165.77, p<0.001) and the Kaiser-Meyer-Olkin test26 suggested reasonable (0.75) sampling adequacy. A parallel analysis, executed with the paran package in R, indicated that one factor should be extracted. A scree plot, which includes the random data-derived eigenvalues produced by the parallel analysis, can be seen in online additional file 1D. The first four eigenvalues were 3.1, 1.32, 1.25 and 1.14. Table 1 shows the factor loadings. This component accounted for 26% of the variance in TIDieR scores.
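The figures in this paragraph are mutually consistent, as the short check below illustrates. It assumes that the PCA was run on a correlation matrix of p items, so that total variance equals p, and it recovers p from the degrees of freedom of Bartlett’s test; the variable names are illustrative.

```python
eigenvalues = [3.1, 1.32, 1.25, 1.14]  # first four eigenvalues, as reported
bartlett_df = 66

# Bartlett's test of sphericity has p * (p - 1) / 2 degrees of freedom.
p = next(n for n in range(2, 100) if n * (n - 1) // 2 == bartlett_df)

# For a correlation matrix the eigenvalues sum to p, so the share of
# variance captured by the first component is its eigenvalue over p.
share_first = eigenvalues[0] / p
above_one = sum(1 for value in eigenvalues if value > 1)
```

The degrees of freedom imply 12 items, in line with the exclusion of items 10 to 12, and 3.1/12 is roughly 26%. All four reported eigenvalues exceed 1, so a simple eigenvalue-greater-than-one rule would have retained at least four components where the parallel analysis retained one.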
Scores (not included=0, partially=1, included=2) on the four items with loadings of 0.6 or higher were summed to produce a PCA-derived dependent variable with a Cronbach’s alpha of 0.72.
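For readers unfamiliar with the internal consistency statistic, a minimal sketch of Cronbach’s alpha follows. It uses sample variances and assumes complete data; the function name and the input layout are choices made for this example.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of trials, each a list of item scores.

    Every row must contain the same items in the same order, and the
    summed scores must vary across trials.
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha approaches 1 when the items rise and fall together across trials, which is the pattern expected if the four items reflect a single dimension of reporting quality.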
Predicting TIDieR reporting rate
In all regression analyses, residual plots (linearity and homoscedasticity), QQ plots (normal errors) and the Durbin-Watson test (correlated errors) indicated that the assumptions of linear regression analyses were met; see online additional file 1E. Note that regression analyses which excluded the 10 reports with imputed IFs show similar results.
The multiple regression analyses indicated an increase of 0.07 (95% CI 0.02 to 0.13) TIDieR items reported per year. For each additional unit of IF, 0.09 (95% CI 0.04 to 0.14) extra TIDieR items were included. The pattern of association with TIDieR items not included and with PCA component scores was broadly similar; see table 2. The regression predicting the PCA component indicated that larger trials may be more poorly reported (beta=−0.08, 95% CI −0.14 to 0.02), though this pattern was not observed in the other two regressions. High RoB was not associated with TIDieR reporting.
Correlation between different kinds of poor reporting
Trials with fewer uncertain RoB evaluations reported more TIDieR items (r=−0.24, p= 0.03) and a higher TIDieR component score (r=−0.23, p= 0.03). The relationship between uncertain RoB and TIDieR items not included was in the predicted direction but not statistically significant (r= 0.15, p= 0.15).
These results indicate (1) that more than half of TIDieR items were not fully included in pharmacy trial reports, (2) that variability in TIDieR reporting can be captured by a single component and (3) that slight improvements are being achieved over time, and that trials with more complete reporting are more likely to be published in higher impact journals. Number of manuscripts or journal word limits did not predict reporting quality.
Frequency of reporting of TIDieR items
It is of concern that few intervention procedures were described with ‘sufficient detail for replication’ (a criterion for item 4, what: procedures). The interventions were complex and involved varied interactions with patients, yet the description of these interventions was typically brief, often comprising only a single paragraph or a few sentences.27–30
Procedural ambiguity was also apparent in low scores for item 9, tailoring. Interventions were typically tailored to the clinical, knowledge or behavioural situation of the patient, yet in most cases the nature of this tailoring was not made explicit as an ‘if-then’ rule. Rather, this was left to the judgement and clinical skills of the pharmacist. While this is likely to reflect daily practice, ambiguity could be attenuated by being explicit about the background, experience and training—including training evaluation—of those employing their clinical judgement. Such information would enable the reader to understand what clinical competencies are necessary to implement these behaviour change interventions in a comparable fashion.31 However, with few exceptions,32–34 trial reports were not fully explicit about the experience, qualifications and training of personnel.
The majority of trials included some mention of materials but a detailed description of the package of, for example, questionnaires or educational booklets supplied to each pharmacy was typically lacking. While the frequency of interventions was generally described, the duration was often not made explicit.35–38 In their evaluation of cardiac rehabilitation interventions, Abell et al6 also found that session frequency was reported more often than session duration. The fidelity at which the intervention was delivered was rarely mentioned, perhaps because this was rarely assessed. Although not the focus of this study, attention to the fidelity of intervention delivery may be important in explaining variability in effect size. In contrast to the clinical rationale, the behavioural rationale was reported less frequently. This requires a reason or theory for the selection of the proposed intervention and frameworks such as COM-B (capability, opportunity, motivation and behaviour)39 or the theoretical domain framework40 might be useful for future justification and reporting. In addition, the links between the proposed theoretical mechanism and the selected intervention could be specified.
Component structure of TIDieR
In the present analysis, variation in quality of reporting was captured by a single dimension and this dimension accounted for approximately one-quarter of the variance in TIDieR scores. Items loading heavily on this component were what, how and why (behavioural) items. That is to say, trials tended to report these items generally well or generally poorly. For reasons unclear to us, variance in reporting who executed the intervention was less well captured by this component. Future research on the dimensional structure of TIDieR may benefit from larger sample sizes. With this in mind, the dataset associated with this study has been placed online.41
What predicts reporting quality?
Trial reports published in higher impact journals tended to have better reporting quality. This trend is consistent with a recent analysis of unclear RoB evaluations in 20 920 RCTs.23 Similarly, other studies have found a more general association between IF and methodological quality.42 43
Our results suggest an improvement in reporting quality over time, a result consistent with other studies.23 44 The rate of improvement is, however, very slow (0.8 additional TIDieR items included per decade); time will tell if publication of the TIDieR checklist, as well as evaluations similar to our study, will enhance the pace of improvement. The negative correlation between included TIDieR items and the number of unclear RoB evaluations suggests that trial reports with thorough intervention descriptions also tend to have well-described methodologies and points to the convergent validity of our reporting quality measures.
Improving reporting quality
These results add to the growing literature on the limits of intervention descriptions in the pharmacy literature10–12 45 and in other biomedical fields.3 We now suggest some practical steps that might be taken to improve the quality of reporting.
First, we suggest that trial authors use TIDieR and DEPICT checklists when designing, planning and reporting their intervention. Often, this project’s data extractors’ initial impression that a report was thoroughly written was disproved once the checklist had been applied. Checklists simplify the writing process and prevent errors, much as checklists have done in other medical and non-medical domains.46
While we found no evidence that word limit/papers-per-trial predict reporting quality, it would be hasty to conclude these limits are irrelevant. If word limits are prohibitive, appendices or additional online materials should be considered, although the longevity of such resources has been questioned. Hoffmann et al47 found that several trials had placed materials online but the resources had not been maintained and had become inaccessible. Services such as Figshare48 and the Open Science Framework49 enable materials (including video/audio files) to be shared and cited.
There is evidence to suggest that the use of checklists during peer review enhances reporting quality.50 The quality of reporting is likely to increase if reviewers assess reports using the checklist and/or if authors are required to state that they have complied with checklist recommendations. There may also be a role for journal editors in making the TIDieR or DEPICT checklist a criterion for evaluation of manuscripts. Indeed, evidence from RCTs suggests that introducing guidelines to evaluate papers increases reporting quality.50 Editors and publishers may also facilitate improvements by either excluding methods sections from the article word counts or by facilitating the dissemination of intervention descriptions in accessible appendices.
Agreement between the coders was less than ideal; the coders sometimes found it difficult to identify rules that unambiguously distinguished included from partly included or not included. In their evaluation of physiotherapy interventions, Yamato et al4 similarly found agreement for many TIDieR items was suboptimal.
Unlike earlier studies,4 6 but as in Jones et al,5 we opted not to code control group ‘interventions’. Studies have demonstrated that intervention effects are a function of what happens in the control group, and thus interpretation of effect size depends on understanding what happens in control groups.51 Nevertheless, our study focused on reporting of interventions and thus the focus was solely on intervention groups.
We used contemporary journal word limits and IFs rather than limits/IFs at the date of manuscript publication. We also imputed missing IFs and word limits. These decisions probably increase the chance of underestimating effect sizes.
Most pharmacy trials reviewed here lacked adequate intervention reporting. This diminished the applied and scientific value of the research and may stymie improvements in patient health. The standard of intervention reporting is, however, gradually increasing and appears somewhat better in journals with higher IFs. The use of the TIDieR checklist to improve reporting could enhance the utility and replicability of trials, and reduce research waste.
Contributors The idea of evaluating the trial reports using TIDieR comes from MJ; all authors contributed to the design. MidB and CS extracted the data, with support from MJ and MW. MidB performed the analysis with suggestions from MDB, MJ, NS, CB and CM. MidB wrote the manuscript with comments from all authors.
Funding Funding was provided by the Chief Scientist Office, grant number CZH/4/1041. MidB was also funded by the Professor Roy Weir Career Development Fellowship. MW was funded by a Health Foundation Improvement Science Fellowship.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. Here is the Open Science Framework link for the dataset: https://osf.io/a9mpw/