Abstract
Objectives Comprehensive protocols are key for the planning and conduct of randomised clinical trials (RCTs). Evidence of low reporting quality of RCT protocols led to the publication of the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) checklist in 2013. We aimed to examine the quality of reporting of RCT protocols from three countries before and after the publication of the SPIRIT checklist.
Design Repeated cross sectional study.
Setting Swiss, German and Canadian research ethics committees (RECs).
Participants RCT protocols approved by RECs in 2012 (n=257) and 2016 (n=292).
Primary and secondary outcome measures The primary outcomes were the proportion of reported SPIRIT items per protocol and the proportion of trial protocols reporting individual SPIRIT items. We compared these outcomes in protocols approved in 2012 and 2016, and built regression models to explore factors associated with adherence to SPIRIT. For each protocol, we also extracted information on general trial characteristics and assessed whether individual SPIRIT items were reported.
Results The median proportion of reported SPIRIT items among RCT protocols showed a non-significant increase from 72% (IQR, 63%–79%) in 2012 to 77% (IQR, 68%–82%) in 2016. However, in a preplanned subgroup analysis, we detected a significant improvement in investigator-sponsored protocols: the median proportion increased from 64% (IQR, 55%–72%) in 2012 to 76% (IQR, 64%–83%) in 2016, while for industry-sponsored protocols median adherence was 77% (IQR 72%–80%) for both years. The following trial characteristics were independently associated with lower adherence to SPIRIT: single-centre trial, no support from a clinical trials unit or contract research organisation, and investigator-sponsorship.
Conclusions In 2012, industry-sponsored RCT protocols were reported more comprehensively than investigator-sponsored protocols. After publication of the SPIRIT checklist, investigator-sponsored protocols improved to the level of industry-sponsored protocols, which did not improve.
- EPIDEMIOLOGY
- Clinical trials
- Protocols & guidelines
Data availability statement
Data are available upon reasonable request. Data underlying this article will be shared on reasonable request to the corresponding author.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
We had full access to randomised clinical trial protocols approved in 2012 and 2016 from all research ethics committees in Switzerland and from a convenience sample of one ethics committee in Germany and one in Canada.
The sample of trial protocols from Switzerland (n=397) was much larger than the sample from Germany (n=75) or Canada (n=77).
The results from multivariable beta regression and logistic regression models were robust in sensitivity analyses using methods outlined a priori in a previously published protocol.
All analyses were observational, so a causal effect of the published Standard Protocol Items: Recommendations for Interventional Trials checklist cannot be inferred.
All included trial protocols came from three high-income countries, limiting the generalisability of the results.
Introduction
Randomised clinical trials (RCTs) are directed by their protocol, which documents the rationale, design and planned reporting of a trial.1 Funding agencies, research ethics committees (RECs), regulatory agencies, medical journals, systematic reviewers and other groups rely on protocols to appraise the quality of a proposed trial.2 With incomplete protocols, reviewers typically cannot distinguish between the use of inappropriate methodology and the non-reporting of appropriate methodology. In addition, if details about the application of the trial intervention or situations with unblinding of trial participants are lacking, the resulting uncertainty among treating clinicians may compromise the safety of trial participants. Empirical evidence from meta-research suggested numerous limitations in the reporting of RCT protocols, including insufficient descriptions of treatment allocation methods, primary outcomes, sample size calculations, data analysis and the roles of sponsors in trial design or access to data.3–9 About half of protocols approved by French RECs, for instance, were estimated to have subsequent amendments to address deficiencies,10 and a third of amendments submitted to RECs for industry-sponsored trial protocols could have been avoided by preparing more comprehensive protocols.11 12
In response, a minimum set of items to be addressed in trial protocols was developed by the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) Initiative, and published in January 2013.13 14 Subsequently, a number of journals publishing trial protocols, funding agencies and RECs endorsed the use of SPIRIT or related recommendations (eg, www.swissethics.ch).15 Researchers have applied the SPIRIT checklist to assess the quality of trial protocols with respect to patient reported outcomes,16 statistical analyses17 and cluster-randomised trials with stepped wedge design.18 However, there is no large-scale empirical study that has longitudinally evaluated the impact of the SPIRIT recommendations on the quality of reporting among RCT protocols.
The Adherence to SPIrit REcommendations (ASPIRE) study group is an international collaboration of researchers with a mandate to (a) evaluate the completeness of RCT protocols before and after publication of the SPIRIT statement, (b) determine trial characteristics associated with non-adherence to SPIRIT checklist items and (c) investigate whether the comprehensiveness of RCT protocols varies across countries.19 In the present paper we report the results from our investigation of RCT protocols from Switzerland, CAnada and GErmany (ASPIRE-SCAGE).
Methods
The methods used to conduct the present study have previously been published.19
Identification of included trial protocols
We included trial protocols approved by RECs in 2012 or 2016 that assigned patients or groups of patients at random to one or more interventions to evaluate their effect on health outcomes. We excluded RCTs enrolling healthy volunteers, economic evaluations, animal studies, studies based on tissue samples, observational studies, studies involving only qualitative methods, and studies with a quasi-random method of allocation. Details of the identification of included RCT protocols are presented in online supplemental figure 1. The eligibility of RCT protocols was assessed independently and in duplicate. Any disagreements were resolved by discussion and consensus.
Data extraction
We used a web-based, password protected data extraction tool (http://squiekero.ch) for data collection and storage.19 20 Researchers trained in trial methodology completed a calibration process to improve reliability, and then extracted relevant data from RCT protocols independently and in duplicate, including whether individual SPIRIT items were reported.19 Disagreements were resolved by discussion. Due to limited resources 15% of included protocols were extracted by a single researcher (having extracted at least 100 RCT protocols in duplicate). All researchers extracting data from RCT protocols signed confidentiality agreements and the final database contained only coded data. Our data extraction forms are provided as online supplemental table 1.
Data analysis
The outcomes of interest were the proportion of SPIRIT checklist items that were reported among our cohorts of study protocols, and the proportion of RCT protocols addressing each SPIRIT checklist item. Our primary analysis was based on a rating approach that allowed for partial credit of individually met sub-items or components of major SPIRIT items, because it keeps the hierarchical structure of the SPIRIT checklist and it independently considers all components and sub-items of all individual SPIRIT items.19 Other rating approaches that consider major SPIRIT items only or equally consider items and sub-items, were used in sensitivity analyses. We provided descriptive statistics as frequencies and proportions for binary data and median (IQR) for continuous data.
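The three rating approaches can be illustrated with a minimal Python sketch. The exact scoring rules are defined in the previously published protocol19 and are not restated here, so the functions below are one plausible reading only: the item names, the example data and the rule that a major item counts as met only if all of its components are reported are assumptions made for illustration.

```python
# Hypothetical protocol record: major item -> list of booleans, one per
# sub-item or component (True = reported). Items that do not apply to a
# trial are simply left out. All values below are invented.
example_protocol = {
    "item_A": [True],                       # item without sub-items
    "item_B": [False, True, True, True],    # item with four sub-items
    "item_C": [False, False],               # item with two components
}


def partial_credit(protocol):
    """Primary-analysis style: each major item contributes the fraction
    of its components that were reported, and every major item carries
    the same weight (this keeps the hierarchy of the checklist)."""
    fractions = [sum(parts) / len(parts) for parts in protocol.values()]
    return sum(fractions) / len(fractions)


def major_items_only(protocol):
    """Sensitivity style 1 (assumed rule): a major item counts only if
    all of its components were reported."""
    met = [all(parts) for parts in protocol.values()]
    return sum(met) / len(met)


def items_and_subitems_equal(protocol):
    """Sensitivity style 2: pool all items and sub-items and weight
    each of them equally."""
    pooled = [part for parts in protocol.values() for part in parts]
    return sum(pooled) / len(pooled)


for score in (partial_credit, major_items_only, items_and_subitems_equal):
    print(score.__name__, round(score(example_protocol), 3))
```

The per-protocol proportions produced this way are what the median and IQR summarise across a cohort.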
To investigate whether the reporting quality of RCT protocols (as defined by the proportion of reported SPIRIT checklist items) has increased from 2012 to 2016, we conducted multivariable beta regression analysis21 with the proportion of SPIRIT items adhered to per protocol as dependent variable and the following predefined independent variables: (a) approval year (2012 vs 2016), (b) investigator sponsorship versus industry sponsorship, (c) planned sample size (in increments of 1000), (d) single centre versus multicentre RCTs and (e) reported methodological support from a contract research organisation (CRO) or clinical trials unit (CTU) versus no reported support. We included interaction terms in our model to investigate potential interactions of year of approval (2012 or 2016) with either sponsorship of protocols or reported methodological support. We performed a likelihood ratio test to check if the interaction terms improved the goodness of fit of the models. To examine in a sensitivity analysis whether the comprehensiveness of RCT protocols varied across countries we stratified the median proportion of addressed SPIRIT items per protocol by country (Switzerland, Canada, Germany), by year of approval (2012 vs 2016), and by sponsorship (investigator vs industry), and added a country variable to the regression model. In further sensitivity analyses, we used hierarchical logistic regression (response is a binary variable indicating adherence to each SPIRIT item with clustering by protocol; that is, independent variables were included in the model as fixed effects and the protocol as a random effect) instead of beta regression.19 Beta regression allowed us to directly model the proportion of SPIRIT items adhered to per protocol,21 while hierarchical logistic regression allowed us to capture the variability within protocols. For all types of regression analyses we reported coefficients or ORs accompanied by 95% CIs. 
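To make the beta regression model more concrete, the following Python sketch writes out the log-likelihood of a beta regression with a logit link in the commonly used mean-precision parametrisation. This is an assumption about the model form: the text states only that beta regression was used and that the analysis was run in R, so the parametrisation, the design-matrix columns and all numbers below are illustrative and do not reproduce the study's code or data.

```python
import math


def inv_logit(eta):
    """Map a linear predictor to a mean proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_logpdf(y, mu, phi):
    """Log density of a beta distribution with mean mu and precision phi
    (shape parameters a = mu * phi and b = (1 - mu) * phi)."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def log_likelihood(ys, rows, coefs, phi):
    """Sum of beta log densities, where each mean is the inverse logit
    of the linear predictor formed from one design-matrix row."""
    total = 0.0
    for y, row in zip(ys, rows):
        eta = sum(c * x for c, x in zip(coefs, row))
        total += beta_logpdf(y, inv_logit(eta), phi)
    return total


# Invented toy data: proportion of items reported per protocol.
ys = [0.64, 0.76]
# Hypothetical columns: intercept, approved in 2016 (0/1).
rows = [[1, 0], [1, 1]]

with_year = log_likelihood(ys, rows, coefs=[0.575, 0.577], phi=20.0)
without_year = log_likelihood(ys, rows, coefs=[0.575, 0.0], phi=20.0)
print(with_year > without_year)
```

In practice the coefficients and the precision would be estimated by maximising this log-likelihood with a dedicated package rather than set by hand. An interaction between year and sponsorship would enter as an extra column equal to the product of the two indicator columns.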
We used the statistical software R V.3.6.1 for all data analyses. All statistical tests were performed at a significance level of 0.05.
Patient and public involvement
No patients were involved in the study.
Results
Characteristics of included trial protocols
We included 549 RCT protocols in our study: 257 from 2012 and 292 from 2016 (table 1). The majority were individually randomised, multicentre, parallel-group, superiority trials in oncology or cardiovascular medicine, and approved by a Swiss REC. Seventy-seven RCT protocols were from Canada and 75 from Germany. About half of the protocols were investigator-sponsored and half were industry-sponsored. In 2016 there were more investigator-sponsored protocols (162/292, 55.5%) included than in 2012 (119/257, 46.3%). In 2016 the median planned sample size was lower (199; IQR, 100–490) than in 2012 (300; IQR, 100–720). Otherwise, trial characteristics were similar between cohorts. Protocols of industry-sponsored RCTs had, on average, a larger sample size, were predominantly multinational, and were more frequently placebo-controlled than those of investigator-sponsored RCTs (table 1).
Adherence to SPIRIT in protocols from 2012 and 2016
Overall, the median proportion of reported SPIRIT items increased from 72% (IQR, 63%–79%) in 2012 to 77% (IQR, 68%–82%) in 2016 (table 2, figure 1).
Stratifying by sponsorship, we found that the comprehensiveness increased only in investigator-sponsored RCT protocols (adherence stratified by other study characteristics can be found in online supplemental table 2). The median proportion of reported SPIRIT items in investigator-sponsored protocols increased from 64% (IQR, 55%–72%) in 2012 to 76% (IQR, 64%–83%) in 2016, while it remained unchanged at 77% for both years among industry-sponsored protocols (77%, IQR 72%–80% in 2012, and 77%, IQR 72%–82% in 2016). This pattern was consistent across countries (online supplemental table 3). Sensitivity analyses using different approaches to calculate the proportion of reported SPIRIT items provided similar results (online supplemental table 4).
Regarding individual SPIRIT items, we found that the improvement in investigator-sponsored RCT protocols was due to an improvement in a broad range of SPIRIT items (online supplemental table 5); for 25 individual items the proportion of adherent protocols improved in investigator-sponsored RCTs by 10% or more (online supplemental table 6). These 25 items included descriptive (eg, information on study registration, protocol version and date, name and contact details of sponsor) as well as methodological aspects (eg, comparator choice explained, or allocation concealment mechanism). The largest improvements occurred with ‘trial registration’ (SPIRIT item 2, +41.1%), ‘plans to disseminate trial results to key stakeholders/publication provided’ (SPIRIT item 31a, +36.7%), ‘description of process for making amendments’ (SPIRIT item 25, +34.4%) and ‘declaration of interests’ (SPIRIT item 28, +31.6%). In industry-sponsored protocols, adherence to individual SPIRIT items remained practically stable from 2012 to 2016, that is, items with low adherence in 2012 remained low in 2016. SPIRIT items with particularly low adherence (<30%) in both industry-sponsored and investigator-sponsored protocols were ‘names of protocol contributors/authors’ (SPIRIT item 5a), ‘research question described and justified’ (SPIRIT item 6a), ‘eligibility criteria for study centres’ (SPIRIT item 10) in applicable RCTs, ‘location of participant recruitment’ and ‘estimated recruitment rate’ (SPIRIT item 15), ‘information about who will have access to the full dataset’ (SPIRIT item 29) and ‘description of plans for granting access to full trial protocol’ (SPIRIT item 31c) (online supplemental table 5).
Multivariable regression analysis
Using multivariable beta regression, we found that four characteristics were independently associated with greater reporting of SPIRIT items: multicentre RCTs (OR, 1.18; 95% CI 1.05 to 1.33; p=0.006), RCTs with reported methodological support from CTUs or CROs (OR, 1.44; 95% CI 1.31 to 1.57; p<0.001), industry-sponsored RCTs (OR, 1.20; 95% CI 1.09 to 1.33; p<0.001) and RCTs approved in 2016 (OR, 1.33; 95% CI 1.22 to 1.45; p<0.001) (online supplemental table 7, figure 2).
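As a reading aid, the reported ORs can be translated to the proportion scale. The sketch below assumes a logit link, under which an OR multiplies the odds of the mean proportion of reported items. The baseline of 0.70 is a hypothetical value chosen for illustration and is not an estimate from the study; the function name is likewise invented.

```python
def apply_odds_ratio(baseline, odds_ratio):
    """Return the mean proportion obtained when the odds of `baseline`
    are multiplied by `odds_ratio` (logit-link interpretation)."""
    odds = baseline / (1.0 - baseline) * odds_ratio
    return odds / (1.0 + odds)


# ORs as reported in the text for the multivariable beta regression.
reported_ors = {
    "multicentre": 1.18,
    "CTU or CRO support": 1.44,
    "industry sponsorship": 1.20,
    "approved in 2016": 1.33,
}

baseline = 0.70  # hypothetical reference proportion, not from the study
for name, odds_ratio in reported_ors.items():
    shifted = apply_odds_ratio(baseline, odds_ratio)
    print(f"{name}: {baseline:.2f} -> {shifted:.3f}")
```

Because the shift depends on the baseline, the same OR corresponds to a smaller absolute change when adherence is already high.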
Adding the interaction term of year of approval and sponsorship to the model improved the model fit (likelihood ratio test, χ2=30.01, p<0.01) and provided evidence for a differential improvement in the adherence to SPIRIT over time (2012 vs 2016) for industry-sponsored and investigator-sponsored protocols, suggesting that there was an improvement in investigator-sponsored protocols but not in industry-sponsored protocols (interaction p<0.001). We did not find evidence for an interaction between year of approval and CTU/CRO support (interaction p=0.79), that is, protocols with or without reported support from CTUs or CROs improved to a similar extent from 2012 to 2016. Limiting our multivariable regression to investigator-sponsored protocols in an exploratory analysis, we found a notable interaction suggesting a more pronounced improvement in Swiss protocols compared with protocols from Canada or Germany (interaction p=0.032; online supplemental table 8). Sensitivity analyses using hierarchical logistic regression instead of beta regression confirmed all results.
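The likelihood ratio test compares the log-likelihoods of the models with and without the interaction term. A minimal Python sketch follows, assuming the test has one degree of freedom (a single interaction term between two binary variables); the degrees of freedom are not stated in the text, and the function names are invented for this example.

```python
import math


def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic: twice the gain in log-likelihood
    from the reduced model to the full model."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of
    freedom. With Z standard normal, P(Z**2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


# Statistic reported in the text; 1 degree of freedom is an assumption.
p_value = chi2_sf_1df(30.01)
print(p_value < 0.01)
```

Under this assumption the statistic of 30.01 lies far in the tail of the reference distribution, which is in line with the reported p<0.01.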
Discussion
Main findings and interpretation
This study of 549 RCT protocols approved by RECs in Switzerland, Canada and Germany before (2012) and after (2016) the publication of the SPIRIT recommendations suggested a small overall improvement in reporting comprehensiveness. This change was driven by an increase in the median proportion of reported SPIRIT items in investigator-sponsored RCTs from 64% in 2012 to 76% in 2016. Protocols of industry-sponsored RCTs remained, on average, unchanged (median of 77% SPIRIT items reported in both years). The reporting of investigator-sponsored protocols improved for the majority of individual SPIRIT items, and was independent of the planned sample size, reported support from a CTU or CRO, and centre status (single centre vs multicentre) of RCTs. Single centre status, no reported support from a CTU or CRO, investigator sponsorship, and approval in 2012 were independently associated with lower adherence to the SPIRIT checklist. These results were similar across countries, but the improvement in investigator-sponsored RCT protocols appeared more pronounced among Swiss protocols compared with protocols approved in Canada or Germany. SPIRIT items with particularly low adherence in investigator-sponsored and industry-sponsored protocols concerned the names of protocol contributors/authors, the justification of the research question, details about the planned participant recruitment, information about who will have access to the full dataset, and plans for granting access to the full trial protocol.
Our findings suggest an international improvement in the comprehensiveness of investigator-sponsored RCT protocols probably due to a combination of reasons including the publication of the SPIRIT checklist and its implementation by research institutions, funding agencies and medical journals; the ongoing discussion about the importance of protocol publication, thoughtful planning of RCTs, and minimising reporting biases in the scientific community; and efforts to teach RCT methodology to clinician scientists in undergraduate and postgraduate courses. The more pronounced improvement of Swiss investigator-sponsored protocols could be related to a SPIRIT-based protocol template and guidance provided by swissethics22 that were particularly welcomed by academic researchers or other changes in the context of the new Swiss legislation on human research from 2014.
Strengths and limitations
Strengths of our study include full access to RCT protocols and associated documents from RECs in three countries. We used standardised methods and procedures for data extraction and protocol assessment at all RECs and involved only trained methodologists in this process. This included use of piloted extraction forms with detailed written instructions as well as calibration exercises with all data extractors. More than 95% of included protocols approved in 2012 and over 80% of protocols approved in 2016 were extracted independently and in duplicate. To minimise chance associations, we considered only a limited number of variables in our statistical models.23 Our results proved robust in sensitivity analyses applying alternative assumptions and statistical approaches. The fact that all Swiss RECs participated in this study strengthens the representativeness of our data for Switzerland and the additional inclusion of a German and a Canadian REC allowed for an international comparison to some extent.
Our study has several limitations. First, we used a convenience sample of two RECs outside of Switzerland (Freiburg in Germany, Hamilton in Canada), and we cannot be certain that they are representative of other RECs in these or other countries. Second, we used RCT protocols that had already been approved by RECs; therefore, SPIRIT items such as ‘research ethics approval’ and ‘consent forms provided’ were always fulfilled and could not discriminate more comprehensive from less comprehensive protocols. Third, although we had adequate statistical power to detect even interactions within the subgroup of investigator-sponsored protocols, the number of included protocols approved outside of Switzerland was relatively small (28%; 152/549), limiting the precision of estimates for German and Canadian protocols. Fourth, 15% of included protocols were not evaluated in duplicate, which could have increased the risk of bias in our study. However, these protocols were from different RECs in Switzerland and they were handled by one of the two most experienced data extractors only, so we feel that a relevant increase in the risk of bias is unlikely. Fifth, we are not aware that any of the participating RECs explicitly endorsed SPIRIT guidance; however, in Switzerland a new protocol template provided by swissethics, which was influenced by SPIRIT, became available, and this may limit the generalisability of our results. In addition, it remains unclear to what extent our findings can be extrapolated to trial protocols from middle-income or low-income countries and to protocols from medical disciplines underrepresented in our sample (eg, dentistry or geriatrics; online supplemental table 9). Finally, our findings are not proof of causality due to the observational nature of this study; however, it is plausible that the publication of the SPIRIT statement at least contributed to an increase in the comprehensiveness of investigator-sponsored protocols.
Investigations of a potential time trend with gradually increasing comprehensiveness of investigator-sponsored protocols by year tertiles did not suggest a continuous development, but rather a one-step-effect (online supplemental figure 2).
Comparison with other studies
Few studies in the literature have used16 or planned to use17 18 24 the SPIRIT checklist as a tool to assess the comprehensiveness of trial protocols. One study investigated 75 RCT protocols from the UK National Institute for Health Research Health Technology Assessment programme on the reporting of patient-reported outcomes and the association with general protocol completeness according to SPIRIT.16 They found that these investigator-sponsored UK RCT protocols from 2012 and 2013 reported, on average, 63% of SPIRIT checklist items, which is very similar to our findings for investigator-sponsored RCT protocols from 2012. Apart from the ongoing study using protocols from UK RECs (ASPIRE-UK19), we are not aware of any other study evaluating the comprehensiveness of RCT protocols before and after the publication of the SPIRIT statement in industry-sponsored and investigator-sponsored protocols.
There are studies assessing the quality of RCT protocols using measures other than the SPIRIT checklist. An analysis of drug trial protocols submitted to three Dutch RECs in 2010/2011 focused on critical comments by RECs.25 They found that applicants of investigator-sponsored trials received more critical comments on participant selection, methodology and statistical analysis than applicants of industry-sponsored trials, resonating with our findings of less comprehensive investigator-sponsored protocols compared with industry protocols in 2012. Similarly, studies by Getz et al used the proportion of protocols with substantial amendments as a measure of RCT protocol quality in the industry setting showing that more comprehensive protocols could have prevented amendments.11 12 Finally, a study of 596 published RCT protocols from 2001 to 2011 assessed protocol quality (high vs low) based on reporting of the allocation method, allocation concealment and intention-to-treat analysis.26 This study found a substantial improvement in some methodological aspects of protocols (eg, allocation concealment), but acknowledged the overall low proportion of high-quality protocols with 24% in 2001–2004, 31% in 2005–2008 and 37% in 2008–2011.
Implications
Incomplete protocols may jeopardise the clinical research process at all stages with potentially harmful consequences for patients, decision-makers in healthcare, the scientific community and society as a whole. Whether there is indeed an association between better reported or more comprehensive RCT protocols and better methodology, successful trial conduct and/or publication of RCTs remains to be established. Based on the RCT sample of this study, we will examine the relationship between completeness of RCT protocols and risks for premature discontinuation or non-publication of RCTs as well as potential improvements between 2012 and 2016 in terms of fewer trial discontinuations and non-publications particularly for investigator-sponsored RCTs in subsequent investigations.19
Our results show improvement in the reporting quality of investigator-sponsored trial protocols such that they became consistent with industry protocols. We can only speculate about why industry protocols did not improve according to SPIRIT between 2012 and 2016. It might be that routines and processes for writing trial protocols were already well established at companies, which would explain our finding of consistently low adherence to specific SPIRIT items in 2012 and 2016 in industry-sponsored protocols. As long as regulators do not make specific protocol templates mandatory for all applicants, industry may not adapt routines and templates according to SPIRIT.
Our finding of insufficient reporting of names of protocol contributors/authors, the justification of the research question, details about the planned participant recruitment, information about who will have access to the full dataset, and plans for granting access to the full trial protocol guides involved stakeholders with respect to further needs for protocol improvement. The identified items constitute important pieces of information to enable a valid assessment of the relevance, feasibility and transparency of planned clinical trials.
Conclusions
This before-and-after study suggests that the comprehensiveness of investigator-sponsored RCT protocols from Switzerland, Canada and Germany improved after publication of the SPIRIT checklist, achieving a similar reporting quality as industry-sponsored protocols. Single centre status, no reported support from a CTU or CRO, investigator sponsorship, and approval in 2012 were independently associated with lower adherence to SPIRIT. Further means are needed to improve the reporting of RCT protocols particularly with respect to protocol authorship, justification of the research question, participant recruitment, access to the full dataset and plans for granting access to the full trial protocol. Future research should clarify the relationship between protocol quality and success of subsequent trial conduct and publication.
Ethics statements
Patient consent for publication
Ethics approval
This study does not involve human participants. The participating RECs in Switzerland (Basel, Bern, Geneva, Lausanne, St. Gallen, Thurgau, Ticino, Zurich), Germany (Freiburg) and Canada (Hamilton) approved this study or explicitly stated that no ethical approval was required.
Acknowledgments
We are grateful to Professor Doug Altman (University of Oxford) who was instrumental in developing the initial concept of the Adherence to SPIrit REcommendations (ASPIRE) study and who sadly died before it came to fruition. We thank all participating Research Ethics Committees from Germany (Freiburg), Switzerland (Basel, Ticino, Bern, Geneva, Lausanne, St. Gallen, Thurgau, Zurich), Canada (Hamilton) and the UK (National Health Service Health Research Authority) for their support and cooperation.
Footnotes
Twitter @AlAinAmstutz, @JasonWBusse, @LGHemkens
Contributors AO, SH, EVE, BK and MB conceived of the study. EVE and MB acquired funding. RS developed the web-tool for data extractions. DG, BvN, BS and MB coordinated data extraction from protocols. DG, GM and MB developed the data analysis plan and interpreted the data. DG performed the data analysis. MB and DG wrote the first draft of the manuscript. DG, BvN, BS, BK, EO-R, AB, SSchandelmaier, DM, YT, AAmstutz, CP-M, VG, KB, KW, LR, SL, JJM, AN, KK, NG, ATH, JW, NC, PH, KAM, SSricharoenchai, JWB, AAgarwal, MS, LH, SH, KW, EVE and MB were involved in data collection and critically revised the manuscript. All authors approved the final version before submission. DG and MB act as guarantors.
Funding This work was supported by the Swiss Federal Office of Public Health. The funder has no role in the study design, data collection and analysis, decision to publish or preparation of this manuscript. BS is supported by an Advanced Postdoc.Mobility grant from the Swiss National Science Foundation (P300PB_177933). SL participates in this project during her research stay at the Institute for Evidence in Medicine, University of Freiburg, supported by the Alexander von Humboldt Foundation, Germany.
Competing interests BvN is currently employed by Roche Pharma AG, Grenzach-Wyhlen, Germany. BK is currently employed by iOMEDICO AG, Freiburg, Germany. All other authors declare no financial relationships with any organisation that might have an interest in the submitted work and no other relationships or activities that could appear to have influenced the submitted work.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.