Reporting of planned statistical methods in published surgical randomised trial protocols: a protocol for a methodological systematic review
  1. Kim Madden1,
  2. Erika Arseneau1,
  3. Nathan Evaniew2,
  4. Christopher S Smith3,
  5. Lehana Thabane1
  1. 1Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada
  2. 2Division of Orthopaedic Surgery, McMaster University, Hamilton, Ontario, Canada
  3. 3OrthoEvidence Inc., Burlington, Ontario, Canada
  1. Correspondence to Kim Madden; maddenk{at}


Introduction Poor reporting can lead to inadequate presentation of data, confusion regarding research methodology used, selective reporting of results, and other misinformation regarding health research. One of the most recent attempts to improve quality of reporting comes from the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) Group, which makes recommendations for the reporting of protocols. In this report, we present a protocol for a systematic review of published surgical randomised controlled trial (RCT) protocols, with the purpose of assessing the reporting quality and completeness of the statistical aspects.

Methods We will include all published protocols of randomised trials that investigate surgical interventions. We will search MEDLINE, EMBASE, and CENTRAL for relevant studies. Author pairs will independently review all titles, abstracts, and full texts identified by the literature search, and extract data using a structured data extraction form. We will extract the following: year of publication, country, sample size, description of study population, description of intervention and control, primary outcome, important methodological qualities, and quality of reporting of planned statistical methods based on the SPIRIT guidelines.

Ethics and dissemination The results of this review will demonstrate the quality of statistical reporting of published surgical RCT protocols. This knowledge will inform recommendations to surgeons, researchers, journal editors and peer reviewers, and other knowledge users that focus on common deficiencies in reporting and how to rectify them. Ethics approval for this study is not required. We will disseminate the results of this review in peer-reviewed publications and conference presentations, and at a doctoral independent study oral defence.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Strengths and limitations of this study

  • A systematic and thorough literature search.

  • Duplicate screening and data extraction.

  • The use of pre-existing criteria for statistical reporting quality.

  • We do not plan to include grey literature in this review, only published protocols.


The primary goal of clinical research should be the effective translation of knowledge into practice. There are currently a number of aspects that can limit the effective translation of quality scientific knowledge into clinical practice, and these may present potential issues to both patient and clinician safety. The quality of reporting in medical literature has increasingly become an area of concern for researchers, journals, and knowledge users. Poor reporting can lead to inadequate presentation of data, confusion regarding research methodology used, selective results reporting, and other misinformation regarding health research.1 ,2 Additionally, methodological assessments evaluating the potential risk of bias of a study are dependent on how well authors describe their methods.3 Users of medical literature are limited in their ability to properly critically appraise reports of interventional trials if the methodology is unclear and this can impact the efficacious use of clinical research to guide patient care. There is some evidence that trials that adhere to reporting guidelines have a higher number of citations,4 possibly indicating that better reporting quality leads to improved knowledge uptake.

Beginning with the original CONSORT statement in 1996,2 ,5 there have been many attempts to standardise and improve reporting of different types of studies through guidelines. A recent scoping review found that the vast majority of studies adhere poorly to reporting guidelines; however, studies in higher impact journals have better adherence.6 Many top-tier journals now require adherence to these guidelines as a condition of publication,7 which in turn may lead to improved knowledge dissemination and uptake. Given that it has become increasingly popular to publish protocols of studies, randomised controlled trials (RCTs) in particular, it is important that reporting guidelines for study protocols be developed. The Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) Group was founded to systematically develop evidence-based guidelines on how to properly report important study elements in trial protocols.1 In 2013, the SPIRIT Group published a checklist to establish minimum requirements for reporting key study details in protocols with the aim of assisting researchers to publish protocols, improve protocol review by ethics committees and granting agencies, and improve the transparency of trial methodology.1 The SPIRIT Group also published an explanation and elaboration statement that provides detailed rationale and examples.8

Statistical methodology reporting is a particular area of concern. For example, among papers published in a major psychological journal, only 17% adequately defined their primary or secondary outcome analyses.9 Similar studies have also revealed statistical reporting deficiencies in diabetes research,10 trials using logistic regression,11 obstetrics and gynaecology,12 and other fields. Chan et al13 found that fewer than 20% of randomised trials reported all required aspects of a sample size calculation in both the protocol and published paper. Additionally, there were many unexplained discrepancies between the statistical methods reported in the protocol and those in the published paper, which led the authors to recommend improving the content of protocols and making them widely available.13 We hypothesise that statistical reporting in protocols is poor, perhaps because it takes highly specialised knowledge and experience to develop a sound statistical plan in the initial planning stages of a trial. To our knowledge, no studies to date have investigated the reporting quality of surgical RCT protocols, especially with a focus on the statistical methods used.


This report is a protocol for a systematic review. The objectives of the systematic review are to assess the reporting quality (completeness) of planned statistical aspects of published surgical RCT protocols. We also aim to determine factors associated with higher reporting quality. Our overarching goal is to determine which aspects of reporting are commonly deficient so that we can make recommendations to improve future research.


Study design

This study will be a systematic review of RCT protocols. This protocol adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) checklist.14

Eligibility criteria

We will include all published protocols of RCTs that investigate surgical interventions. Eligible studies will meet the following criteria:

  1. The study must be a protocol or methods paper (a report of the rationale and design of the trial that does not report trial efficacy or effectiveness outcomes).

  2. It must report on an RCT, including cluster RCTs, factorial RCTs, and other RCT designs.

  3. The RCT must investigate efficacy or effectiveness of a surgical intervention. We will define surgery as any interventional procedure that changes anatomy and requires a skin incision or the use of endoscopic techniques.15 We will exclude trials of perioperative pharmacological interventions, postsurgical rehabilitation, and interventional radiology procedures. We will include trials of devices and instruments if these were used in surgical procedures.

  4. The RCT must not be a pilot or feasibility trial. We define a pilot or feasibility trial as one where the primary objective is to determine the feasibility of conducting a definitive efficacy or effectiveness intervention trial (ie, the primary outcome is not efficacy or effectiveness).16

  5. The protocol must be published in January 2005 or later.

No exclusions will be made on the basis of geographic region, patient population, or language.


We will include surgical interventions from any surgical subspecialty. Specifically, we will include studies investigating two different surgical treatments (eg, total hip arthroplasty vs hemiarthroplasty17), surgical treatments versus non-surgical treatments (eg, tonsillectomy vs conservative management18), or early surgery versus delayed surgery/timing of surgery (eg, early vs standard timing of surgery for subarachnoid haemorrhage19).

Search strategy

We will search MEDLINE, EMBASE, and CENTRAL for relevant studies. Our full proposed search strategies can be found in tables 1–3. For MEDLINE and EMBASE, we will use a modification of the Scottish Intercollegiate Guidelines Network (SIGN) RCT filters. We enlisted the assistance of a professional health sciences librarian to develop these search strategies.

Table 1

MEDLINE search strategy

Table 2

EMBASE search strategy

Table 3

CENTRAL search strategy

Reference lists

We will manually search reference lists of all included studies to search for potentially missed protocols. Relevant articles from the reference lists will be reviewed in duplicate, and included as appropriate.

Grey literature

We will not attempt to include grey literature, such as conference abstracts or unpublished reports, because the objective of this review is to assess the reporting quality of fully published protocols.


We will use Distiller SR online software to create a study database. Distiller SR can store titles, abstracts, and full texts of references imported from reference management software, and it allows for independent review of titles, abstracts, and full texts in duplicate. Additionally, data can be directly extracted into forms created within the software.


All article reviewers will be trained by the lead author (KM) to ensure reviews are consistent across reviewers. All reviewers will be provided with a manual of instructions, including screenshots from the Distiller SR software and examples.

Author pairs will independently review all titles identified by the literature search. They will be instructed to err on the side of inclusion (ie, if they are unsure of whether the reference should be included, they will be instructed to include it). We will not resolve conflicts or report agreement statistics at the title review stage. Instead, all titles that at least one reviewer marks for inclusion will be included for the abstract review stage. Two reviewers, one methods expert and one surgical expert, will independently review all abstracts and full texts that were identified as possibly relevant in the title review stage. Disagreements at the abstract and full-text review stage will be resolved by discussion towards consensus or by consulting a surgery expert (NE), if necessary. Excluded articles will be noted in a study flow diagram with reasons for exclusion. We will report reviewer agreement for inclusion at the title and abstract/full-text stage using the weighted κ-statistic calculated by Distiller SR.
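The screening rule and agreement statistic described above can be illustrated with a minimal sketch. It is not the Distiller SR calculation: the function names and the screening decisions are hypothetical, and the sketch computes the unweighted Cohen's κ, which coincides with the weighted κ when there are only two categories (include or exclude).

```python
from collections import Counter


def titles_to_advance(decisions_a, decisions_b):
    """Indices of titles that at least one reviewer marked for inclusion."""
    return [i for i, (a, b) in enumerate(zip(decisions_a, decisions_b)) if a or b]


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two reviewers' categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of records on which the two reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reviewer's marginal proportions.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions (True = include) for ten records.
reviewer_a = [True, True, False, False, True, False, True, False, True, True]
reviewer_b = [True, False, False, False, True, False, True, True, True, True]

advanced = titles_to_advance(reviewer_a, reviewer_b)
kappa = cohen_kappa(reviewer_a, reviewer_b)
print(len(advanced), round(kappa, 3))
```

In this illustration, a record moves forward whenever either reviewer includes it, which mirrors the instruction to err on the side of inclusion.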

Data extraction

Author pairs will independently extract data using structured data extraction forms designed for this review. We will pilot test the data extraction form on five randomly selected full texts before proceeding with full data extraction to ensure all reviewers extract data consistently, and to ensure the data extraction form is unambiguous and free from errors. We will extract the following: year of publication, country, sample size, description of study population, description of intervention and control, primary outcome, important methodological characteristics, and quality of reporting of the planned statistical methods. Our proposed data extraction form can be found in online supplementary appendix A. We will attempt to contact corresponding authors by email to clarify missing or unclear study characteristics, as needed. However, we will not attempt to contact authors for unreported methodological or statistical aspects, as the purpose of the study is to determine reporting quality.

Primary outcome: evaluation of statistical reporting quality

We will use the statistical methods section of the SPIRIT checklist—modified to include a higher level of detail based on the SPIRIT explanation and elaboration document8—to evaluate quality of statistical methods reporting. Two independent reviewers with master's level statistical training or higher will determine whether the following items are adequately reported (see online supplementary appendix A):

  • Primary analysis—Protocols should report the planned statistical test for the primary outcome, the intended effect measure (eg, HR, relative risk), significance level, and intended use of precision measures.

  • Secondary analyses—Protocols should report the planned statistical test(s) for the secondary outcome(s), the intended effect measure (eg, HR, relative risk), significance level, and intended use of precision measures.

  • Sample size—Protocols should report the planned number of participants to be included in the trial along with the methods used to determine that number, and an explanation of the assumptions underlying the calculations. Cluster, factorial, crossover, non-inferiority, adaptive designs, and other special designs must have additional explanations for their sample size based on their design (eg, intracluster correlation coefficient for cluster trials).

  • Subgroup analyses—Protocols should report the number of planned subgroup analyses, variables analysed, rationale, definitions of subgroup categories, and whether a test of interaction is planned. If no subgroup analyses are planned, this should also be explicitly stated.

  • Adjusted analyses—Protocols should specify whether they plan to conduct adjusted analyses, and which variables will be used or which objective criteria will determine variable selection. If adjusted analyses include continuous variables, protocols should state how they will be handled. If no adjusted analyses are planned, this should be explicitly stated.

  • Sensitivity analyses—Protocols should report whether sensitivity analyses are planned, and the methods of planned sensitivity analyses. If no sensitivity analyses are planned, this should be explicitly stated.

  • Interim analysis—Protocols should report whether they plan to conduct interim analyses, the timing of the planned interim analyses, what data will be analysed, and whether any adaptations will be made to the study design based on interim analysis results. If interim analyses will only be conducted at the request of a Data Monitoring Committee (or similar), this should be explicitly stated.

  • Stopping guidelines—Protocols should report the statistical criteria or decision criteria for stopping trials for reasons of futility and harm/benefit, and state who has the final authority over stopping the trial early.

  • Analysis population—Protocols should explicitly state their intended analysis population (eg, all randomised participants regardless of protocol adherence, only participants who adhered to the protocol, etc). Use of the terms ‘per protocol’ and ‘intention-to-treat’ is inadequate unless further defined, as they are often used ambiguously.

  • Missing data—Protocols should state methods planned to account for missing data (eg, multiple imputation). If the investigators do not plan to use statistical methods to handle missing data, this should be explicitly stated.

  • Multiplicity—Protocols that include comparisons of more than two groups, or measuring the same variable at several time points, or planned interim analyses, or studies evaluating more than one outcome should state any methods planned to account for multiple testing.

Disagreements at the data extraction stage will be resolved by discussion towards consensus or by consulting a senior statistician (LT) if consensus cannot be reached.
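As a rough illustration of duplicate extraction against the checklist above, the following sketch flags the items on which two reviewers differ and counts the adequately reported items once consensus is reached. The item keys, function names, and judgements are hypothetical and are not taken from the data extraction form in appendix A.

```python
# Hypothetical keys for the eleven checklist items listed above.
CHECKLIST_ITEMS = [
    "primary_analysis", "secondary_analyses", "sample_size",
    "subgroup_analyses", "adjusted_analyses", "sensitivity_analyses",
    "interim_analysis", "stopping_guidelines", "analysis_population",
    "missing_data", "multiplicity",
]


def find_disagreements(extraction_a, extraction_b):
    """Checklist items on which the two reviewers' judgements differ."""
    return [item for item in CHECKLIST_ITEMS
            if extraction_a[item] != extraction_b[item]]


def adequately_reported_count(consensus):
    """Number of checklist items judged adequately reported."""
    return sum(1 for item in CHECKLIST_ITEMS if consensus[item])


# Hypothetical judgements for one protocol (True = adequately reported).
reviewer_a = {item: False for item in CHECKLIST_ITEMS}
reviewer_a.update(primary_analysis=True, sample_size=True, analysis_population=True)
reviewer_b = dict(reviewer_a, analysis_population=False, missing_data=True)

to_discuss = find_disagreements(reviewer_a, reviewer_b)

# Assumed outcome of the consensus discussion, for illustration only.
consensus = dict(reviewer_a, missing_data=True)
score = adequately_reported_count(consensus)
print(to_discuss, score)
```

Only the flagged items would need discussion or referral to the senior statistician; the per-protocol count is the kind of quantity that can later serve as a regression outcome.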

Other methodological characteristics

Author pairs will independently assess important methodological characteristics for each included study. We will extract selected items from the Cochrane Risk of Bias tool supplemented with items from the OrthoEvidence risk of bias tool, which is specific to surgical trial methodology. We removed the items for attrition bias (incomplete outcome data) and reporting bias (selective outcome reporting) because there are no results reported in protocols, and we added items for expertise bias, outcome selection, planned sample size, planned statistical analyses, and potential conflicts of interest. We did not use the SPIRIT guidelines for the overall methodological quality because SPIRIT focuses only on correct reporting of items, not on whether the items could contribute to bias.

Data analysis

Refer to table 4 for a summary of the analysis plan.

Table 4

Planned statistical analysis

Primary analysis

The primary analysis will be a descriptive analysis of statistical reporting quality. For each item, we will report the frequency and percentage of protocols in which the item is adequately reported and not reported.
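A minimal sketch of this descriptive summary is shown below. The function name, the two item keys, and the four example protocols are hypothetical; the real analysis would cover every checklist item and every included protocol.

```python
def item_frequencies(protocols, items):
    """Per item: (n reported, % reported, n not reported, % not reported)."""
    total = len(protocols)
    if total == 0:
        raise ValueError("at least one protocol is required")
    summary = {}
    for item in items:
        reported = sum(1 for p in protocols if p[item])
        not_reported = total - reported
        summary[item] = (
            reported, 100 * reported / total,
            not_reported, 100 * not_reported / total,
        )
    return summary


# Hypothetical consensus judgements for four protocols and two items.
protocols = [
    {"sample_size": True, "missing_data": False},
    {"sample_size": True, "missing_data": True},
    {"sample_size": True, "missing_data": False},
    {"sample_size": False, "missing_data": False},
]

summary = item_frequencies(protocols, ["sample_size", "missing_data"])
for item, (n_yes, pct_yes, n_no, pct_no) in summary.items():
    print(f"{item}: reported {n_yes} ({pct_yes:.0f}%), not reported {n_no} ({pct_no:.0f}%)")
```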

Secondary analyses

We will perform a Poisson regression analysis (or negative binomial regression, if required, to account for overdispersion) to determine which study characteristics are associated with greater statistical reporting. The dependent variable will be the count of adequately reported items on the extended SPIRIT statistical checklist (see online supplementary appendix A). Independent variables will include the following that have been linked to reporting quality in previous studies: (1) year;20 (2) journal impact factor;6 ,20 (3) industry funding (binary);6 ,20 (4) sample size;6 (5) statistician author listed versus no statistician author listed (binary) and (6) multicentre versus single centre (binary).6 We will also include the following independent variables, which we hypothesise are associated with reporting quality, if they reach statistical significance (p<0.05) in the model with all variables included: (1) number of authors; (2) whether the authors reported a trial registration number and (3) type of surgical study. We will use the likelihood ratio test to determine whether to use the full or reduced model. We will use variance inflation factors (VIF) to test for multicollinearity (defined as VIF>10). Results will be reported as β coefficients with 95% CIs and p values for each included variable, plus overall model fit statistics.
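To make the modelling step concrete, the sketch below fits a Poisson regression with a single binary covariate by Newton-Raphson and computes the Pearson dispersion statistic, where values well above 1 would point towards the negative binomial alternative. It is an illustration under stated assumptions, not the planned analysis: the data are invented, only one covariate (multicentre versus single centre) is used for brevity, and the hand-written fitting routine stands in for a statistical package.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def pearson_dispersion(x, y, b0, b1, n_params=2):
    """Pearson chi-square divided by residual degrees of freedom."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


# Hypothetical data: 1 = multicentre, 0 = single centre; counts of
# adequately reported checklist items for eight protocols.
multicentre = [0, 0, 0, 0, 1, 1, 1, 1]
items_reported = [3, 4, 5, 4, 6, 7, 5, 6]

b0, b1 = fit_poisson(multicentre, items_reported)
rate_ratio = math.exp(b1)
dispersion = pearson_dispersion(multicentre, items_reported, b0, b1)
print(round(b1, 3), round(rate_ratio, 2), round(dispersion, 3))
```

With a single binary covariate, the fitted means equal the group means, which gives a simple check on the routine. Confidence intervals, the likelihood ratio test, and VIFs are left out of this sketch.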

Additionally, we will conduct an exploratory analysis of whether statistical reporting has improved after publication of the SPIRIT statement (ie, 2013 and later). We do not have any subgroup or sensitivity analyses planned. We do not anticipate any issues with multiplicity. We will not impute missing data in the regression analyses; since we anticipate very little missing data, we will use a complete case analysis.

Updates and amendments

Updates and amendments to this protocol (if applicable) will be summarised in the final systematic review manuscript.


Reporting quality is important because it is required for transparency and critical appraisal, and therefore is necessary for meaningful application of results to practice. Our overall goal is to make recommendations to improve future research. The results of this review will allow researchers, journal editors, and knowledge users to understand the quality of statistical reporting of published surgical RCT protocols. This knowledge will allow us to make recommendations regarding how surgeons and researchers should report their statistical methods in protocols, and it will enable journal editors and peer reviewers to focus on common errors and omissions in reporting and rectify them.

Ethics and dissemination

Ethics approval for this study is not required. We will disseminate the results of this review in peer-reviewed publications and conference presentations, and at a doctoral independent study oral defence.


The authors would like to acknowledge Ms Neera Bhatnagar for her assistance with the development of the search strategy. The screening and abstraction team members are: Caley Laxer (Team Leader), Mark Gichuru, Kerry Tai, Rajeev Jetly, Melissa Spadafora, Winston Dang, Eveline Dieleman, Achint Bajpai, Lydia Ginsburg and Brendan Sales.



  • Contributors KM and LT conceived the study design. EA, CSS and NE refined the study design. KM drafted the manuscript, and all the authors revised for critical content. All the authors reviewed and approved the final draft. KM and LT will act as guarantors for this study.

  • Funding KM and NE are funded by Canadian Institutes of Health Research (CIHR) Doctoral Scholarships (grant number GSD134929).

  • Disclaimer The funders had no role in designing, drafting, or approving this protocol.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.