Abstract
Background The Cochrane risk of bias tool is a prominent instrument used to evaluate potential biases in clinical trials. In three updates of our Cochrane review on neuraminidase inhibitors, we assessed risk of bias on the same trials using different levels of detail: the trials in journal publications, in core reports, and in full clinical study reports. Here we analyse whether progressively greater amounts of information and detail in full clinical study reports (including trial protocols, statistical analysis plans, certificates of analyses, individual participant data listings and randomisation lists) affected our risk of bias assessments.
Methods and findings We used the Cochrane risk of bias tool to assess and compare risk of bias in 14 oseltamivir trials (reported in 10 clinical study reports) obtained from the European Medicines Agency (EMA) and the manufacturer, Roche. With more detailed information, reported in clinical study reports, no previous assessment of ‘high’ risk of bias was reclassified as ‘low’ or ‘unclear’ in the main analysis, and over half (55%, 34/62) of the previous assessments of ‘low’ risk of bias were reclassified as ‘high’. Most assessments of ‘unclear’ risk of bias (67%, or 28/42) were reclassified as ‘high’ risk of bias when our judgements were based on full clinical study reports. The limitations of our study were our relative inexperience in dealing with large information sets, the sometimes subjective nature of bias judgements and our focus on industry trials. Comparison with journal publications was not possible because of the low number of trials published.
Conclusions We found that as the amount of information in the documents increased, so did the risk of bias we assessed. This may mean that risk of bias has been insufficiently assessed in Cochrane reviews based on journal publications.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
The availability of full clinical study reports decreased the uncertainty of bias judgements and allowed clearer judgements to be made.
The availability of full clinical study reports allows reviewers to follow consistency across chapters and appendices, creating a need for far more interaction with the text.
Our relative inexperience in dealing with large quantities of information and our lack of familiarity with certain trial documents may limit our ability to assess risk of bias in clinical study reports.
The current Cochrane risk of bias tool is not adequate for the task as it does not reliably identify all types of important biases, and nor does it organise and check the coherence of large amounts of information. This may have impacted our findings.
The custom data extraction sheet we have developed is for use with clinical study reports, and may not apply to non-industry trials where clinical study reports usually do not exist.
Introduction
The risk of bias tool in Cochrane reviews of randomised trials is routinely used to assess essential items pertaining to validity of trial design such as random sequence generation, allocation concealment, attrition and performance biases. There are six standard bias elements, each rated at a ‘high’, ‘low’ or ‘unclear’ risk of bias.
As Cochrane reviews are typically based on synthesising studies based on reports published in the scientific literature, the risk of bias tool is traditionally applied to journal publications. To the best of our knowledge, the ways in which risk of bias judgements change when they are based on more detailed reports of trials, such as those contained in clinical study reports, have not been previously investigated.
Clinical study reports are considered the most exhaustive summaries of randomised controlled trials of pharmaceuticals. Clinical study reports are highly structured and detailed documents that follow an outline format agreed between regulators and manufacturers in 1995, described in the ICH E3 document.1 ,2 Recent transparency policies adopted by the European Medicines Agency (EMA),3 as well as announcements by some pharmaceutical companies to make clinical study reports more readily available, 4 ,5 suggest that clinical study reports may increasingly be incorporated into systematic reviews and other forms of evidence synthesis.
Although there is some variation in the structure and content of clinical study reports, they are usually composed of a core report of the trial and appendices. A core report (sections 1–15 of the ICH E3 document) is structured in the Introduction, Methods, Results and Discussion (IMRAD) style. The numerous appendices (section 16 of ICH E3) contain important supplementary data needed to understand and interpret the trial, its context and history.1 ,2 These appendices include such documents as the trial protocol, protocol amendments, statistical analysis plan, blank case report forms, certificates of analysis, randomisation lists and consent forms. For the purposes of this paper, the core report plus all its appendices will be known as the full clinical study report (see online supplementary appendix 1 for the table of contents of a typical oseltamivir clinical study report and http://dx.doi.org/10.5061/dryad.77471 for a free download of all the clinical study reports used in our review and featured in this paper. The core report was known as Module 1 in oseltamivir clinical study reports, and appendices were found in Modules 2–5). Core reports and full clinical study reports theoretically can help reduce uncertainty in judging risk of bias.
In 2012, we published an update of our Cochrane review of neuraminidase inhibitors which included a total of 32 oseltamivir trials.6 Unlike most Cochrane reviews, this review was based only on core reports,6 and risk of bias assessments were therefore based on each core report. Subsequently, in 2013, we obtained full clinical study reports from Roche and, as part of a further systematic review update, carried out new risk of bias assessments of the same trials based on the full clinical study reports.
Our overall aim was to investigate whether the level of detail contained in reports of trials affects judgements about risk of bias. We planned to achieve this by comparing documents which contain increasingly detailed information on each trial included in our review, namely journal publications, core reports and full clinical study reports. As well as using the standard Cochrane risk of bias tool, we developed an additional list of study elements that we wanted to extract in order to allow improved assessments of each trial's design and conduct and facilitate the organisation of large quantities of information now available to us.
In this report, we describe our use of these tools to address three specific questions:
Do core reports change the risk of bias evaluation compared to published papers?
Do full clinical study reports change the risk of bias evaluation compared to core reports?
Do full clinical study reports change the risk of bias evaluation compared to published papers?
Methods
Ten core reports (M76001; NV16871; WV15670; WV15671; WV15707; WV15730; WV15759/WV15871; WV15799; WV15812/WV15872; WV15819/WV15876/WV15978) were received in PDF files from Roche and EMA by 12 April 2011 (the date of time lock for our 2012 Cochrane review).6 The reporting of more than one trial in the same clinical study report was justified by Roche as a consequence of lower than expected participant recruitment due to low influenza circulation and consequently a need to pool studies.
The current Cochrane risk of bias tool consists of six domains; each may have more than one source of bias application, depending on the subject matter.7 Our applications were as follows: selection bias (random sequence generation and allocation concealment), performance bias (blinding of participants and personnel—all outcomes), detection bias (blinding of outcome assessment—all outcomes), attrition bias (influenza symptoms, complications and harms outcome data), reporting bias (selective reporting) and other bias. The identification of sources of other bias was left at the reviewers’ discretion.
Risk of bias assessments were performed following Cochrane methods 7 and published in 2012.6 In that review, risk of bias was assessed by an external reviewer on the basis of data extracted from core reports.
After 12 April 2011, we obtained the appendices of the clinical study reports included in our review. For most of the clinical study reports we requested, EMA had the protocol, protocol amendments, statistical analysis plan, blank case report forms and other appendices contained in what Roche terms the second ‘module’ of a full clinical study report (see online supplementary appendix 1). However, EMA did not possess—and therefore could not provide us with—full clinical study reports with the exception of trial WP16263.8 For approximately 3 years, Roche had repeatedly refused our requests for full clinical study reports.9
In April 2013 in the course of carrying out these new extractions, Roche changed its policy on access to data and pledged to share with us 77 full clinical study reports (http://www.bmj.com/tamiflu/roche). Fifteen clinical study reports containing 20 trials were included in the analysis of our current review.10 As we were already in possession of core reports and appendices such as the protocol and statistical analysis plan for the 14 trials in this analysis, the additional data for other clinical study reports provided by Roche do not concern this paper. In the clinical study reports, Roche redacted information that they judged to be of ‘legitimate commercial interest’ or present a risk of trial participant reidentification. The redactions did not impede our analyses of risk of bias.
On the basis of our growing familiarity with clinical study reports, we designed and piloted a data extraction sheet to record how our understanding of the trials changed in the light of the availability of the additional appendices. We realised that, in addition to the standard Cochrane risk of bias elements, we needed to organise the abundant material at our disposal and reconstruct a timeline of the trials. We used the Cochrane risk of bias tool 7 to appraise clinical study reports and a data extraction sheet for recording information relevant to this appraisal. We added the following elements to our extraction sheets: date of participant enrolment, unblinding of the trial, protocol for which we had the full text, protocol amendments, statistical analysis plan for which we have the full text (and its amendments), patient consent form, randomisation list and certificate of analysis. Timeline reconstruction allowed us to conceptualise the design and conduct of the trials and appreciate their role in the trial programme with their strengths and limitations. In addition, following a timeline allows a judgement to be made on the integrity and temporal sequence of the documents. The finalised extraction sheet is in online supplementary appendix 2.
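The timeline elements listed above can be pictured as one record per trial. The following is a minimal sketch in Python, not the extraction sheet itself (which is in online supplementary appendix 2): the class name, field names, helper function and example dates are all hypothetical and chosen only to illustrate how a reviewer might check whether a document predates participant enrolment.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ExtractionRecord:
    """One row of a hypothetical extraction sheet; field names are illustrative."""
    trial_id: str
    enrolment_start: Optional[date] = None   # date participant enrolment began
    unblinding: Optional[date] = None        # date the trial was unblinded
    protocol: Optional[date] = None          # date of the full-text protocol
    analysis_plan: Optional[date] = None     # date of the statistical analysis plan
    has_consent_form: bool = False
    has_randomisation_list: bool = False
    has_certificate_of_analysis: bool = False


def predates_enrolment(document_date, enrolment_start):
    """Return True or False when both dates are known, None when either is missing."""
    if document_date is None or enrolment_start is None:
        return None
    return document_date < enrolment_start


# Invented example values, not taken from any clinical study report.
record = ExtractionRecord(
    trial_id="EXAMPLE-001",
    enrolment_start=date(1998, 1, 5),
    protocol=date(1997, 11, 20),
    analysis_plan=date(1998, 6, 1),
)

print(predates_enrolment(record.protocol, record.enrolment_start))       # True
print(predates_enrolment(record.analysis_plan, record.enrolment_start))  # False
print(predates_enrolment(record.unblinding, record.enrolment_start))     # None
```

Returning None for a missing date keeps "cannot be determined" distinct from "did not predate enrolment", which matters when undated documents are themselves a finding.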
On the basis of access to the full clinical study reports, we carried out our final assessment of risk of bias. These were carried out by a single reviewer, checked by a second with final consensus reached through a face-to-face discussion among the entire group.
Since with full clinical study reports there should be no ambiguity, we only allowed ‘low’ or ‘high’ risk of bias judgements (ie, no ‘unclear’). We adopted the position that, unlike a publication which may have page limits, there was no reason why a full clinical study report should be missing details necessary for a third party to judge risk of bias. Therefore, when information that would have otherwise allowed us to judge a risk of bias as either ‘low’ or ‘high’ was missing, this would automatically be categorised as ‘high’ risk of bias. This decision to eliminate the ‘unclear’ option when assessing full clinical study reports was made following an initial assessment of the trials, which included ‘unclear’ judgements. On the basis of an earlier peer review of this paper, which suggested we analyse the data had we kept the ‘unclear’ category, we also carried out this post hoc analysis.
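The decision rule described in this paragraph can be written down compactly. The sketch below is only an illustration under our reading of the rule: the function name and the `allow_unclear` flag are hypothetical, with the flag standing in for the post hoc analysis in which the ‘unclear’ category was retained.

```python
VALID = {"low", "high", "unclear"}


def final_judgement(judgement, allow_unclear=False):
    """Map an initial judgement to the one recorded for a full clinical study report.

    judgement: 'low', 'high' or 'unclear'; None means the information was missing.
    allow_unclear: False for the main analysis, True for the post hoc analysis.
    """
    if judgement is not None and judgement not in VALID:
        raise ValueError(f"unknown judgement: {judgement!r}")
    if judgement in ("low", "high"):
        return judgement
    # Missing or ambiguous information: 'high' in the main analysis,
    # 'unclear' only when that category is retained.
    return "unclear" if allow_unclear else "high"


print(final_judgement("low"))                          # low
print(final_judgement("unclear"))                      # high
print(final_judgement(None))                           # high
print(final_judgement("unclear", allow_unclear=True))  # unclear
```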
To allow for a comparison of risk of bias judgements based on published reports of trials and risk of bias judgements based on clinical study reports (either core reports alone or full clinical study reports), we used our previous risk of bias judgements for the same trials in the relevant Cochrane reviews that had been based on publications.11 ,12
The extraction and adjudication methods used were the same as those used in our subsequent unified Cochrane review.6 We used descriptive methods to answer our three questions without the need for formal statistical analysis.
Ethics approval and patient consent were not necessary for this study.
Results
We could only compare risk of bias assessments between core reports and full clinical study reports for the following 14 trials (reported in 10 clinical study reports): M76001; NV16871; WV15670; WV15671; WV15707; WV15730; WV15759/WV15871; WV15799; WV15812/WV15872; WV15819/WV15876/WV15978 (figure 1 and table 1).
We could not carry out a comparison of risk of bias judgements of journal publications with core reports or full clinical study reports, because our assessments were largely based on secondary publications (notably, the Kaiser et al pooled analysis of 10 trials, 8 of which were unpublished13) rather than primary publications of the trials, and also utilised an outdated risk of bias tool. Hence, there were too few studies (3) for which we had distinct risk of bias judgements of primary journal publications (many studies for which we have clinical study reports were and remain unpublished, eg, 8 of the 13 trials in adults). In addition, the current Cochrane risk of bias tool was introduced after the production of our review of published articles, making the comparison, had we had the data to undertake it, more difficult to interpret and possibly unfair.
For the comparison of core and full clinical study reports, table 2 shows that no previous assessment of ‘high’ risk of bias was reclassified as ‘low’ or ‘unclear’ in the presence of more detailed information. Previous assessments of ‘low’ risk of bias were not uncommonly reclassified as ‘high’ bias in the subsequent assessment. While our assessments based on core reports were mostly classified as ‘low risk of bias’, they were reclassified in the opposite direction as ‘high’ risk of bias when our judgements were based on full clinical study reports (table 2).
A spreadsheet recording all individual risk of bias judgements is available online (see online supplemental file 1).
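For readers who want to see how the reclassification percentages are derived, the following Python sketch tabulates paired judgements (core report, full clinical study report) and computes the share reclassified. The function name is hypothetical. The two counts 34/62 and 28/42 are those reported in the abstract; the complementary counts (28 and 14) are inferred by subtraction on the assumption that only ‘low’ or ‘high’ was permitted in the main analysis. The number of previous ‘high’ judgements is not stated in the text and is therefore left out.

```python
from collections import Counter

# Keys are (judgement on core report, judgement on full clinical study report).
# 34 and 28 ('... -> high') come from the abstract; 28 and 14 ('... -> low')
# are inferred by subtraction from the stated totals of 62 and 42.
transitions = Counter({
    ("low", "high"): 34,
    ("low", "low"): 28,
    ("unclear", "high"): 28,
    ("unclear", "low"): 14,
})


def proportion_reclassified(table, before, after):
    """Share of judgements that started as `before` and ended as `after`."""
    total = sum(n for (b, _), n in table.items() if b == before)
    if total == 0:
        return None
    return table[(before, after)] / total


low_to_high = proportion_reclassified(transitions, "low", "high")
unclear_to_high = proportion_reclassified(transitions, "unclear", "high")

print(f"low -> high: {low_to_high:.0%}")          # low -> high: 55%
print(f"unclear -> high: {unclear_to_high:.0%}")  # unclear -> high: 67%
```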
Had we kept the ‘unclear’ risk of bias judgement option when assessing full clinical study reports, 10 we would have had 64 ‘unclear’ judgements (see sensitivity analysis in table 3). The breakdown of these 64 judgements into the various attributes is:
Attrition bias: symptoms (10); complications (9); safety (15). These are unclear because we do not know the impact of missing symptoms data, and the reports contained unclear definitions for secondary complications of influenza and a seemingly problematic decision tool for the alternative designation of events as either complications or harms, which we called ‘compliharms’ in our Cochrane review.
Other bias (13)—these are unclear due to the unknown effect of the dehydrocholic acid that is included in the placebo but not in the active treatment.
Performance bias (6)—these are unclear due to missing certificates of analysis describing the placebo appearance.
Selection bias (10)—these are unclear due to the missing or unclear randomisation lists, meaning we cannot confirm random sequence generation.
Detection bias (1)—this is unclear due to the unknown impact of different coloured placebo caps on outcome assessment.
See tables 3 and 4. Twenty-nine per cent of previously certain judgements (ie, ‘high’ or ‘low’ risk of bias) based on core reports became ‘unclear’ with full clinical study reports.
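As a quick arithmetic check, the breakdown listed above can be tallied to confirm that it accounts for all 64 ‘unclear’ judgements. This is a minimal Python sketch; the dictionary keys are simply labels for the attributes named in the list.

```python
# Counts copied from the breakdown of 'unclear' judgements in the text.
unclear_counts = {
    "attrition bias: symptoms": 10,
    "attrition bias: complications": 9,
    "attrition bias: safety": 15,
    "other bias": 13,
    "performance bias": 6,
    "selection bias": 10,
    "detection bias": 1,
}

total = sum(unclear_counts.values())
attrition = sum(n for name, n in unclear_counts.items()
                if name.startswith("attrition bias"))

print(total)      # 64
print(attrition)  # 34
```

On these figures, attrition bias accounts for just over half (34 of 64) of the ‘unclear’ judgements.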
An example of the kind of detail available in full clinical study reports, and the importance of the trial timeline in assessing the presence of bias, is the observation that of the clinical study reports for the 14 trials, only 1 contained a protocol which predated the beginning of participant enrolment, only 2 had statistical analysis plans which clearly predated participant enrolment and 3 had clearly dated protocol amendments. No clinical study report reported a clear date of unblinding. Completed extraction sheets with risk of bias comparisons and rationales are available on request from the corresponding author.
Discussion
We used the Cochrane six-item risk of bias instrument to assess bias from two different levels of detail of trial reports. Owing to the unrestricted access to full clinical study reports, we took the view that all information needed to judge risk of bias for each of the six domains of the Cochrane risk of bias should be present. When the information was not available, we judged the corresponding risk of bias element as being ‘high’. Therefore, the availability of full clinical study reports decreased the uncertainty and allowed clearer judgements to be made. Risk of bias previously assessed as ‘unclear’ based on core reports became a more certain ‘low’ or ‘high’ risk of bias. When the information was not available, our judgements changed because we found gaps in the availability of information and inconsistent information. Whether the full study reports represent an exhaustive and coherent source of trial narrative and data remains unclear.
Throughout our study, we were assessing two different types of material within the clinical study reports: those that were created or written prior to patient enrolment (eg, trial protocols), and those written after (eg, core reports).
This approach is not possible when assessing trials reported in journal publications, in which articles necessarily reflect post hoc reporting with a far more sparse level of detail. We suggest that when bias is so limiting as to make meta-analysis results unreliable, either it should not be carried out or a prominent explanation of its clear limitations should be included alongside the meta-analysis. We found the Cochrane risk of bias tool to be difficult to apply to clinical study reports. We think this is not because the tool was constructed to assess journal publications but because, as with all list-like instruments, its use lends itself to a checklist approach (in which each design item is sought and, if found, eliminated from the bias equation rather than being weighed with thought and consideration). Similarly, the extraction sheet we assembled needs to be applied with thought and consideration, an approach that does not lend itself to reviewing under time pressure. However, more focus should be devoted to bias itself and its effects rather than theoretical risk of bias. Many of the variables we found to be important when assessing the trial (eg, date of trial protocol, date of unblinding, date of participant enrolment) are simply not captured in the risk of bias tool when used in a routine way or to review publications. We were also often unsure how to judge the risk of bias when bias itself can actually or potentially be measured with reviewers’ access to full clinical study reports and individual participant data. If, for example, the original trial protocol is available, one can judge whether reporting bias occurred. Reviewers need not guess at bias (ie, make a judgement of ‘risk’) but can judge bias directly. However, even with individual participant data, some forms of bias, such as attrition bias, may still be difficult to quantify, and one can only judge the risk (ie, potential) of bias.
Therefore, access to detailed information and participant level data sometimes found in full clinical study reports provides an opportunity to consider both actual as well as risk of biases.
Box 1 shows examples of the types of information found in clinical study reports that led to risk of bias assessment changes. While the judgements of ‘low’ or ‘high’ risk of bias may imply certainty, particularly when based on the reading of a full clinical study report, we found ourselves often in lengthy debate and discussion over the proper level of risk of bias before arriving at a consensus. We found the risk of bias judgements themselves to carry a high level of subjectivity, in which different judgements can be justified in different ways. The real strength of the risk of bias tool appears not to be in the final judgements it enables, but rather in the process it helps facilitate: critical assessment of a clinical trial.
Examples of risk of bias assessment changes and other concerns
In trial WV15708, the risk of bias related to allocation concealment went from ‘Unclear’ based on core reports to ‘High’ risk of bias based on full clinical study reports because the full clinical study report did not report sufficient details about the method of allocation concealment.
In trial WV15707, the risk of bias related to random sequence generation went from ‘Unclear’ based on core reports to ‘High’ risk of bias based on full clinical study reports because a full description of the randomisation procedure was not provided.
Prophylaxis trials WV15673 and WV15697 are described as ‘identical’, but this could not be verified as we had only one protocol (and the protocol we did have was dated after the study's completion). In addition, the placebo event rates for influenza infection were very different between the two trials, and their pooling, combined with the redaction of centre numbers, prevented them from being individually added to a meta-analysis. Therefore, our assessment of the ‘Other’ risk of bias item changed from ‘unclear’ based on core reports to ‘high’ based on full clinical study reports.
In the treatment trials WV15819, WV15876 and WV15978, it was difficult to reconcile the total number of hospitalisations despite access to the full clinical study reports. One patient in the placebo arm who was hospitalised according to serious adverse event narratives does not appear in the hospitalisations table, and for a separate placebo patient who is listed in the serious adverse event narratives, no hospitalisation is described in this narrative, but the same patient was hospitalised according to the hospitalisations table. It was therefore unclear how many hospitalisations occurred in the trial, to whom and why.
In prophylaxis trials WV15673 and WV15697, bias was assessed as low for selective reporting because the intention-to-treat population was described and reported in a table. However, when the full clinical study report became available, we realised that the original protocol was missing.
Another aspect to emerge is that tools based on publications are designed to detect the presence, absence or uncertainty regarding elements in a very restricted number of places in the text. The availability of full clinical study reports allows reviewers to follow consistency across chapters and appendices, creating a need for far more interaction with the text. An example of this active engagement is the cross-checking of active principle and placebo batches used across trials and their connection with a visual description of their properties such as colour in a certificate of analysis. For example, once the presence of a differently coloured placebo capsule cap in trial WP16263 was identified through the clinical study report's certificate of analysis, its potential impact on blinding was captured in the Cochrane instrument. The interpretation of such a finding is difficult, as the colours of the active principle and placebo capsule caps are close (ivory and light yellow). However, publication-based or core report only based assessments would not have identified the potential differences in colour as the descriptions are simply given as ‘placebo’14 and ‘matching placebo’,15 respectively. Reviewing the complete clinical study reports and assessing bias was very time consuming, necessitating prolonged exchanges including a face-to-face meeting given the novelty of what we were doing. Even so, this activity was not as difficult or as time consuming as the reconstruction of trial evidence programmes for oseltamivir, an activity which necessitated a whole time equivalent researcher for 6 months. Nevertheless, owing to the threat of reporting bias, we can think of no alternative to the use of full clinical study reports.
The main limitation of our study is our relative inexperience in dealing with large quantities of information and our lack of familiarity with certain trial documents such as randomisation lists. Randomisation lists appeared to be of two types. The first was a prerandomisation list of random codes in which participants’ IDs cannot be matched with the participant IDs used within other sections of the clinical study report. The second was a post hoc randomisation list to which individual participants can be matched, but the original generated codes are not shown. In both cases, the truly random generation of the sequence could not be properly assessed because either the original codes are not provided or they cannot be matched to patients. Another limitation of our study is that the instrument we have developed is for use with clinical study reports, and may not apply to non-industry trials (which may not have a clinical study report).
The background to our use of clinical study reports was our mistrust of journal publications of oseltamivir trials. Many trials were unpublished, and of those published, we found and documented examples of reporting bias. At least one trial publication was drafted by an unnamed medical writer. As evidence of reporting bias in industry trial publication mounts, 8 ,16–21 we believe Cochrane reviews should increasingly rely on clinical study reports as the basic unit of analysis. Sponsors and researchers both have a responsibility to make all efforts to make full clinical study reports publicly available. The systematic evaluation of bias or risk of bias remains an essential aspect of evidence synthesis, as it forces reviewers to critically examine trials. However, the current Cochrane risk of bias tool does not sufficiently identify possible faults with study design, and nor does it help to organise and check the coherence of large amounts of information that are found in clinical study reports. Our experience suggests that more detailed extraction sheets that prompt reviewers to consider additional aspects of study may be needed. Until a more appropriate guide is developed, we offer our custom extraction sheets to Cochrane reviewers and others interested in assessing risk of bias using clinical study reports and encourage further development.
Acknowledgments
The authors thank Toby Lasserson for providing advice and an independent check of our risk of bias judgements.
References
Supplementary materials
Supplementary Data
This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
Files in this Data Supplement:
- Data supplement 1 - Online supplement
- Data supplement 2 - Online appendix
Footnotes
Contributors TJ, MAJ, CJH and PD designed the custom data extraction sheet. All authors extracted the data as described and interpreted it. MAJ carried out statistical analyses. TJ wrote the first draft of the manuscript and all authors contributed to subsequent drafts. All authors were also involved in the design of the study.
Funding This project was funded by the NIHR Health Technology Assessment programme and will be published in full in the Health Technology Assessment journal series.
Competing interests TJ receives royalties from his books published by Blackwells and Il Pensiero Scientifico Editore, Rome. He is occasionally interviewed by market research companies for anonymous interviews about Phase 1 or 2 pharmaceutical products. In 2011–2013, TJ acted as an expert witness in a litigation case related to an antiviral (oseltamivir phosphate; Tamiflu (Roche)) and in a labour case on influenza vaccines in healthcare workers in Canada. In 1997–1999, he acted as consultant for Roche, in 2001–2002 for GSK and in 2003 for Sanofi-Synthelabo for pleconaril (an antirhinoviral which did not get approval from FDA). TJ was a consultant for IMS Health in 2013 and is currently retained as a scientific advisor to a legal team acting on the drug Tamiflu (oseltamivir, Roche). He recently had part of his expenses reimbursed for attending the annual (UK) Pharmaceutical Statisticians’ Conference. PD received €1500 from the European Respiratory Society in support of his travel to the society's September 2012 annual congress in Vienna, where he gave an invited talk on oseltamivir. He is an associate editor at The BMJ. CBDM was a Board member of two companies to commercialise research at Bond University, part of his responsibilities as Pro-Vice Chancellor (Research) until 2010, and receives fees for editorial and guideline developmental work and royalties from books, and is in receipt of institutional grants from NHMRC (Aus), NIHR (UK) and HTA (UK) and from a private donor (for support of the editorial base of the Cochrane ARI Group). RH receives royalties from two books published in 2008 titled ‘Tamiflu: harmful as was afraid’ and ‘In order to escape from drug-induced encephalopathy’. RH provided scientific opinions and expert testimony on 11 adverse reaction cases related to oseltamivir and gefitinib. CJH is provided financial support by The National Institute of Health Research (NIHR) School of Primary Care Research (SPCR).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The source core reports and clinical study reports can be found at http://datadryad.org/resource/doi:10.5061/dryad.77471. A spreadsheet recording all individual risk of bias judgements is available in an online supplemental file to this paper.