Objective To estimate the proportion of systematic reviews that meet the optimal information size (OIS) and assess the impact heterogeneity and effect size have on the OIS estimate by type of outcome (eg, mortality, semiobjective or subjective).
Methods We carried out searches of Medline and Cochrane to retrieve meta-analyses published in systematic reviews from 2010 to 2012. We estimated the OIS using Trial Sequential Analysis software (TSA V.0.9) and based on several heterogeneity and effect size scenarios, stratifying by type of outcome (mortality/semiobjective/subjective) and by Cochrane/non-Cochrane reviews.
Results We included 137 meta-analyses out of 218 (63%) potential systematic reviews (one meta-analysis from each systematic review). Of these reviews, 83 (61%) were Cochrane and 54 (39%) non-Cochrane. The Cochrane reviews included a mean of 6.5 (SD 6.1) studies and the non-Cochrane reviews a mean of 13.2 (SD 10.2) studies. The mean number of patients was 2619.1 (SD 6245.8; median 586.0) for the Cochrane reviews and 19 888.5 (SD 32 925.7; median 6566.5) for the non-Cochrane reviews. The percentage of systematic reviews that achieved the OIS was 0% for Cochrane and 25% for non-Cochrane reviews for the all-cause mortality outcome; 17% for Cochrane and 46% for non-Cochrane reviews for semiobjective outcomes; and 45% for Cochrane and 72% for non-Cochrane reviews for subjective outcomes.
Conclusions The number of systematic reviews that meet an optimal information size is low and varies depending on the type of outcome and the type of publication. Less than half of primary outcomes synthesised in systematic reviews achieve the OIS, and therefore the conclusions are subject to substantial uncertainty.
- systematic review
- optimal information size
- trial sequential analysis
- effect size
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
To our knowledge, this is the first analysis to estimate the optimal information size by type of outcome.
This study includes only systematic reviews from the Cochrane library and the top five general medical journals; therefore, our results may not be generalisable to systematic reviews published in other journals.
The concept of optimum information size (OIS) was first proposed in 1998 by Pogue et al1 2 as ‘the minimum amount of information required in the collective literature for reliable conclusions about an intervention to be reached’. This OIS estimate is based on standard sample size calculations. For example, the required number of participants (information size) for a meta-analysis should match that required in an adequately powered single trial.3 Other measures of information size have been proposed4 5; however, the OIS involves a relatively simple calculation, which under some scenarios will underestimate the information required to define whether firm evidence has been reached to draw robust conclusions.6 Brok et al3 demonstrated, in a subset of Cochrane reviews, that many meta-analyses have false-positive results due to insufficient information, and Turner et al showed that most meta-analyses do not have sufficient power to identify even moderate effects.7 8
Sample size calculation and the OIS are influenced by several variables such as the control event rate (CER) (baseline risk), effect size, the power and the alpha value. Deciding on which values to use can be difficult and is typically based on values observed or estimated from the meta-analysis or one of the included studies. In addition, increased variation can also affect the estimate of the OIS, and there is currently no consensus about which value of heterogeneity should be used to calculate the OIS.
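As an illustration of how these variables enter the calculation, the following is a minimal sketch of a conventional two-arm sample size formula for a binary outcome, N = 4(z(1−α/2) + z(1−β))² p̄(1−p̄)/δ², where p̄ is the mean of the two event rates and δ their difference. The function name, the 1:1 allocation, the two-sided alpha of 0.05 and the power of 80% are assumptions of this sketch; the text does not state which alpha and power values were used, and this is not necessarily the exact routine implemented in the TSA software.

```python
from math import ceil
from statistics import NormalDist


def ois_two_proportions(cer, rrr, alpha=0.05, power=0.80):
    """Total participants (both arms, 1:1 allocation) for a binary outcome.

    cer   : control event rate (baseline risk), between 0 and 1
    rrr   : relative risk reduction (1 - RR); negative if RR > 1
    alpha : two-sided type I error (assumed default, not from the paper)
    power : 1 - type II error (assumed default, not from the paper)
    """
    if not 0 < cer < 1:
        raise ValueError("cer must lie strictly between 0 and 1")
    if rrr == 0:
        raise ValueError("rrr must be non-zero (no effect gives an infinite size)")
    p_ctrl = cer
    p_exp = cer * (1 - rrr)  # event rate in the experimental arm
    if not 0 < p_exp < 1:
        raise ValueError("implied experimental event rate is outside (0, 1)")
    p_bar = (p_ctrl + p_exp) / 2
    delta = p_ctrl - p_exp
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_total = 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / delta ** 2
    return ceil(n_total)


# Illustrative call with a CER of 23% and an RRR of 27%
print(ois_two_proportions(0.23, 0.27))
```

Under these assumed settings, a CER of 23% and an RRR of 27% give roughly 1300 participants. Lowering the CER or the RRR increases the required size sharply, because the squared risk difference sits in the denominator.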
The OIS can help determine the stability of an effect and whether treatment effect estimates are likely to differ based on further information. However, the parameters on which it depends are difficult to define in advance, and there is no consensus regarding the alpha (significance) or power value to be used at the outset.
It is therefore not currently known if evidence accumulation and its associated OIS depend on the type of outcome studied and if this varies by publication type (Cochrane or non-Cochrane review). Therefore, we set out to quantify this by studying systematic reviews (SRs) published in the Cochrane Library and the top five general medical journals and in the process describe the impact that observed variation in heterogeneity and effect size (relative risk reduction (RRR)) have on the OIS estimation.
We defined two sets of SRs to evaluate: Cochrane and non-Cochrane. We identified all Cochrane SRs published during 2010–2012 through the Archie Database (http://archie.cochrane.org), which contains all Cochrane published reviews and allows electronic searching. We randomly selected a total of 120 of these based on random numbers generated using Microsoft Excel for inclusion.
To search for non-Cochrane reviews, we identified all SRs with meta-analyses published in the top five general medical journals (The New England Journal of Medicine, The Lancet, The Journal of the American Medical Association, Annals of Internal Medicine and British Medical Journal) using the following search strategy in Medline (PubMed): ‘BMJ’[Journal] OR ‘Ann Intern Med’[Journal] OR ‘JAMA’[Journal] OR ‘Lancet’[Journal] OR ‘N Engl J Med’[Journal] AND (systematic review [ti] OR meta-analysis [pt] OR meta-analysis [ti]) restricted to SRs published during 2010–2012.
From all the selected Cochrane and non-Cochrane reviews, we included one meta-analysis from each. Based on the order in which the outcomes were reported (eg, outcome 1.1 for Cochrane SRs), we selected the first outcome presented in the meta-analysis that was based on binary data from two or more individual studies (clinical trials or randomised controlled trials). If the first outcome did not meet these inclusion criteria, we continued through the listed outcomes until one was identified or we had exhausted the list of outcomes reported (figure 1). Meta-analyses that included observational studies, that evaluated diagnostic interventions or that were based on network meta-analysis were excluded. Meta-analyses showing no effect (pooled effect=1) or meta-analyses with no events in all included trials were also excluded.
Full texts were obtained for those abstracts that met the inclusion criteria and assessed for eligibility. One reviewer (JMG-A) extracted the data, and a second reviewer (RP or NP) checked the data. We developed customised Excel spreadsheets for the data extraction process. From each included meta-analysis, we extracted and calculated the following items: outcome type as defined by Turner9 (‘all cause mortality’, ‘semi-objective’ (cause-specific mortality, major morbidity event) and ‘subjective’ (pain, mental health outcomes)), comparison, number of included patients, number of trials, number of events in each arm, CER, effect size and heterogeneity.
We extracted data from each trial and repeated the meta-analysis using random-effects models (DerSimonian and Laird) to account for potential heterogeneity of effects. Estimates for trials with only one group reporting zero events were adjusted with a constant continuity adjustment of 0.5 in each arm (default adjustment in Revman). The obtained estimates for the pooled effect (eg, RR) and I2 were compared with the published results to detect any relevant disagreement, and if required, the analyses were repeated to identify the source of the difference. Meta-analyses and calculation of the OIS were done using Trial Sequential Analysis software (TSA V.0.9)10 freely downloadable at www.ctu.dk/tsa. The TSA software allows meta-analysis of dichotomous or continuous data under fixed or random-effects models and has the option to estimate an information size and the stopping boundary. This estimation of the OIS is based on the alpha spending method (O’Brien Fleming and Lan-DeMets).
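The re-analysis step described above can be sketched as follows: a DerSimonian and Laird random-effects pooling of risk ratios on the log scale, with 0.5 added to every cell of the 2×2 table when exactly one arm has zero events. This is a simplified illustration rather than the TSA or RevMan implementation; the function name, the input layout and the decision to skip trials with no events in either arm are assumptions of this sketch.

```python
from math import exp, log, sqrt


def dersimonian_laird_rr(trials, correction=0.5):
    """Pool risk ratios with a DerSimonian and Laird random-effects model.

    trials: iterable of (events_exp, n_exp, events_ctrl, n_ctrl).
    Returns the pooled RR, an approximate 95% CI, tau^2 and I^2 (as a proportion).
    """
    log_rr, var = [], []
    for a, n1, c, n2 in trials:
        if a == 0 and c == 0:
            continue  # no events in either arm: skipped in this sketch
        if a == 0 or c == 0:
            # add 0.5 to every cell of the 2x2 table, so each arm total grows by 1
            a, c = a + correction, c + correction
            n1, n2 = n1 + 2 * correction, n2 + 2 * correction
        log_rr.append(log((a / n1) / (c / n2)))
        var.append(1 / a - 1 / n1 + 1 / c - 1 / n2)

    k = len(log_rr)
    if k < 2:
        raise ValueError("need at least two informative trials")

    # Fixed-effect (inverse variance) step, needed for Cochran's Q
    w = [1 / v for v in var]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))

    # Method-of-moments estimate of the between-trial variance
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c_term)
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    # Random-effects weights and pooled estimate
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = sqrt(1 / sum(w_re))
    return {
        "rr": exp(pooled),
        "ci_low": exp(pooled - 1.96 * se),
        "ci_high": exp(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical trial counts, for illustration only
result = dersimonian_laird_rr([(10, 100, 20, 100), (0, 50, 5, 50), (8, 120, 12, 118)])
print(result)
```

Comparing the `rr` and `i2` values from such a re-analysis with the published figures is one way to detect the kind of disagreement mentioned above.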
To evaluate the impact of changes in heterogeneity and effect size (RRR), we estimated the OIS under different scenarios. For heterogeneity, we analysed three values of heterogeneity: ‘heterogeneity=rep’ as that reported in the meta-analysis using a random-effects model (or obtained from fitting a random effects model if a fixed effect model was used originally), ‘heterogeneity=0’ and ‘heterogeneity=Q3’ (upper quartile or 75th percentile), which was determined based on estimates of predictive distributions published by Rhodes et al.11 These two estimates of ‘heterogeneity=Q3’ and ‘heterogeneity=0’ were chosen as extreme scenarios to evaluate the impact that this parameter has on the OIS. Consistent with Rhodes et al, the estimation of the OIS took into account the outcome type: ‘all cause mortality’, ‘semiobjective’ (cause-specific mortality and major morbidity event) and ‘subjective’ (pain and mental health outcomes) and, for simplicity, was based on assuming an average mean study size between 50 and 200 participants.
To evaluate the impact of effect size on the OIS, we used two different estimates of the effect size for the meta-analyses with mortality outcome: the RRR obtained in each meta-analysis as well as an a priori conservative value of 5% for the RRR as reported by Djulbegovic et al.12 For the transformation of the relative risk (RR) measure to RRR, we used the following formula: RRR=1−RR. If the RR was greater than 1, we used the RRR as a negative value. We did not determine an alternative estimate for the effect size for the other two outcomes (semiobjective and subjective) as the distribution of possible effects makes the choice of ‘average effect’ difficult to justify. We used only one value, per meta-analysis, for the baseline risk or CER. This was taken to be the median of the proportion of events in the included trials in each meta-analysis, following the method proposed by Hayden et al.13
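The two parameter derivations in this paragraph can be sketched in a few lines. The function names are hypothetical, the trial counts are invented for illustration, and the sketch assumes that the event proportions used for the CER are those of the control arms.

```python
from statistics import median


def rrr_from_rr(rr):
    """Relative risk reduction from a relative risk (negative when RR > 1)."""
    return 1.0 - rr


def median_cer(control_events, control_totals):
    """Median of the per-trial control-arm event proportions."""
    return median(e / n for e, n in zip(control_events, control_totals))


# Hypothetical values, for illustration only
print(rrr_from_rr(0.73))                             # an RR below 1 gives a positive RRR
print(rrr_from_rr(1.20))                             # an RR above 1 gives a negative RRR
print(median_cer([10, 20, 30], [100, 100, 100]))
```

Using the median rather than the mean keeps a single trial with an unusually high or low baseline risk from dominating the CER that feeds the OIS calculation.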
We used descriptive statistics and plots to quantify differences in CER, effect size and heterogeneity between Cochrane and non-Cochrane reviews, stratified by type of outcome (six groups in total). We also determined the proportion of reviews that have achieved the OIS based on reported results and our two extreme scenarios ‘heterogeneity=0’ and ‘heterogeneity=Q3’ comparing between Cochrane and non-Cochrane and again stratifying by type of outcome (‘all-cause mortality’, ‘semiobjective’ and ‘subjective’ outcomes).
The descriptive analysis of the characteristics of included meta-analyses was carried out using SPSS V.22 software.
Figure 2 presents a flow chart of the results. We excluded 11 Cochrane SRs due to no events reported in the included trials or due to only one study being included in the review. We included a total of 137 meta-analyses out of 218 (63%) potential SRs (figure 2): 83 (61%) were Cochrane SRs and 54 (39%) non-Cochrane SRs. The Cochrane reviews included a mean of 6.5 (SD 6.1) studies and the non-Cochrane reviews a mean of 13.2 (SD 10.2) studies. The mean number of patients was 2619.1 (SD 6245.8; median 586.0) for the Cochrane reviews and 19 888.5 (SD 32 925.7; median 6566.5) for the non-Cochrane reviews.
Scenarios under different parameter estimates
Table 1 provides results on the types of outcomes and type of intervention studied for the included meta-analyses by publication type.
Of the included meta-analyses, 26 (19%) used ‘all cause mortality’ as an outcome; 42 (31%) were based on ‘semiobjective’ outcomes and 69 (50%) on a ‘subjective’ outcome. The type of intervention was pharmacological in 59% of the meta-analyses. There were significant differences in the type of outcome reported by publication type (χ2; 2df=11.15, p=0.004) but not in the type of intervention reported (χ2; 1df=0.54, p=0.46).
The descriptive analysis of the different parameter estimates (CER, RRR and I2) used in the calculation of the OIS showed considerable variation depending on the type of outcome (table 2 and figure 3).
The number of included patients was higher in non-Cochrane reviews for all outcomes analysed (table 2). The CER for ‘all cause mortality’ had the lowest mean value and the distribution differed between outcome types. For RRR, the highest mean value and heterogeneity was observed for ‘subjective’ outcomes (table 2 and figure 3).
Meta-analyses that reached the OIS
Figure 4A presents the estimated OIS for each meta-analysis in the extreme scenario of no heterogeneity. All-cause mortality required the highest OIS for both types of reviews. However, this was only marginally higher than for ‘semiobjective’ outcomes. For ‘subjective’ outcomes, OIS estimates are considerably smaller due to higher CERs and RRRs. Figure 4B shows the number of meta-analyses that have achieved sample sizes equal to or higher than the estimated OIS, with more non-Cochrane reviews achieving this estimate (see figure 4C).
Estimation of the OIS based on reported heterogeneity shows that the necessary sample was only reduced for Cochrane SRs reporting subjective outcomes (table 3). Further increasing the level of heterogeneity (worst-case scenario: heterogeneity=Q3) did not substantially change the proportion of meta-analyses achieving the OIS.
When using a more stringent estimate for the effect size (5% RRR) for ‘all cause mortality’, none of the identified meta-analyses had achieved the necessary sample size to meet the OIS (0/14 Cochrane and 0/12 non-Cochrane). The box presents five examples of meta-analyses reporting ‘all cause mortality’ as an illustration of SRs where the OIS has been reached and where it has not.
Examples of five meta-analyses with ‘all cause mortality’ as the main outcome that do, or do not, achieve the optimum information size (OIS)
Meta-analyses that meet the OIS
Weng et al (Annals of Internal Medicine)25
This meta-analysis evaluated the use of a non-pharmacological intervention (non-invasive ventilation) to treat patients with acute cardiogenic pulmonary oedema including a total of 1369 patients with a control event rate (CER) of 23%, relative risk reduction (RRR) 27% and 0% heterogeneity. For this systematic review assuming a 0% heterogeneity, the OIS estimated was 1296 patients.
Gastric team (The Journal of the American Medical Association)26
This meta-analysis evaluated the use of adjuvant chemotherapy for resectable gastric cancer including a total of 3781 patients with a CER 69%, RRR 9% and 24% heterogeneity reported by the meta-analysis. For this systematic review assuming a 0% heterogeneity, the OIS estimated was 1828 patients.
NSCLC meta-analysis (The Lancet)27
This meta-analysis evaluated the use of adjuvant chemotherapy in patients with operable non-small cell lung cancer including a total of 8447 patients with a CER 49%, RRR 11% and 1% heterogeneity reported by the meta-analysis. For this systematic review assuming a 0% heterogeneity, the OIS estimated was 2686 patients.
Meta-analyses with a large number of included patients that do not meet the OIS
Adam et al (Annals of Internal Medicine)28
This meta-analysis evaluated the use of warfarin versus new oral anticoagulants for the management of atrial fibrillation and venous thromboembolism including a total of 14 143 patients with a CER of 2%, RRR 12% and 0% heterogeneity reported. For this systematic review assuming a 0% heterogeneity, the OIS estimated was 100 562 patients.
Rizos et al (The Journal of the American Medical Association)29
This meta-analysis evaluated the administration of omega-3 fatty acid supplementation and risk of major cardiovascular disease events including a total of 125 410 patients with a CER of 7%, RRR 4% and 1% heterogeneity reported. For this systematic review assuming a 0% heterogeneity, the OIS estimated was 255 912 patients.
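The comparison made in the box amounts to checking whether the number of included patients is at least the estimated OIS. A minimal sketch using the figures quoted in the box (0% heterogeneity scenario) is given below; the variable names are illustrative.

```python
# (label, included patients, estimated OIS assuming 0% heterogeneity), as quoted in the box
box_examples = [
    ("Weng et al", 1369, 1296),
    ("Gastric team", 3781, 1828),
    ("NSCLC meta-analysis", 8447, 2686),
    ("Adam et al", 14143, 100562),
    ("Rizos et al", 125410, 255912),
]

# A meta-analysis reaches the OIS when its accrued sample is at least the OIS
reached = [patients >= ois for _, patients, ois in box_examples]

for (label, patients, ois), ok in zip(box_examples, reached):
    share = 100 * patients / ois
    status = "meets" if ok else "does not meet"
    print(f"{label}: {patients} of {ois} required ({share:.0f}%), {status} the OIS")
```

The last two examples show that a large absolute number of patients is not sufficient when the CER or the RRR is small.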
Our results show that there is wide variability in the range of values that impact on the OIS calculation: effect size (RRR), heterogeneity (I2) and CER, regardless of source (Cochrane or non-Cochrane). This variability is partially explained by the type of outcome (‘all cause mortality’, ‘semiobjective’ or ‘subjective’) evaluated.
OIS estimates could therefore be obtained from different types of outcomes, as previously proposed by Turner et al9 and Rhodes et al.11 To our knowledge, this is the first time that accounting for the type of outcome in the estimation of the OIS has been proposed. We also found that the type of outcome impacts on the range of heterogeneity observed, which was particularly high for ‘subjective’ outcomes. One possible explanation for this is the higher number of smaller randomised controlled trials. Nevertheless, these differences were more marked in Cochrane reviews, while non-Cochrane reviews showed more similar levels of heterogeneity across all types of outcomes. The results show that, overall, fewer than half of recently published meta-analyses in high-quality journals achieved the OIS, and these meta-analyses therefore do not have appropriate statistical power to draw firm conclusions.
As expected, the estimation of the OIS assuming different levels of heterogeneity, and alpha values, showed a strong correlation. Although we used specialist software for the estimation of the OIS (TSA V.0.9), it is possible to estimate this value using any software that allows sample size estimation if the heterogeneity level is assumed to be zero. Incorporation of heterogeneity can be done using a simple adjustment proposed by Wetterslev.5 This author proposes the use of an alternative index named the diversity (D2) statistic as opposed to the I2 factor. However, there is currently no consensus on what measure of heterogeneity to adopt for the OIS.4 14
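The adjustment mentioned above is commonly written as dividing the unadjusted information size by one minus the heterogeneity measure (I2 or D2 expressed as a proportion). The sketch below assumes that form; the function name is hypothetical and the input values are illustrative.

```python
from math import ceil


def heterogeneity_adjusted_ois(ois, heterogeneity):
    """Inflate an unadjusted OIS by 1 / (1 - heterogeneity).

    heterogeneity: I2 or D2 as a proportion in [0, 1), eg 0.24 for 24%.
    """
    if not 0 <= heterogeneity < 1:
        raise ValueError("heterogeneity must be in [0, 1)")
    return ceil(ois / (1 - heterogeneity))


# Illustrative values: no heterogeneity leaves the OIS unchanged, 50% doubles it
print(heterogeneity_adjusted_ois(1000, 0.0))
print(heterogeneity_adjusted_ois(1000, 0.5))
```

Because the inflation factor grows without bound as the heterogeneity measure approaches 1, the choice between I2 and D2 matters most for highly heterogeneous meta-analyses.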
Published meta-analyses that estimate optimal information size often use one or more statistical assumptions, such as a RRR of 10%, or the median RRR of trials with low risk of bias.15–17 Our analysis shows that the median of the RRR is 20% for all pooled reviews. However, because the distribution of RRR varies by outcome type, in some cases optimal information size is underestimated, while in others it is overestimated.
There are several proposed statistics to define a ‘desirable sample size in terms of numbers of participants across all studies’.4 The OIS as described in this paper involves a relatively simple calculation, which if anything is likely to underestimate the information required to define whether firm evidence has been reached to draw robust conclusions.4 14 Therefore, we used this definition of OIS as a measure to estimate what proportion of SRs meet this minimum requirement.
We have focused exclusively on the calculation of a single threshold to define when/if a minimum level of evidence has been collected. However, retrospective analyses of meta-analytical results are more commonly used to inform prospective studies, for example, to determine the size of a new trial needed to answer definitively a question around efficacy. The use of trial sequential methods has been proposed to identify early signals of effect with monitoring boundaries being defined by frequentist, semi-Bayesian and fully Bayesian methods.4 18 19 Although there is still considerable uncertainty about the estimates and the best method to use, empirical studies have provided examples to suggest these methods could help detect signals early (benefit, harm or futility).8 20 Of note, the identification of the sample size required in a new study or studies will depend on the method used in the meta-analysis.14
Reviews conducted by the Cochrane collaboration are considered to be higher quality21 22 and of greater methodological rigour than meta-analyses published in paper-based journals. Our study only included meta-analyses from the top five medical journals, and therefore our results may not be applicable to other meta-analyses published in other journals. Nevertheless, this would bias our results towards better evidence being evaluated compared with what is currently being generated. Also, our results do not generalise to network meta-analyses, which is an area of evidence synthesis that has grown rapidly.23 A recently published study demonstrated that substantial variation exists in such network-based meta-analysis,24 and the statistical methodology to estimate the OIS in these meta-analyses is less developed than for traditional meta-analysis, hence our exclusion of these studies.
Implications for researchers and methodologists
This study has shown that the type of outcome can be used as a proxy for defining the basic parameters (CER, RRR and I2) required to estimate the OIS. Systematic reviewers can use these results to calculate an OIS value for their primary outcome independently of the confidence they have in the specific parameters obtained from their review. Therefore, we encourage reviewers to use the estimation of a sample size as a measure of the likely confidence in their results, particularly as >50% of the primary outcomes in recent SRs appear to fall below this minimum requirement, pointing to the need for further evidence to reduce uncertainty.
Heterogeneity and effect size impact on the estimation of the OIS. It is, however, possible to estimate the OIS using traditional sample size estimation software and, if necessary, to adjust for heterogeneity. Our results demonstrate that the type of outcome is relevant to the estimation of the OIS, as well as the heterogeneity, the CER and the RRR. Currently, fewer than half of published meta-analyses in high-quality journals have achieved the OIS, and therefore conclusions based on such results are subject to substantial uncertainty.
The research was supported by the NIHR Oxford Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. We wish to thank Pol Escarpenter for his support to adapt the figures to the requested format of the journal.
Contributors JMG-A and RP conceived and designed the study. JMG-A and NP extracted the data. JMG-A and RP analysed the data. JMG-A, RP, CB and CH interpreted the data. JMG-A and RP wrote the first draft. All authors contributed to the writing of the manuscript, revised the intellectual content and approved the version to be published.
Competing interests None declared.
Patient consent None.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All of the data used in this research are provided within this publication, its appendices and the publications referenced in the online supplementary appendices.