There is considerable ongoing debate questioning the practical usefulness of a priori power calculations, suggesting that “underpowered” studies are not unethical and that little scientific projection is still better than no projection at all [1-4]. Some authors argue that “being underpowered is unethical” is a “widespread misconception which is only plausible when presented in vague, qualitative terms but does not hold when examined in detail” [1, 2]. Further review of the arguments reveals that the crucial assumptions implied in this reasoning do not reflect actual scientific practice. The main theoretical arguments assume a perfect “frequentist world” in which one big trial could be substituted by a corresponding number of small trials that would, once aggregated in a formal evidence synthesis (i.e. a meta-analysis), accumulate the same information as the big one [2, 4]. If the individual studies are non-representative samples of the target population, however, the practical value of estimating a pooled effect that is a weighted average of potentially disparate effects in different subpopulations is questionable.
A widely considered answer to the threat of effect heterogeneity in meta-analyses is the random-effects confidence interval, which is often assumed to better reflect variation in effects across subpopulations than its fixed-effect counterpart. However, while such intervals offer a valid solution to inference regarding the average effect across all contributing effects, they still suffer from the principal limitation of effect estimates based on non-representative samples: the location and width of these confidence intervals will ultimately depend on the representation of subpopulations and therefore on the selection mechanisms inherent to the data.
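To make the contrast concrete, the following sketch pools study-level effect estimates under both models. It uses the DerSimonian-Laird moment estimator for the between-study variance, a common (though not the only) choice; the function name and interface are illustrative, not taken from any particular software package.

```python
def meta_analysis(effects, variances):
    """Fixed-effect and DerSimonian-Laird random-effects pooling.

    effects   -- per-study effect estimates (e.g. mean differences)
    variances -- per-study sampling variances
    Returns (fixed_mean, random_mean, tau2), where tau2 is the
    estimated between-study variance.
    """
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q statistic and the DerSimonian-Laird tau^2 estimate
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    random = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return fixed, random, tau2
```

Note that both pooled means remain weighted averages: if some subpopulations are over- or under-represented among the contributing studies, no choice of weighting scheme can undo that selection.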
While these are well-known and widely debated limitations of most sample-based research studies, another fundamental interpretational issue applies to confidence intervals: they refer to the mean (or average) effect across subpopulations. In the context of meta-analyses, with a large enough number of studies, neither random- nor fixed-effects confidence intervals will cover the actual range of observed study-specific effect estimates. In other words, the intervals provide a precise estimate of a parameter that actually does not exist, as it represents a weighted average of an underlying set of parameters in homogeneous subpopulations.
In a recent plea for routinely presenting prediction intervals in meta-analysis [5], IntHout et al. promote reporting prediction intervals in addition to confidence intervals. Prediction intervals reflect the variation in treatment or exposure effects across different settings, and allow inference about the effect to be expected in a future setting or individual, such as the next patient a clinician intends to treat. In contrast to confidence intervals, prediction intervals do not shrink to zero width as the sample size increases, but cover a prespecified range of expected effects in the underlying population. The authors conclude that prediction intervals should be routinely reported to allow for more informative inferences in meta-analyses.
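The usual random-effects prediction interval adds the between-study variance to the squared standard error of the pooled mean before taking the quantile. A minimal sketch follows; for simplicity it uses a normal quantile, whereas IntHout et al. recommend a t quantile with k-2 degrees of freedom, which widens the interval when only a few studies contribute. The function name and inputs are illustrative.

```python
import math
from statistics import NormalDist

def prediction_interval(mu, se_mu, tau2, level=0.95):
    """Approximate prediction interval for the effect in a new setting.

    mu    -- random-effects pooled estimate
    se_mu -- its standard error
    tau2  -- estimated between-study variance
    Uses a normal quantile for simplicity; a t quantile with k-2
    degrees of freedom is preferable with few studies.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * math.sqrt(tau2 + se_mu ** 2)
    return mu - half, mu + half
```

Because tau2 does not vanish with growing sample size, the interval retains a floor width of roughly 2z·sqrt(tau2) even as the standard error of the mean shrinks to zero.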
We suggest that prediction intervals are not only meaningful in the context of meta-analyses but, as implied by the generally applicable concept of variance decomposition [6], may be relevant in a very similar way to the reporting of individual studies or trials.
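The variance decomposition referred to here is the law of total variance: the overall variance of an outcome splits into an average within-subgroup component and a between-subgroup component. A small self-contained sketch (illustrative names, population variances so the identity is exact):

```python
def variance_decomposition(groups):
    """Law of total variance across subgroups.

    groups -- list of lists of observations, one list per subgroup
    Returns (within, between, total), where within + between == total
    variance of the pooled data.
    """
    all_x = [x for g in groups for x in g]
    n = len(all_x)
    grand = sum(all_x) / n
    total = sum((x - grand) ** 2 for x in all_x) / n
    # average squared deviation from each subgroup's own mean
    within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    ) / n
    # weighted squared deviation of subgroup means from the grand mean
    between = sum(
        len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups
    ) / n
    return within, between, total
```

A confidence interval for the overall mean effect speaks only to the grand mean; the between-subgroup component, which a prediction interval is designed to capture, is simply discarded.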
The interest in subgroup analyses in individual studies is often not properly addressed at the analysis stage due to a general claim of “lack of power” that would arise in stratified analyses or in modelling approaches including interaction terms. As a result, a single point estimate is often reported along with a confidence interval that implies homogeneity of the effect across all known subgroups. Such subgroups do, in our view, constitute subpopulations similar to the subpopulations (studies) in meta-analyses. We therefore ask: why should we not consider reporting prediction intervals for single-study effect estimates based on pre-specified subgroups, such as strata used for randomization or purposive sampling in the context of clinical trials?
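As a hypothetical illustration of this suggestion (not an established procedure), one could treat pre-specified strata of a single trial like studies in a meta-analysis: estimate the between-stratum variance and report a prediction interval for the effect in a new stratum. The sketch below reuses the DerSimonian-Laird estimator and a normal approximation; all names are illustrative.

```python
import math
from statistics import NormalDist

def stratum_prediction_interval(effects, variances, level=0.95):
    """Prediction interval across pre-specified strata of one trial.

    effects   -- stratum-specific effect estimates
    variances -- their sampling variances
    Hypothetical sketch: strata are treated like studies in a
    random-effects meta-analysis (normal approximation).
    """
    w = [1.0 / v for v in variances]
    mu = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - mu) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)   # between-stratum variance
    w_re = [1.0 / (v + tau2) for v in variances]
    mu_re = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * math.sqrt(tau2 + se_re ** 2)
    return mu_re - half, mu_re + half
```

Unlike the single pooled confidence interval, such an interval would acknowledge the heterogeneity observed across the randomization strata rather than averaging it away.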
1. Bacchetti P, McCulloch CE, Segal MR. Being ‘underpowered’ does not make a study unethical. Statistics in Medicine 2011; 30:2785–2792.
2. Bacchetti P, Wolf LE, Segal MR, McCulloch CE. Ethics and sample size. American Journal of Epidemiology 2005; 161:105–110.
3. Bacchetti P, McCulloch CE, Segal MR. Simple, defensible sample sizes based on cost efficiency. Biometrics 2008; 64:577–585.
4. Edwards SJL, Lilford RJ, Braunholtz D, Jackson J. Why “underpowered” trials are not necessarily unethical. Lancet 1997; 350:804–807.
5. IntHout J, Ioannidis JP, Rovers MM, Goeman JJ. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open 2016; 6(7):e010247.
6. Weiss NA. A course in probability. Addison-Wesley; 2006.