The statistical significance of randomized controlled trial results is frequently fragile: a case for a Fragility Index

J Clin Epidemiol. 2014 Jun;67(6):622-8. doi: 10.1016/j.jclinepi.2013.10.019. Epub 2014 Feb 5.

Abstract

Objectives: A P-value <0.05 is one metric used to evaluate the results of a randomized controlled trial (RCT). We wondered how often statistically significant results in RCTs may be lost with small changes in the numbers of outcomes.

Study design and setting: We reviewed RCTs in high-impact medical journals that reported a statistically significant result for at least one dichotomous or time-to-event outcome in the abstract. In the group with the smallest number of events, we changed the status of patients from no event to event, one patient at a time, until the P-value exceeded 0.05. We labeled the number of patients changed the Fragility Index; smaller numbers indicated a more fragile result.
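The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The abstract does not name the significance test, so a two-sided Fisher's exact test is assumed here. The function names `fisher_exact_two_sided` and `fragility_index`, the `alpha` parameter, and the choice to treat a P-value of exactly 0.05 as non-significant are likewise assumptions made for the example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (the small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Count the non-event -> event changes needed to lose significance.

    Returns 0 if the result is already non-significant under the assumed test.
    """
    # Modify the group with fewer events, as the abstract describes.
    if events_1 > events_2:
        events_1, n_1, events_2, n_2 = events_2, n_2, events_1, n_1

    changes = 0
    p = fisher_exact_two_sided(events_1, n_1 - events_1, events_2, n_2 - events_2)
    # Stop when significance is lost or no non-event patients remain.
    while p < alpha and events_1 < n_1:
        events_1 += 1  # one patient changes from "no event" to "event"
        changes += 1
        p = fisher_exact_two_sided(events_1, n_1 - events_1, events_2, n_2 - events_2)
    return changes
```

A call such as `fragility_index(0, 10, 10, 10)` takes the event count and group size for each arm. Because the result depends on the test chosen, swapping in a chi-square test or a different definition of the two-sided P-value may give a different count for the same trial.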

Results: The 399 eligible trials had a median sample size of 682 patients (range: 15-112,604) and a median of 112 events (range: 8-5,142); 53% reported a P-value <0.01. The median Fragility Index was 8 (range: 0-109); 25% had a Fragility Index of 3 or less. In 53% of trials, the Fragility Index was less than the number of patients lost to follow-up.

Conclusion: The statistically significant results of many RCTs hinge on small numbers of events. The Fragility Index complements the P-value and helps identify less robust results.

Keywords: Lost to follow-up; Randomized controlled trials; Research methodology.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Data Interpretation, Statistical*
  • Humans
  • Lost to Follow-Up
  • Randomized Controlled Trials as Topic*
  • Sample Size