Tutorial in biostatistics: data-driven subgroup identification and analysis in clinical trials

Ilya Lipkovich; Alex Dmitrienko; Ralph B D'Agostino Sr

doi:10.1002/sim.7064

Stat Med. 2017 Jan 15;36(1):136-196. doi: 10.1002/sim.7064. Epub 2016 Aug 3.

Authors

Ilya Lipkovich¹, Alex Dmitrienko², Ralph B D'Agostino Sr³

Affiliations

¹ Quintiles, Inc., Durham, NC, U.S.A.
² Mediana, Inc., Overland Park, KS, U.S.A.
³ Boston University, Boston, MA, U.S.A.

PMID: 27488683

Abstract

It is well known that both the direction and magnitude of the treatment effect in clinical trials are often affected by baseline patient characteristics (generally referred to as biomarkers). Characterization of treatment effect heterogeneity plays a central role in the field of personalized medicine and facilitates the development of tailored therapies. This tutorial focuses on a general class of problems arising in data-driven subgroup analysis, namely, identification of biomarkers with strong predictive properties and patient subgroups with desirable characteristics such as improved benefit and/or safety. Limitations of ad-hoc approaches to biomarker exploration and subgroup identification in clinical trials are discussed, and the ad-hoc approaches are contrasted with principled approaches to exploratory subgroup analysis based on recent advances in machine learning and data mining. A general framework for evaluating predictive biomarkers and identification of associated subgroups is introduced. The tutorial provides a review of a broad class of statistical methods used in subgroup discovery, including global outcome modeling methods, global treatment effect modeling methods, optimal treatment regimes, and local modeling methods. Commonly used subgroup identification methods are illustrated using two case studies based on clinical trials with binary and survival endpoints. Copyright © 2016 John Wiley & Sons, Ltd.

Keywords: biomarker analysis; clinical trials; data mining; exploratory subgroup analysis; multiplicity control.

MeSH terms

Biomarkers / analysis*
Biostatistics*
Clinical Trials as Topic / statistics & numerical data*
Data Mining
Humans
Precision Medicine
Research Design*

Substances

Biomarkers