Simplifying a prognostic model: a simulation study based on clinical data

Gareth Ambler; Anthony R Brady; Patrick Royston

doi:10.1002/sim.1422

Stat Med. 2002 Dec 30;21(24):3803-22. doi: 10.1002/sim.1422.

Authors

Gareth Ambler¹, Anthony R Brady, Patrick Royston

Affiliation

¹ Department of Statistical Science, University College, 1-19 Torrington Place, London WC1E 7HB, UK. g.ambler@ucl.ac.uk

PMID: 12483768

Abstract

Prognostic models are designed to predict a clinical outcome in individuals or groups of individuals with a particular disease or condition. To avoid bias, many researchers advocate the use of full models developed by prespecifying predictors. Variable selection is not employed, and the resulting models may be large and complicated. In practice, more parsimonious models that retain most of the prognostic information may be preferred. We investigate the effect on various performance measures, including mean square error and prognostic classification, of three methods for estimating full models (including penalized estimation and Tibshirani's lasso) and consider two methods (backwards elimination and a new proposal called stepdown) for simplifying full models. Simulation studies based on two medical data sets suggest that simplified models can be found that perform nearly as well as, or sometimes even better than, full models. Optimizing the Akaike information criterion appears to be appropriate for choosing the degree of simplification.
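
As a rough illustration of simplifying a full model by backwards elimination and using the Akaike information criterion (AIC) to choose the degree of simplification, the following sketch may help. It is not the authors' procedure and does not implement their stepdown proposal. It assumes an ordinary linear model with Gaussian errors in place of the survival models studied in the paper, uses simulated rather than clinical data, and the function names `gaussian_aic` and `backward_eliminate` are hypothetical.

```python
import numpy as np


def gaussian_aic(X, y, cols):
    """AIC, up to an additive constant, of a least-squares fit with an intercept.

    Assumption: Gaussian errors, so AIC = n * log(RSS / n) + 2 * k,
    where k counts the intercept plus the predictors listed in `cols`.
    """
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def backward_eliminate(X, y):
    """Drop one predictor at a time, starting from the full model.

    At each step the predictor whose removal gives the lowest AIC is dropped.
    The whole path down to the intercept-only model is explored, and the
    subset with the smallest AIC seen along that path is returned.
    """
    active = list(range(X.shape[1]))
    best_aic = gaussian_aic(X, y, active)
    best_set = list(active)
    while active:
        candidates = [
            (gaussian_aic(X, y, [j for j in active if j != d]), d)
            for d in active
        ]
        aic, drop = min(candidates)
        active.remove(drop)
        if aic < best_aic:
            best_aic, best_set = aic, list(active)
    return best_set, best_aic


# Simulated data (an assumption for illustration): six prespecified
# predictors, of which only the first two affect the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=200)

full_aic = gaussian_aic(X, y, list(range(6)))
kept, simplified_aic = backward_eliminate(X, y)
print("predictors kept:", sorted(kept))
print("AIC of full model:", round(full_aic, 2))
print("AIC of simplified model:", round(simplified_aic, 2))
```

Because the search starts from the full model, the selected subset can never have a higher AIC than the full model; whether it also predicts as well on new data is the kind of question the simulation studies in the paper address.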

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Aortic Aneurysm, Abdominal / mortality
Aortic Aneurysm, Abdominal / surgery
Breast Neoplasms / mortality
Disease-Free Survival
Female
Humans
Likelihood Functions*
Models, Biological*
Risk Factors
Survival Analysis