Modeling length of stay in hospital and other right skewed data: comparison of phase-type, gamma and log-normal distributions

Value Health. 2009 Mar-Apr;12(2):309-14. doi: 10.1111/j.1524-4733.2008.00421.x.

Abstract

Objectives: To present a relatively novel method for modeling length-of-stay data and assess the role of covariates, some of which are related to adverse events. To undertake critical comparisons with alternative models based on the gamma and log-normal distributions. To demonstrate the effect of poorly fitting models on decision-making.

Methods: The model organizes a hospital stay into Markov phases (states) that describe time spent in hospital before discharge, which is represented as an absorbing state. Admission is via state 1; discharge from this first state corresponds to a short stay, and transitions to later states correspond to longer stays. The resulting phase-type probability distributions provide a flexible modeling framework for length-of-stay data, which are known to be awkward and difficult to fit with other distributions.
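The phase structure described in the Methods can be illustrated with a minimal sketch. It assumes a Coxian-type layout, which the abstract does not name explicitly: each phase has an exit (discharge) rate and a forward rate to the next phase, and the last phase can only discharge. The three phases and all rate values below are hypothetical, chosen for illustration; the fitted model in the paper has six phases and its parameters are not given here.

```python
import random

def coxian_mean(exit_rates, forward_rates):
    """Analytic mean time to absorption (discharge) for a Coxian-type chain."""
    mean = 0.0
    reach_prob = 1.0  # probability of ever entering the current phase
    for mu, lam in zip(exit_rates, forward_rates):
        total = mu + lam
        mean += reach_prob / total   # expected time spent in this phase
        reach_prob *= lam / total    # probability of moving on rather than discharging
    return mean

def sample_coxian(exit_rates, forward_rates, rng):
    """Draw one length of stay by walking through the phases."""
    t = 0.0
    for mu, lam in zip(exit_rates, forward_rates):
        total = mu + lam
        t += rng.expovariate(total)       # holding time in this phase
        if rng.random() < mu / total:     # discharge (absorption) from this phase
            break
    return t

# Hypothetical rates per day; the final forward rate is 0 so the last phase always discharges.
EXIT_RATES = [0.5, 0.2, 0.05]
FORWARD_RATES = [0.3, 0.1, 0.0]

rng = random.Random(0)
stays = [sample_coxian(EXIT_RATES, FORWARD_RATES, rng) for _ in range(20000)]
print("analytic mean:", coxian_mean(EXIT_RATES, FORWARD_RATES))
print("simulated mean:", sum(stays) / len(stays))
```

With these illustrative rates most simulated stays end in the first phase, while a small fraction reach the slow final phase and produce the long right tail typical of length-of-stay data.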

Results: The dataset consisted of 1901 patients' lengths of stay and values for a number of covariates. The fitted model comprised six Markov phases, and provided a good fit to the data. Alternative gamma and log-normal models did not fit as well, gave different coefficient estimates, and statistical significance of covariate effects differed between the models.
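As a rough illustration of how the alternative gamma and log-normal models might be compared on goodness of fit, the sketch below fits both to synthetic right-skewed data and reports an AIC for each. Several things here are assumptions rather than the paper's method: the abstract does not state which fit criterion was used, the data are simulated and not the study dataset, no covariates are included, and the gamma shape is obtained from a standard closed-form approximation to the maximum-likelihood estimate instead of an exact solver.

```python
import math
import random

def lognormal_fit(xs):
    """Closed-form MLE for a log-normal: mean and SD of log(x)."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def lognormal_loglik(xs, mu, sigma):
    return sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )

def gamma_fit(xs):
    """Approximate MLE for a gamma (shape k, scale theta).

    The shape uses a closed-form approximation; the scale is mean / k.
    """
    n = len(xs)
    mean = sum(xs) / n
    s = math.log(mean) - sum(math.log(x) for x in xs) / n
    k = (3 - s + math.sqrt((s - 3) ** 2 + 24 * s)) / (12 * s)
    return k, mean / k

def gamma_loglik(xs, k, theta):
    return sum(
        (k - 1) * math.log(x) - x / theta - k * math.log(theta) - math.lgamma(k)
        for x in xs
    )

def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik

# Synthetic lengths of stay (hypothetical parameters, not the study data).
rng = random.Random(0)
stays = [rng.lognormvariate(1.0, 0.8) for _ in range(2000)]

mu, sigma = lognormal_fit(stays)
k, theta = gamma_fit(stays)
print("log-normal AIC:", aic(lognormal_loglik(stays, mu, sigma), 2))
print("gamma AIC:     ", aic(gamma_loglik(stays, k, theta), 2))
```

A phase-type model could be scored in the same way once its log-likelihood is available, with the parameter count reflecting the number of phase rates.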

Conclusions: Models that fit the data well should generally be preferred over those that do not, as they will produce more statistically reliable coefficient estimates. Poor coefficient estimates may mislead decision-makers by either understating or overstating the cost of some event or the cost savings from preventing that event. There is no obvious way of identifying a priori when coefficient estimates from poorly fitting models might be misleading.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Decision Making
  • Humans
  • Length of Stay / statistics & numerical data*
  • Markov Chains
  • Models, Statistical*
  • Multivariate Analysis
  • Normal Distribution
  • Patient Discharge / statistics & numerical data
  • Queensland
  • Statistical Distributions*
  • Statistics as Topic