A comparison of multivariable regression models to analyse cost data

J Eval Clin Pract. 2006 Feb;12(1):76-86. doi: 10.1111/j.1365-2753.2006.00610.x.

Abstract

Rationale, aims and objectives: Analysis of cost data is important in providing reliable information to aid budgeting decisions. Cost data are difficult to analyse because their distribution is typically highly skewed and because arithmetic mean costs must be estimated to allow inferences about total costs. Multivariable regression analysis is useful for estimating the influence of explanatory variables on cost in order to predict the costs of future patients. It also allows control for variables that influence cost but whose distributions differ between comparison groups. This is especially important in observational studies, where there may be no control over the balance of characteristics between the comparison groups.
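The point about arithmetic means and total costs can be illustrated with a minimal sketch. It is not taken from the paper: the costs are synthetic log-normal draws, and the function name `simulate_costs` and its parameters are assumptions made for illustration only. It shows that, for skewed costs, only the arithmetic mean multiplied by the number of patients recovers the total cost, while the median and the back-transformed mean of log costs (the geometric mean) fall below it.

```python
import math
import random
import statistics


def simulate_costs(n, mu=7.0, sigma=1.2, seed=0):
    """Draw n synthetic, right-skewed costs (log-normal; illustrative only)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


costs = simulate_costs(5000)
n_patients = len(costs)

total_cost = sum(costs)
arithmetic_mean = statistics.fmean(costs)
median_cost = statistics.median(costs)
# Back-transforming the mean of log costs gives the geometric mean,
# not the arithmetic mean.
geometric_mean = math.exp(statistics.fmean([math.log(c) for c in costs]))

print(f"total cost:                  {total_cost:,.0f}")
print(f"arithmetic mean x patients:  {arithmetic_mean * n_patients:,.0f}")
print(f"median x patients:           {median_cost * n_patients:,.0f}")
print(f"geometric mean x patients:   {geometric_mean * n_patients:,.0f}")
```

In this sketch the median and geometric mean both understate the budget implied by the data, which is why a model that targets the arithmetic mean is needed when the aim is to infer total costs.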

Method: This paper compares the appropriateness of various multivariable models of cost data by examining regression diagnostics, using as an example data collected on costs incurred in the treatment of inflammatory bowel disease. The models compared are normal and bootstrapped multiple linear regression, median regression, a gamma model with the log link, and normal linear regression (NLR) of log costs.
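Two of the approaches named above, bootstrapped linear regression of costs and linear regression of log costs, can be sketched as follows. This is a hedged illustration rather than the paper's analysis: the data are synthetic, a single covariate (age) stands in for the multivariable case, the helper names `ols_fit`, `bootstrap_slope_ci` and `make_synthetic_patients` are hypothetical, and a percentile interval is assumed as the bootstrap method.

```python
import math
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_slope_ci(x, y, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for the slope, resampling patients."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        slopes.append(ols_fit([x[i] for i in idx], [y[i] for i in idx])[1])
    slopes.sort()
    low = slopes[int((alpha / 2) * n_boot)]
    high = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


def make_synthetic_patients(n=300, seed=42):
    """Synthetic ages and skewed costs (assumed log-normal; illustrative only)."""
    rng = random.Random(seed)
    ages = [rng.uniform(20, 80) for _ in range(n)]
    costs = [math.exp(6.0 + 0.02 * a) * rng.lognormvariate(0.0, 0.8) for a in ages]
    return ages, costs


ages, costs = make_synthetic_patients()

# Linear regression of untransformed costs: the slope is the change in
# arithmetic mean cost per year of age.
intercept, slope = ols_fit(ages, costs)
ci_low, ci_high = bootstrap_slope_ci(ages, costs)

# Linear regression of log costs: the slope is on the log scale, so
# exp(slope) is a ratio of geometric mean costs per year of age.
log_intercept, log_slope = ols_fit(ages, [math.log(c) for c in costs])

print(f"cost slope per year: {slope:.1f} (bootstrap 95% CI {ci_low:.1f} to {ci_high:.1f})")
print(f"log-cost slope per year: {log_slope:.4f} (ratio {math.exp(log_slope):.3f})")
```

The two slopes answer different questions: the first is on the cost scale and relates directly to arithmetic means, while the second describes multiplicative change in geometric means and needs a retransformation step before it says anything about total costs.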

Results: Gamma modelling with the log link was found to be the most suitable model.

Conclusions: Bootstrapping was found to make very little difference to conclusions from the NLR model.

Publication types

  • Comparative Study

MeSH terms

  • Health Care Costs / statistics & numerical data*
  • Humans
  • Inflammatory Bowel Diseases / economics
  • Multivariate Analysis*
  • Regression Analysis*