Inflation of the type I error rate when a continuous confounding variable is categorized in logistic regression analyses

Stat Med. 2004 Apr 15;23(7):1159-78. doi: 10.1002/sim.1687.

Abstract

This paper demonstrates that the type I error rate is inflated when testing the statistical significance of a continuous risk factor after adjusting for a correlated continuous confounding variable that has been divided into categories. We used Monte Carlo simulation methods to assess this inflation. We found that the inflation of the type I error rate increases with sample size, with the correlation between the risk factor and the confounding variable, and as the number of categories into which the confounder is divided decreases. Even when the confounder was divided into a five-level categorical variable, the inflation of the type I error rate remained high when both the sample size and the correlation between the risk factor and the confounder were high.
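The mechanism described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the sample size, the correlation `rho`, the confounder effect `beta_z`, the cut point at zero, the use of a Wald test, and all function names are assumptions chosen for illustration. The outcome depends only on the confounder `z`, so the risk factor `x` has no true effect, and any rejection of its null hypothesis is a type I error. The sketch compares adjustment for `z` as a continuous variable with adjustment for `z` dichotomized into two categories.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(r) + [float(i == j) for j in range(n)] for i, r in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def fit_logistic(X, y, iters=25):
    """Fit logistic regression by Newton-Raphson.

    Returns (coefficients, standard errors). The standard errors come from
    the inverse information matrix evaluated at the last iterate before the
    final (negligible) step.
    """
    p = len(X[0])
    beta = [0.0] * p
    cov = None
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, row))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += w * row[j] * row[k]
        cov = invert(hess)
        step = [sum(cov[j][k] * grad[k] for k in range(p)) for j in range(p)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta, [math.sqrt(cov[j][j]) for j in range(p)]


def rejection_rates(n=500, rho=0.8, beta_z=1.0, reps=100, seed=1):
    """Empirical type I error rate for x under two ways of adjusting for z.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    crit = 1.959964  # two-sided 5% critical value of the standard normal
    rejects = {"continuous": 0, "dichotomized": 0}
    for _ in range(reps):
        z = [rng.gauss(0, 1) for _ in range(n)]
        # x is correlated with z (correlation rho) but has no effect on y.
        x = [rho * zi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for zi in z]
        y = [1 if rng.random() < 1 / (1 + math.exp(-beta_z * zi)) else 0
             for zi in z]
        designs = {
            "continuous": [[1.0, xi, zi] for xi, zi in zip(x, z)],
            # Assumed cut point: zero, the population median of z.
            "dichotomized": [[1.0, xi, float(zi > 0)] for xi, zi in zip(x, z)],
        }
        for name, X in designs.items():
            beta, se = fit_logistic(X, y)
            if abs(beta[1] / se[1]) > crit:  # Wald test on the x coefficient
                rejects[name] += 1
    return {k: v / reps for k, v in rejects.items()}


if __name__ == "__main__":
    print(rejection_rates())
```

Under these assumed settings, the rejection rate with continuous adjustment should sit near the nominal 5% level, while the rate with the dichotomized confounder should be far higher, because `x` remains correlated with `z` within each category and absorbs the residual confounding. Varying `n`, `rho`, or the number of categories in this sketch is one way to explore the trends the abstract reports.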

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cohort Studies
  • Computer Simulation
  • Confounding Factors, Epidemiologic*
  • Data Interpretation, Statistical*
  • Humans
  • Logistic Models*
  • Monte Carlo Method
  • Risk Factors