Collinearity diagnosis for a relative risk regression analysis: an application to assessment of diet-cancer relationship in epidemiological studies

Stat Med. 1992 Jul;11(10):1273-87. doi: 10.1002/sim.4780111003.

Abstract

In epidemiologic studies, two forms of collinear relationships among the intakes of major nutrients, namely high correlations and the relative homogeneity of the diet, can yield unstable and not easily interpreted regression estimates for the effect of diet on disease risk. This paper presents tools for assessing the magnitude and source of the corresponding collinear relationships among the estimated coefficients of relative risk regression models. I show how to extend three tools for diagnosing collinearity in standard regression models (condition indices, variance decomposition proportions, and variance inflation factors) to likelihood and partial likelihood estimation for logistic and proportional hazards models. This extension is based on the analogy between the role of the information matrix in such analyses and that of the cross-product matrix in the standard linear model. I apply the methodology to relative risk models that relate crude intakes (on the log scale) and nutrient densities to breast cancer cases in the NHANES-I follow-up study. The three diagnostic tools provide complementary evidence of strong collinearity in all models, due largely to the homogeneity of the population with respect to the risk scale for the crude intakes. The analysis suggests that the nonsignificant relative risks for the crude intakes in these models may be due to their involvement in collinear relationships, while the nonsignificant relative risks for the nutrient densities are far less affected by collinearity.
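
To make the idea concrete, the following is a minimal sketch of how the three diagnostics might be computed from the information matrix of a logistic model, in the same way they are usually computed from the cross-product matrix in linear regression. It is not the paper's implementation: the simulated data, the function names `fit_logistic` and `collinearity_diagnostics`, and the choice to scale the information matrix to unit diagonal are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit; returns coefficients and the information matrix X'WX."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)                      # logistic variance weights
        info = X.T @ (w[:, None] * X)          # information matrix at current beta
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    # info was evaluated at the last iterate before the final (tiny) update
    return beta, info

def collinearity_diagnostics(info):
    """Belsley-style diagnostics with the information matrix in place of X'X."""
    d = np.sqrt(np.diag(info))
    scaled = info / np.outer(d, d)             # assumption: scale to unit diagonal
    eigvals, eigvecs = np.linalg.eigh(scaled)  # eigenvalues in ascending order
    cond_idx = np.sqrt(eigvals.max() / eigvals)
    phi = eigvecs**2 / eigvals                 # phi[k, j] = v_kj^2 / lambda_j
    # Rows index components, columns index coefficients; each column sums to 1.
    vdp = (phi / phi.sum(axis=1, keepdims=True)).T
    vif = np.diag(np.linalg.inv(scaled))       # inflation factors on the scaled matrix
    return cond_idx, vdp, vif

# Hypothetical data: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x3])
true_beta = np.array([-0.5, 0.5, 0.5, 0.3])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta_hat, info = fit_logistic(X, y)
cond_idx, vdp, vif = collinearity_diagnostics(info)
print("condition indices:", np.round(cond_idx, 1))
print("variance decomposition proportions:\n", np.round(vdp, 3))
print("inflation factors:", np.round(vif, 1))
```

In this sketch, a large condition index whose row of variance decomposition proportions loads heavily on two or more coefficients points to the coefficients involved in a near dependency. For a proportional hazards model, the same function could in principle be applied to the partial likelihood information matrix, although that case is not shown here.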

MeSH terms

  • Adult
  • Breast Neoplasms / etiology*
  • Diet*
  • Female
  • Follow-Up Studies
  • Humans
  • Logistic Models
  • Prospective Studies
  • Regression Analysis*
  • Risk*