Original Article
Relative risks and confidence intervals were easily computed indirectly from multivariable logistic regression
Introduction
Relative risk (RR) is a common measure of the effect of treatment or exposure on outcome in cohort studies. Estimating this simple ratio of the disease risk among the treated (or exposed) compared to the untreated (or unexposed), and an appropriate confidence interval, is a routine application of Mantel–Haenszel methods [1], provided the investigator needs to adjust for only one or two categorical factors. More commonly, however, the study calls for simultaneous adjustment of several factors, some of which are continuous, via multivariable regression modeling. As described in texts on biomedical statistics [2], logistic regression for binary outcome data produces an adjusted odds ratio (OR), not a relative risk. Although the OR has attractive mathematical properties, clinicians rarely think in terms of odds of disease or the OR as a measure of effect [3], [4]. If the outcome event is rare (risk under 10%) and the OR is small, the OR approximates the relative risk. But with more common outcomes, the OR is well known to be more extreme (farther from 1.0) than the relative risk for the same data [5], [6]. The controversy generated by the report of Schulman et al. on the effects of race and sex on physician referrals exemplifies this distortion [7], [8].
Some authors [9] have converted ORs to relative risks by the simple relationship RR = OR/([1 − p0] + [p0 × OR]), where the OR came from the estimate of the logistic regression model, while the value of the baseline risk (p0) was estimated as the unadjusted risk in the reference group (in that case a hospital). The authors estimated the upper and lower bounds for the confidence interval by substituting for OR the upper and lower confidence bounds for the OR from the logistic regression. This method of estimating a confidence interval, known as the “method of substitution”, has been applied to other measures of association [10]. Subsequent criticism suggested, however, that the proposed confidence interval for relative risk would be too narrow because of its failure to account for variability in the baseline risk [11], [12]. Others arrived at the same conclusion independently in similar contexts [13].
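The conversion and the method of substitution described above can be sketched in a few lines. This is a minimal illustration, not the cited authors' code: the function names and the numerical inputs (an OR of 3.5 with hypothetical confidence bounds 2.4 and 5.1, and a baseline risk of 0.30) are assumptions chosen only to show the arithmetic.

```python
def or_to_rr(odds_ratio: float, p0: float) -> float:
    """Convert an odds ratio to a relative risk: RR = OR / ((1 - p0) + p0 * OR)."""
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("baseline risk p0 must lie in [0, 1]")
    return odds_ratio / ((1.0 - p0) + p0 * odds_ratio)


def substitution_interval(or_lower: float, or_upper: float, p0: float):
    """Method of substitution: plug the OR confidence bounds into the same formula.

    Note that p0 is treated as a fixed constant here, which is the source of
    the criticism that the resulting interval ignores variability in p0.
    """
    return or_to_rr(or_lower, p0), or_to_rr(or_upper, p0)


# Hypothetical inputs, for illustration only.
p0 = 0.30                            # assumed baseline risk in the reference group
or_hat, or_lo, or_hi = 3.5, 2.4, 5.1  # assumed OR and its 95% confidence bounds

rr_hat = or_to_rr(or_hat, p0)
rr_lo, rr_hi = substitution_interval(or_lo, or_hi, p0)
print(f"RR = {rr_hat:.2f}, substitution interval = ({rr_lo:.2f}, {rr_hi:.2f})")
```

Because the formula takes p0 as known, any uncertainty in the baseline risk is absent from the interval it returns.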
Estimating relative risk is also possible by means of alternative generalized linear models.
One proposed option, the log-binomial model, replaces the logit link in logistic regression with a log link but maintains the specification of a binomial distribution [12], [14]. Although this functional form estimates relative risk directly by simple exponentiation of the regression coefficient for the exposure of interest, the log link permits estimates of risk within the broader bounds [0,∞), whereas probabilities must fall within the bounds [0,1]. Because of this mismatch between the bounds of the model and the allowable outcome, Wacholder [15] proposed constraining the fitting algorithm to respect the [0,1] bounds. His algorithm has been incorporated into the Stata statistical package in the function “binreg” (Stata Corp., College Station, TX, 2001). Owing to the known problem of convergence with the log-binomial model, several authors [16], [17], [18] recently proposed Poisson regression, a generalized linear model with a log link and a Poisson distribution, together with the sandwich variance estimator to produce confidence intervals with correct coverage. The issue of expected risks exceeding the [0,1] bounds remains, however. Greenland reviewed these articles in the broader context of the literature on standardization of estimates [19].
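The bounds problem can be seen with a small numerical sketch. The coefficients below are hypothetical (not estimates from any data set), and the helper names are ours. Under a log link the ratio of predicted risks equals the exponentiated exposure coefficient at every covariate value, but the predicted risk itself can exceed 1; under a logit link the predicted risk always stays within [0,1].

```python
import math


def risk_log_link(eta: float) -> float:
    """Predicted risk under a log link: exp(eta), unbounded above."""
    return math.exp(eta)


def risk_logit_link(eta: float) -> float:
    """Predicted risk under a logit link: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients: intercept, exposure, and one continuous covariate z.
B0, B_EXPOSURE, B_COVARIATE = -1.2, 0.6, 0.5


def linear_predictor(exposed: int, z: float) -> float:
    return B0 + B_EXPOSURE * exposed + B_COVARIATE * z


for z in (0.0, 1.0, 2.0):
    eta1, eta0 = linear_predictor(1, z), linear_predictor(0, z)
    rr_log = risk_log_link(eta1) / risk_log_link(eta0)  # equals exp(B_EXPOSURE)
    print(
        f"z={z:.0f}: log-link risk (exposed)={risk_log_link(eta1):.3f}, "
        f"logit-link risk (exposed)={risk_logit_link(eta1):.3f}, "
        f"log-link RR={rr_log:.3f}"
    )
```

With these assumed values the log-link risk for an exposed subject passes 1 at the largest covariate value, which is the situation that constrained fitting is meant to prevent.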
In spite of the shortcomings of simplistic methods, they continue to appear in the literature. As of October 2005, almost 200 articles have cited and used the method of substitution outlined in 1998 by Zhang and Yu [9]. These applications, including those published in major medical journals [20], [21], involved common outcomes. The log-linear model with sandwich variance estimates outlined by Zou in 2004 has also begun to gain use, again in leading medical journals [22], for applications with common outcomes. Nevertheless, both methods, as we shall point out, suffer from theoretical as well as methodological problems.
We first demonstrate that confidence intervals generated from logistic regression and the method of substitution, at least as promulgated by Zhang and Yu, exhibit poor coverage for the intended applications of common outcomes. Then we explain why the log-binomial and the log-linear (Poisson) model options might also fail when outcomes are common. Finally, we build upon methodological literature on conditional and marginal standardization to demonstrate several options for using logistic regression to estimate relative risk.
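The two kinds of standardization can be illustrated with a short sketch of the point estimates only. Everything here is an assumption made for illustration: the coefficients stand in for a fitted logistic model, the covariate is a hypothetical age in decades, and the function names are ours. Conditional standardization evaluates the ratio of predicted risks at one fixed covariate pattern; marginal standardization averages predicted risks over the observed covariate distribution, once with everyone set to exposed and once with everyone set to unexposed. Confidence intervals, which require further work, are not shown.

```python
import math


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical logistic regression coefficients (not from any real fit).
B0, B_EXPOSURE, B_AGE = -2.0, 0.9, 0.3

# Hypothetical covariate values (age in decades) for a small cohort.
ages = [3.0, 4.5, 5.0, 6.2, 7.1]


def predicted_risk(exposed: int, age: float) -> float:
    return expit(B0 + B_EXPOSURE * exposed + B_AGE * age)


def conditional_rr(age: float) -> float:
    """Relative risk at one fixed covariate value."""
    return predicted_risk(1, age) / predicted_risk(0, age)


def marginal_rr(covariate_values) -> float:
    """Ratio of average predicted risks: all exposed versus all unexposed."""
    risk_exposed = sum(predicted_risk(1, a) for a in covariate_values)
    risk_unexposed = sum(predicted_risk(0, a) for a in covariate_values)
    return risk_exposed / risk_unexposed  # the common divisor n cancels


print(f"adjusted OR            = {math.exp(B_EXPOSURE):.3f}")
print(f"conditional RR (age 5) = {conditional_rr(5.0):.3f}")
print(f"marginal RR            = {marginal_rr(ages):.3f}")
```

Both relative risks lie between 1 and the adjusted OR, consistent with the OR being farther from 1.0 than the relative risk when outcomes are common.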
Section snippets
Simulations
We simulated data sets with known values for baseline risks (0.1, 0.2, and 0.3) and relative risk (1.25, 1.5, 1.75, and 2.0), and with 100, 500, and 1000 hypothetical patients split equally into two hypothetical groups, unexposed and exposed. Additional simulations assumed two unbalanced data sets: 100 unexposed and 400 exposed patients, or 100 exposed and 400 unexposed patients. The program next simulated the occurrence of disease at an expected rate among the unexposed (untreated) patients
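A single simulated data set of this kind might be generated as follows. This is a sketch under stated assumptions rather than the program used in the study: we assume that each unexposed patient develops disease independently with probability equal to the baseline risk, that each exposed patient does so with probability equal to the baseline risk multiplied by the relative risk, and the function name and seed are arbitrary.

```python
import random


def simulate_cohort(n_unexposed: int, n_exposed: int, p0: float, rr: float,
                    rng: random.Random):
    """Return (events among unexposed, events among exposed) for one data set.

    Assumption: independent Bernoulli outcomes with risk p0 (unexposed)
    and p0 * rr (exposed).
    """
    p1 = p0 * rr
    if not (0.0 <= p0 <= 1.0 and 0.0 <= p1 <= 1.0):
        raise ValueError("p0 and p0 * rr must both lie in [0, 1]")
    events_unexposed = sum(rng.random() < p0 for _ in range(n_unexposed))
    events_exposed = sum(rng.random() < p1 for _ in range(n_exposed))
    return events_unexposed, events_exposed


# One balanced scenario from the design: 1000 patients, baseline risk 0.3, RR 2.0.
rng = random.Random(2005)  # fixed seed so the run is reproducible
events_unexposed, events_exposed = simulate_cohort(500, 500, 0.3, 2.0, rng)
crude_rr = (events_exposed / 500) / (events_unexposed / 500)
print(f"unexposed events={events_unexposed}, exposed events={events_exposed}, "
      f"crude RR={crude_rr:.2f}")
```

The unbalanced scenarios follow by changing the two group sizes, for example 100 unexposed and 400 exposed.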
Method of substitution
In our simulations, the method of substitution advocated by Zhang and Yu generally produced inappropriately narrow 95% confidence intervals for relative risk (Table 1). Even for a low baseline risk (0.2) and a modest relative risk (2.0), the confidence intervals intended to have 95% coverage actually produced less than 90% coverage. For a given relative risk, coverage of the confidence intervals worsened as the baseline risk (p0) increased. For a given baseline risk, coverage deteriorated with
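A coverage check along these lines can be sketched for the simplest case of two groups and no covariates, where the logistic regression OR and its Wald interval coincide with the cross-product OR and the usual standard error of its logarithm. The code is an illustration under our own assumptions (Bernoulli outcomes, 250 patients per group, 1000 replications, an arbitrary seed), not the study's simulation program. For comparison it also computes a standard Wald interval on the log relative risk, which does account for variability in the baseline risk.

```python
import math
import random

Z = 1.959963984540054  # 97.5th percentile of the standard normal


def coverage(p0: float, rr: float, n_per_group: int, n_sims: int, seed: int = 2005):
    """Return (coverage of substitution interval, coverage of log-RR Wald interval)."""
    rng = random.Random(seed)
    p1 = p0 * rr
    n = n_per_group
    hits_sub = hits_wald = used = 0
    for _ in range(n_sims):
        a = sum(rng.random() < p1 for _ in range(n))  # exposed with disease
        c = sum(rng.random() < p0 for _ in range(n))  # unexposed with disease
        b, d = n - a, n - c
        if min(a, b, c, d) == 0:
            continue  # skip tables with an empty cell (OR undefined)
        used += 1
        # OR and Wald interval, as logistic regression would give here.
        or_hat = (a * d) / (b * c)
        se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        or_lo = or_hat * math.exp(-Z * se_log_or)
        or_hi = or_hat * math.exp(Z * se_log_or)
        # Method of substitution with p0 estimated by the unexposed risk.
        p0_hat = c / n
        sub_lo = or_lo / ((1 - p0_hat) + p0_hat * or_lo)
        sub_hi = or_hi / ((1 - p0_hat) + p0_hat * or_hi)
        # Comparison: Wald interval on the log relative risk.
        rr_hat = a / c  # equal group sizes, so the denominators cancel
        se_log_rr = math.sqrt(1 / a - 1 / n + 1 / c - 1 / n)
        wald_lo = rr_hat * math.exp(-Z * se_log_rr)
        wald_hi = rr_hat * math.exp(Z * se_log_rr)
        hits_sub += sub_lo <= rr <= sub_hi
        hits_wald += wald_lo <= rr <= wald_hi
    if used == 0:
        raise RuntimeError("no usable simulated tables")
    return hits_sub / used, hits_wald / used


sub_cov, wald_cov = coverage(p0=0.3, rr=2.0, n_per_group=250, n_sims=1000)
print(f"substitution coverage={sub_cov:.3f}, log-RR Wald coverage={wald_cov:.3f}")
```

Varying `p0` and `rr` across the values in the simulation design shows how the shortfall in coverage of the substitution interval changes with baseline risk and relative risk.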
Discussion
Confidence intervals are essential to support estimates of relative risk from multivariable regression models [32], [33]. Our simulations demonstrate why the method of substitution outlined by Zhang and Yu [9], which has found common use in leading journals, fails in the very situations for which it was designed: when baseline risk and relative risk are not small. Confidence intervals are too narrow, and therefore the precision of estimates is overstated. By contrast, confidence intervals based on either
Acknowledgment
Funding: Support was provided in part by an Agency for Healthcare Research and Quality (AHRQ), Centers for Education and Research on Therapeutics cooperative agreement (U18 HS10399), and by Agency for Healthcare Research and Quality, Grant No. R03 HS 11481-01.
Competing interests: Dr. Berlin is employed by Johnson & Johnson, which markets products for treatment of wounds. Johnson & Johnson has provided no input to or support for this study.
References (41)
et al. Clinically useful measures of effect in binary analyses of randomized trials. J Clin Epidemiol (1994)
et al. What's the relative risk? A method to directly estimate risk ratios in cohort studies of common outcomes. Ann Epidemiol (2002)
et al. Large sample confidence intervals for regression standardized risks, risk ratios, and risk differences. J Chronic Dis (1987)
et al. Estimation of a common effect parameter from sparse follow-up data. Biometrics (1985)
Practical statistics for medical research (1991)
et al. The odds ratio. BMJ (2000)
When can odds ratio mislead? [letter]. BMJ (1998)
et al. Odds ratios should be avoided when events are common. BMJ (1998)
et al. The effect of race and sex on physicians' recommendations for cardiac catheterization. N Engl J Med (1999)
race, sex, and physicians' referral for cardiac catheterization [letter]. N Engl J Med (1999)
What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA
Confidence limits made easy: interval estimation using a substitution method. Am J Epidemiol
Correcting the odds ratio in cohort studies of common outcomes. JAMA
Estimating the relative risk in cohort studies and clinical trials of common outcomes. Am J Epidemiol
Expressing the magnitude of adverse effects in case–control studies: “the number of patients needed to be treated for one additional patient to be harmed”. BMJ
Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol
A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol
Quasi-likelihood estimation for relative risk regression models. Biostatistics
Easy SAS calculations for risk and prevalence ratios and differences. Am J Epidemiol
Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case–control studies. Am J Epidemiol