Introduction

The term “confounding by indication” was first introduced into the epidemiologic literature over 30 years ago [1], and its usage has become widespread in recent years. Unfortunately, the original concept associated with the term has been abandoned, and “confounding by indication” is sometimes used synonymously with selection bias, protopathic bias (i.e., when treatment for an early symptom of a disease appears to cause the disease), and confounding by disease severity [2]. In the contemporary literature, confounding by indication is even equated with confounding in general and reverse causality [3]. This change in meaning has led to diminished recognition of the phenomenon originally described, and is also partly responsible for the mistaken belief that confounding by indication can be routinely addressed in non-experimental studies using newer and sophisticated statistical methods. Confounding by indication is also sometimes confused with confounding by contraindication.

In this paper, we review concepts related to confounding by indication in studies of intended effects, and confounding by indication and confounding by extraneous aspects of the indication in studies of unintended effects. We also discuss how these concepts differ from confounding by contraindication in studies of unintended effects. We argue for greater conceptual and semantic clarity with regard to confounding by indication in order to distinguish non-experimental studies where this bias may be controlled versus non-experimental studies where such control may not be feasible. Various non-experimental methods that have been proposed to address confounding by indication are also briefly discussed.

Confounding by Indication in Studies of Intended Effects

Quantifying the intended effects of therapies or other interventions in the non-experimental setting can present a serious challenge [1]. Exceptions arise when interventions have strong and immediate effects, such as the effect of intravenous glucose for hypoglycemic coma or naloxone for the reversal of opioid effects. Drugs with large and immediate effects have occasionally created ethical dilemmas because contemporary drug approval requires evidence from randomized trials [4]. The Pittsburgh group, who had witnessed the dramatic effect of tacrolimus among patients with terminal graft rejection, declined participation in the randomized trials required for drug approval [5]. Non-experimental assessment of the effects of interventions that have a dichotomous (yes/no) indication is also feasible since the indication is singular and not graded. For instance, postpartum Rh immune globulin for preventing Rh sensitization has a dichotomous indication: all Rh-negative mothers with Rh-positive infants require this treatment to prevent Rh sensitization. Confounding by indication is not an issue when contrasting rates of Rh sensitization among Rh-negative women with Rh-positive infants who received treatment versus those who did not, as both groups have the same indication [1].

More commonly, however, the indication for the intervention is graded and subtle and is based on physician and patient perceptions of disease severity and prognosis, including the presumed therapeutic effect of the intervention. For instance, the indication for anti-hypertensive therapy to prevent stroke is based on a complex patient-physician calculus that incorporates family history of hypertension and cardiovascular disease, symptoms such as headache, and mitigating and exacerbating lifestyle factors including stress, diet and exercise, in addition to the blood pressure recorded on one or more occasions [1]. Inevitably, hypertensive subjects with a higher intensity of the indication are much more likely to take anti-hypertensive treatment compared with hypertensive subjects with a lower intensity of the indication. A non-experimental contrast of stroke rates among hypertensive subjects taking anti-hypertensive medication with hypertensive subjects not taking such medication would lead to confounding by indication, as the a priori risk of stroke will be higher among those who receive anti-hypertensive therapy.
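The direction of this bias can be illustrated with a small deterministic calculation. The sketch below is not drawn from any cited study: the two strata of indication intensity, the stratum sizes, the baseline stroke risks, and the assumed protective risk ratio of 0.5 are all hypothetical values chosen for illustration. In practice the intensity of the indication is rarely observable in this tidy form, which is why the stratification shown here is usually unavailable to the analyst.

```python
# Hypothetical numbers for illustration only; none come from the cited studies.
# Treatment is assumed to halve stroke risk at every level of the indication,
# but it is given far more often when the indication is intense.
TRUE_RISK_RATIO = 0.5  # assumed protective effect of treatment

# stratum name: (number treated, number untreated, stroke risk if untreated)
STRATA = {
    "low intensity": (1_000, 9_000, 0.01),
    "high intensity": (9_000, 1_000, 0.10),
}


def crude_risk_ratio(strata, true_rr):
    """Risk ratio (treated vs. untreated) after pooling all strata."""
    treated_n = treated_events = 0.0
    untreated_n = untreated_events = 0.0
    for n_treated, n_untreated, base_risk in strata.values():
        treated_n += n_treated
        treated_events += n_treated * base_risk * true_rr
        untreated_n += n_untreated
        untreated_events += n_untreated * base_risk
    return (treated_events / treated_n) / (untreated_events / untreated_n)


crude = crude_risk_ratio(STRATA, TRUE_RISK_RATIO)
print(f"risk ratio within each stratum: {TRUE_RISK_RATIO:.2f}")
print(f"crude risk ratio ignoring the indication: {crude:.2f}")
```

Under these assumed inputs the crude risk ratio is about 2.4, so a treatment that halves risk in every stratum appears to more than double it once the strata are pooled.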

Confounding by indication is defined as a bias in the treatment-intended outcome relationship due to the clinical reasons for the treatment, with the indication for treatment based on physician and patient perceptions of disease severity and prognosis, including the presumed therapeutic effect of the intervention. Confounding occurs because the treatment and the indication are closely correlated, and a greater intensity of the indication presages a higher rate of the undesired outcome. Such confounding is typically negative and serves to dilute or reverse the effect of the treatment [1]. An illustrative example of confounding by indication is seen in Table 1, which summarizes the rates of maternal death by mode of delivery among women at ≥24 weeks’ gestation [6]. The question addressed by this analysis is the effect of cesarean (versus vaginal) delivery on maternal death, with prevention of maternal death being an intended effect of cesarean delivery. Cesarean deliveries are carried out for diverse maternal and fetal indications, including several that present a threat to the life of the mother (e.g., placenta previa, obstructed labor, and severe preeclampsia). The rate of maternal death following cesarean delivery was 17.2/100,000 maternities, compared with a maternal death rate of 4.8/100,000 maternities following vaginal delivery. Although the observed rates of maternal death suggest that cesarean deliveries increase the risk of maternal death, this association is confounded by indication. The Confidential Enquiry into Maternal Deaths in the United Kingdom [6], which reviewed each of these deaths in detail, found that in most cases of maternal death following cesarean delivery, the cesarean was necessitated by a serious pregnancy complication. An example vignette presented in the report of the Confidential Enquiry [6] highlights this issue and shows the complexity of the indication for cesarean delivery.

Table 1 Estimated numbers and rates of maternal deaths and maternities by type of delivery, United Kingdom 2000–2002 [6]

“A woman, whose blood pressure was 110/60 mm/Hg in early pregnancy, was admitted in late pregnancy with a diastolic pressure of 92 mm/Hg and proteinuria +++. Over the subsequent days, blood pressures of 155/95 mm Hg and 145/100 mm Hg were noted and a 24-hour urine collection showed greater than 4 g of protein………she complained of epigastric pain ……….After her blood pressure rose to 220/120 mm/Hg, antihypertensive treatment was started for the first time (intravenous labetalol). Her blood pressure remained elevated at 215/120 mm Hg and she remained symptomatic. A caesarean section was performed. There was continuing poor control of her blood pressure after delivery…A computed tomography (CT) scan showed a massive intracranial haemorrhage. She was transferred to a neurosurgical unit but died despite craniotomy. Laboratory tests showed HELLP syndrome.” [6]

In this instance, the indication for the cesarean delivery (namely, severe preeclampsia/HELLP syndrome) was instrumental in causing both the cesarean delivery and the maternal death. Attempting to address confounding of the cesarean delivery-maternal death association by controlling for the level of hypertension in the cesarean and vaginal delivery groups is futile; in fact, it is “almost impossible to disentangle (these) consequences of caesarean section from the indication for the operation” [6]. It is noteworthy that the hypertension was progressive, with varying anti-hypertensive treatment effects at different points. Also, a full quantification of the indication would have to incorporate assessments of the symptoms such as epigastric pain, the degree of proteinuria, and the fetal condition. For this reason, confounding by indication has been deemed “all-pervasive” and intractable, with non-experimental control “commonly infeasible owing to the complexity and subtlety of the indication” [1]. The only definitive way to address such confounding by indication involving therapies for subtle and complex indications is by ensuring that the index and reference groups are balanced with regard to the indication, as in a large randomized trial. Ethical issues, however, may preclude experimentation.
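The crude contrast implied by the maternal death rates quoted earlier for cesarean and vaginal delivery can be written out as a short calculation. The two rates are taken from the text; the variable names are illustrative, and the resulting figures describe only the unadjusted association, not the effect of cesarean delivery.

```python
# Rates per 100,000 maternities as quoted in the text (United Kingdom, 2000-2002).
CESAREAN_DEATH_RATE = 17.2
VAGINAL_DEATH_RATE = 4.8

# Crude (unadjusted) measures of association; both are confounded by indication.
crude_rate_ratio = CESAREAN_DEATH_RATE / VAGINAL_DEATH_RATE
crude_rate_difference = CESAREAN_DEATH_RATE - VAGINAL_DEATH_RATE

print(f"crude rate ratio: {crude_rate_ratio:.1f}")
print(f"crude rate difference: {crude_rate_difference:.1f} per 100,000 maternities")
```

The crude rate ratio of roughly 3.6 is the figure that a naive reading would attribute to the operation itself, whereas the Confidential Enquiry's case review attributes most of the excess to the complications that necessitated the cesarean.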

Confounding by indication also plagues non-experimental attempts to determine the efficacy of labor induction for pregnancy complications such as preeclampsia and fetal growth restriction in preventing perinatal death. Even after excluding cases of labor induction for antepartum stillbirth (i.e., eliminating reverse causality), the severity of pregnancy complications that underlie the indication for labor induction would result in the index (labor induction) group having a higher a priori risk of perinatal death compared with the reference (no labor induction) group. Women who did not have labor induction would have a relatively lower risk of perinatal death because their preeclampsia or growth restriction was relatively mild.

Confounding by Indication in Studies of Unintended Effects

Indications for treatment tend to be strongly associated with intended outcomes (e.g., stroke rates are substantially higher among those with an indication for anti-hypertensive therapy) but not with unintended outcomes (e.g., hypertension is not associated with persistent dry cough, an unintended side effect of specific anti-hypertensives, namely ACE inhibitors). Similarly, rates of penicillin anaphylaxis are not higher among people with more severe infection, and drug-induced rash and fixed drug eruptions (due to allopurinol, barbiturates, and sulfonamides) are not associated with the underlying indications for these drugs. Since the indication is typically not a risk factor for the unintended effect, studies of unintended effects are not likely to be confounded by indication.

Occasionally, however, an association may exist between the indication and the unintended effect. A recent study on calcium channel blocker use (as an anti-hypertensive) in pregnancy and postpartum hemorrhage was based on the premise that the tocolytic properties of these drugs may increase rates of atonic postpartum hemorrhage [7]. Even though postpartum hemorrhage is an unintended effect, hypertension (the indication for calcium channel blocker use) could potentially confound the calcium channel blocker-postpartum hemorrhage association because it is a risk factor for postpartum hemorrhage. However, the association between hypertension and postpartum hemorrhage is modest (the rate ratio for postpartum hemorrhage given hypertension is about 1.5 [8–10]), and controlling for such confounding should be feasible using standard methods (see below).
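One way to see why a modest indication-outcome association limits the damage is the standard external-adjustment formula for a single binary confounder, sketched below. The formula assumes that the confounder multiplies the outcome rate by the same factor in exposed and unexposed subjects (no effect modification). The rate ratio of 1.5 is taken from the text; the prevalences of hypertension in the comparison groups are hypothetical.

```python
def confounding_rate_ratio(p_exposed, p_unexposed, rr_confounder):
    """Ratio of the crude to the confounder-adjusted rate ratio.

    p_exposed, p_unexposed: prevalence of the confounder in each group.
    rr_confounder: rate ratio relating the confounder to the outcome.
    Assumes one binary confounder and no effect modification.
    """
    numerator = p_exposed * (rr_confounder - 1.0) + 1.0
    denominator = p_unexposed * (rr_confounder - 1.0) + 1.0
    return numerator / denominator


RR_HYPERTENSION_PPH = 1.5  # approximate value quoted in the text

# All users hypertensive, no non-users hypertensive: the largest possible bias.
worst_case = confounding_rate_ratio(1.0, 0.0, RR_HYPERTENSION_PPH)
# Hypothetical: 10% of the unexposed reference group is hypertensive.
mixed_reference = confounding_rate_ratio(1.0, 0.10, RR_HYPERTENSION_PPH)
# Reference group restricted to hypertensive women: no bias from this source.
hypertensive_reference = confounding_rate_ratio(1.0, 1.0, RR_HYPERTENSION_PPH)

print(f"worst case: {worst_case:.2f}")
print(f"mixed reference group: {mixed_reference:.2f}")
print(f"hypertensive reference group: {hypertensive_reference:.2f}")
```

Under these assumptions the bias cannot exceed the confounder-outcome rate ratio itself (1.5), and it vanishes when users and non-users are equally likely to be hypertensive.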

Confounding by Extraneous Aspects of the Indication

Another variant of confounding by indication is observed in studies of unintended effects, namely, confounding by extraneous aspects of the indication. Recent studies [11, 12] have examined whether selective serotonin reuptake inhibitor (SSRI) use in pregnancy increases the risk of stillbirth. SSRI use in pregnancy is typically indicated for depression or anxiety, and stillbirth prevention is not the intended effect of the therapy. Similarly, stillbirth occurrence is not known to be an unintended toxic or other side effect of SSRIs, and high risk for stillbirth is not considered a contraindication for SSRI use. However, studies on the effect of SSRI use on stillbirth do have the potential to be confounded by extraneous lifestyle and behavioral factors associated with the indication (if they are not on the causal pathway between SSRI use and stillbirth). Women with depression or anxiety in pregnancy are likely to have a differential distribution of various risk factors for stillbirth, including smoking, low socioeconomic status, and inadequate antenatal care. Such confounding by extraneous aspects of the indication can be addressed through standard epidemiologic design and analysis methods, with control of socioeconomic status facilitated through careful selection of an appropriate control group [13].

Non-experimental assessment of the relationship between cesarean delivery and subsequent subfertility also represents a situation where the potential for confounding by extraneous aspects of the indication exists [14]. Subfertility, often measured by time to pregnancy or the interpregnancy interval, does not constitute an intended or known unintended effect of cesarean delivery. However, indications for cesarean delivery are typically associated with various factors that influence fertility, including maternal age, height, body mass index, obstetric history, and socioeconomic status [14]. Control of confounding due to such extraneous aspects of the indication – including constitutional, behavioral, and lifestyle factors – should be feasible through standard epidemiologic means; again, however, care with methods design may be needed for appropriately controlling confounding by socioeconomic status [13].

Confounding by Contraindication in Studies of Unintended Effects

Whereas confounding by indication is a serious bias affecting non-experimental studies of efficacy (where the magnitude of the intended effect is at issue), confounding by contraindication represents a relatively less serious bias that can occur in non-experimental studies that examine unintended effects such as known toxic or known side effects [1]. Contraindications to treatment can represent conditions that are typically predictive of known side effects. For instance, aspirin use (for analgesic or other purposes) is contraindicated in people with a previous gastrointestinal bleed because such bleeding is a known side effect of aspirin use. A non-experimental study attempting to quantify the relationship between aspirin use and gastrointestinal bleeding could be biased because the index group of aspirin users would not include people with previous gastrointestinal bleeds. Ignoring such confounding by contraindication would result in the reference group of aspirin non-users appearing to have an artificially high rate of gastrointestinal bleeding, given the inclusion of people with previous gastrointestinal bleeding. However, such confounding by contraindication presents a relatively less serious methodological problem, as it can be addressed by restricting the reference group (of aspirin non-users) to subjects without a previous gastrointestinal bleed [1].
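The aspirin example can be made concrete with a deterministic sketch. All inputs below are hypothetical: the group sizes, the baseline bleeding risks with and without a previous bleed, and the assumed true risk ratio of 2.0 for aspirin. The contraindication is treated as absolute, so no aspirin user has a previous gastrointestinal bleed.

```python
# Hypothetical inputs for illustration only.
BASE_RISK = {"no prior bleed": 0.01, "prior bleed": 0.05}  # risk without aspirin
TRUE_RR = 2.0  # assumed effect of aspirin on gastrointestinal bleeding

# Absolute contraindication: nobody with a prior bleed receives aspirin.
USERS = {"no prior bleed": 10_000, "prior bleed": 0}
NON_USERS = {"no prior bleed": 9_000, "prior bleed": 1_000}


def group_risk(counts, multiplier, strata):
    """Expected bleeding risk in a group, limited to the listed strata."""
    people = sum(counts[s] for s in strata)
    events = sum(counts[s] * BASE_RISK[s] * multiplier for s in strata)
    return events / people


def risk_ratio(strata):
    """Risk ratio for aspirin users vs. non-users within the listed strata."""
    return group_risk(USERS, TRUE_RR, strata) / group_risk(NON_USERS, 1.0, strata)


crude_rr = risk_ratio(["no prior bleed", "prior bleed"])
restricted_rr = risk_ratio(["no prior bleed"])  # domain free of the contraindication

print(f"crude risk ratio: {crude_rr:.2f}")
print(f"risk ratio after restriction: {restricted_rr:.2f}")
```

With these inputs the crude risk ratio is about 1.43 because the reference group's bleeding rate is inflated by people with a previous bleed, whereas restriction to subjects without a previous bleed recovers the assumed value of 2.0.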

Although, as previously mentioned, non-experimental studies assessing the effect of labor induction on perinatal death would be confounded by indication, this would not be the case with regard to non-experimental studies assessing the effect of labor induction on adverse maternal outcomes such as postpartum hemorrhage [15]. On the other hand, a study attempting to quantify the effect of labor induction on postpartum hemorrhage could be confounded by contraindication, as postpartum hemorrhage would be an unintended effect of labor induction. However, this bias would occur only if obstetricians recognized labor induction as a risk factor for postpartum hemorrhage and considered women at high risk of postpartum hemorrhage (e.g., those with a previous history of postpartum hemorrhage) as having a contraindication for labor induction. Restricting the study to nulliparous women or to women without a previous history of postpartum hemorrhage would eliminate potential bias.

Confusion Between Confounding by Indication and Confounding by Contraindication

A recent study on in vitro fertilization [16] examined the effect of blastocyst-stage (day 5/6) embryo transfer versus cleavage-stage (day 3) embryo transfer on the occurrence of preterm birth. Day 5/6 transfer optimizes embryo selection and is favored in cases requiring single-embryo transfer in order to avoid multi-fetal pregnancy. The study showed a preterm birth rate of 17.2 % in the day 5/6 embryo transfer group and a preterm birth rate of 14.1 % in the day 3 embryo transfer group (adjusted odds ratio 1.32, 95 % confidence interval, 1.17–1.49). This study was criticized because it “…suffered from a serious methodologic flaw, namely ‘confounding by indication’.” [17]. According to critics, the bias arose because women who have a contraindication to multi-fetal pregnancy (which carries a high risk for preterm delivery) due to a prior preterm delivery, cervical incompetence, uterine anomaly, or other medical complication almost universally receive an elective single-embryo transfer. The differentially higher rate of single-embryo transfers in the day 5/6 group (20.2 %) compared with the day 3 group (5.8 %) suggests that a significant fraction of the women in the day 5/6 group had a contraindication to multiple-embryo transfer and were at higher risk of preterm birth even before the embryo transfer. In their response [18], the authors of the original study conceded that the criticism was valid and reanalyzed their data after excluding women who had a prior preterm delivery or cervical incompetence.

The controversy around the effects of day 3 versus day 5/6 embryo transfer notwithstanding, this discussion illustrates the imprecise use of the term “confounding by indication” in the literature. At issue was confounding by contraindication, as day 3 embryo transfer, which can result in multi-fetal pregnancies, is contraindicated among women at high risk for preterm delivery. Such confounding by contraindication can be addressed by restricting the study to a domain free of the contraindication – namely, to nulliparous women, those without cervical incompetence, those without a previous history of preterm birth, etc. Although the discussion between the authors and their critics showed clarity with regard to issues of bias and the means for rectification, there was semantic confusion with regard to the terms and concepts related to confounding by indication and confounding by contraindication [17, 18].

Non-Experimental Methods for Addressing Confounding by Indication

Whether non-experimental studies can, in fact, provide valid assessments of the effects of intervention has been a longstanding source of controversy [1, 19], and residual confounding plagues most non-experimental methods of dealing with confounding by indication. As mentioned previously, confounding by indication in non-experimental studies of intended effects is less challenging when effects are large and immediate or when the indication is dichotomous. This is also true for confounding by indication when the effect at issue is unintended and when the confounding by indication involves extraneous aspects of the indication. The literature on non-experimental methods to address confounding by indication includes various design and analysis strategies, and the degree of success achieved by these techniques depends, at least partly, on whether the confounding by indication is related to intended or unintended effects and whether the indication is strong, complex, and unquantifiable. Non-experimental methods of dealing with confounding by indication include:

Restricting the Study to a Domain Free of the Indication

Confounding by indication affects non-experimental studies contrasting perinatal death rates among home versus hospital deliveries and among deliveries by midwives versus obstetricians. In most industrialized countries with universal access to health care, such as Canada and the United Kingdom, pre-existing illnesses and pregnancy complications that arise in the antenatal period are an indication for hospital/obstetrician delivery. Perinatal death rates are therefore likely to be higher following hospital/obstetrician delivery, despite optimal medical management, and this represents an example of confounding by indication, with prevention of perinatal death and neonatal morbidity representing the intended effect. Restricting the contrast to low-risk women without any indication for hospital/obstetrician delivery in pregnancy or at the onset of labor may potentially avoid such confounding by indication.

A recent study from England among women without complications in pregnancy or at onset of labor showed no difference in perinatal death or serious neonatal morbidity among planned home deliveries compared with planned hospital deliveries [20]. However, this effect of planned home delivery was modified by parity; among nulliparous women, mortality/morbidity rates were significantly higher in those who had a planned home delivery, although there was no significant difference in outcome rates among multiparous women. This study likely represents a valid non-experimental assessment of planned home vs. hospital delivery, although several important caveats are noteworthy. The study involves a relatively unusual situation where there is interest in the effect of planned home vs. planned hospital delivery among women without any indication for hospital delivery. Also, the intervention is commonly available, and among women without an indication for hospital delivery, it is possible to non-experimentally recruit sufficient numbers of women into the index and reference groups. This relatively unusual situation contrasts with the absence of interest in the efficacy of anti-hypertensive drugs in preventing stroke among normotensive subjects. Although there is interest in the effect of cesarean delivery on maternal morbidity and mortality among women without an indication for this intervention, it is difficult to obtain sufficient numbers of women who undergo this procedure in the absence of any pregnancy complication. Finally, it is unclear if residual confounding by indication was completely eliminated from the above-mentioned study.

Another example of confounding by indication being addressed by restricting subjects to those without the indication is seen in studies on the effect of elective labor induction on cesarean delivery. By definition, elective labor induction at term is without indication, and cesarean delivery would be an unintended effect of such labor induction. In fact, several non-experimental studies [21–24] have replicated the findings of randomized trials [25–27] showing that elective labor induction does not increase the risk of cesarean delivery.

Matching for Indication

In certain settings, potential confounding by indication can be addressed by constituting an appropriate reference group by matching for the indication. For example, it has been suggested that studies that examined the role of aspirin in the occurrence of Reye’s syndrome could have addressed confounding by indication by matching the index group of children treated with aspirin to a reference group of children treated with acetaminophen [28]. This would have eliminated confounding by the indication for aspirin use – i.e., fever and related symptoms – if aspirin use and acetaminophen use were essentially interchangeable in the study population.

In the previously mentioned study on calcium channel blockers and postpartum hemorrhage [7], the authors attempted to control for confounding by indication (hypertension) through the use of a reference group of women with hypertension treated with labetalol or methyldopa. The approach is likely to have been effective because postpartum hemorrhage is an unintended effect of antihypertensive treatment in pregnancy, and the association between hypertension and postpartum hemorrhage is modest.

Evidence of confounding by indication in studies of intended effects can sometimes provide useful information when observed with different drugs used for the same indication. In the late 1980s, fenoterol, a beta agonist used in the treatment of asthma, was implicated in asthma deaths [29–31]. Since prevention of death from asthma is an intended effect of fenoterol use, confounding by indication was a potential explanation for the fenoterol-asthma death association: asthmatics with severe asthma were more likely to use fenoterol and also more likely to die of asthma compared with asthmatics not using this bronchodilator. This issue was resolved by a study that quantified the effects of fenoterol and albuterol (another beta agonist) and showed that fatal asthma was strongly associated with both of these beta agonists in a dose-dependent fashion [32]. With fenoterol and albuterol prescribed interchangeably for the same indication, the similar increase in asthma deaths with equivalent increases in fenoterol and albuterol dosage was taken to imply that a randomized study of asthmatic subjects would show no increase in asthma deaths among fenoterol users compared with albuterol users. The seminal lesson learnt (and integrated into clinical practice guidelines [33]) was that excessive beta agonist use was a marker for severe uncontrolled asthma and was associated with a high risk of asthma death (irrespective of the specific drug being overused).

Ecological Studies

Although the effect of labor induction on perinatal death is confounded by indication at the individual level, the relationship may be free of confounding at the ecological level [34]. Spatiotemporally distinct units of observation can show widely varying rates of labor induction and perinatal death, and the relationship could be informative (conditional on extraneous determinants) as it is not confounded by indication [35–37]. The challenge in this situation arises from the need to accurately identify, quantify, and model other confounders and to address problems specific to ecological analyses. Examples of related studies that have attempted to address confounding by indication through ecologic means include studies on the effects of iatrogenic preterm birth in industrialized countries, in which rates of preterm birth at 32–36 weeks’ gestation were shown to be negatively associated with stillbirth and neonatal death at 32 weeks and beyond, suggesting that obstetric intervention was having a beneficial effect, at least with regard to these short-term outcomes [38]. However, this ecological association could have been confounded by spatiotemporal differences in prenatal diagnosis and pregnancy termination for congenital anomalies.

Statistical Techniques for Control of Confounding by Indication

In addition to the use of restriction and matching, attempts at controlling confounding by indication have included multivariable adjustment, propensity score matching, and instrumental variables. Propensity score matching involves developing a score that quantifies the probability of receiving the intervention based on individual characteristics, restricting the contrast to subjects in the index and reference groups who have overlapping scores, and adjusting the intervention-outcome relationship using the score developed [39]. Assessing intervention effects using instrumental variables involves identifying a variable that is potentially associated with the outcome of interest solely through its association with the intervention. This assumption implies no confounding of the instrument-outcome relationship despite confounding of the intervention-outcome relationship, including potential confounding by indication [39].

A study attempting to control confounding by indication non-experimentally in order to quantify the effect of adjuvant chemotherapy for preventing breast cancer recurrence showed that propensity score matching and instrumental variable analysis were not more effective than standard regression methods [40]. Contrasting the findings from this non-experimental approach with those from randomized trials suggested significant residual confounding of the adjuvant chemotherapy-cancer recurrence relationship, and the study authors concluded that confounding by indication remains a “most stubborn bias” [40]. Reviews show that propensity score methods are increasingly used for controlling confounding by indication despite the fact that results obtained with standard multivariable methods differ from the propensity score estimates in a minority of instances [41, 42]. This similarity in results is perhaps not surprising; methods such as propensity score matching are suited for adjustment where the index and reference categories have non-overlapping distributions for specific confounders, and are not meant for controlling confounders that are unquantifiable. Multivariable and propensity score methods are also theoretically unsuitable for addressing confounding by contraindication in studies of unintended effects, especially if the contraindication is absolute (since a unique subset of the population is excluded from the index group).
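The limitation described above, that a propensity score can only balance what has been measured, can be sketched with a deterministic example. Everything below is hypothetical: a measured binary covariate (labeled age), an unmeasured binary intensity of the indication (labeled severity), the cell sizes, the treatment probabilities, and the assumed protective risk ratio of 0.5. For brevity the sketch uses inverse probability weighting rather than the matching described in the text, and estimates the propensity score as the proportion treated within each level of the covariates supplied.

```python
# Hypothetical inputs for illustration only; none come from the cited studies.
BASE_RISK = 0.02
RR_AGE = 2.0        # assumed effect of the measured covariate on the outcome
RR_SEVERITY = 5.0   # assumed effect of the unmeasured indication intensity
TRUE_RR = 0.5       # assumed protective effect of treatment

# (age, severity) -> (number of people, probability of being treated)
CELLS = {
    (0, 0): (4_000, 0.10),
    (0, 1): (1_000, 0.70),
    (1, 0): (3_000, 0.20),
    (1, 1): (2_000, 0.80),
}


def risk(age, severity, treated):
    """Outcome risk under a multiplicative model."""
    effect = TRUE_RR if treated else 1.0
    return BASE_RISK * RR_AGE**age * RR_SEVERITY**severity * effect


def weighted_risk_ratio(key):
    """Inverse-probability-weighted risk ratio (treated vs. untreated).

    key(cell) returns the covariates available to the analyst; the propensity
    score is the proportion treated among people sharing that value.
    """
    totals, treated_totals = {}, {}
    for cell, (n, p) in CELLS.items():
        k = key(cell)
        totals[k] = totals.get(k, 0.0) + n
        treated_totals[k] = treated_totals.get(k, 0.0) + n * p
    score = {k: treated_totals[k] / totals[k] for k in totals}

    arm_risk = {}
    for treated in (True, False):
        events = weight = 0.0
        for cell, (n, p) in CELLS.items():
            e = score[key(cell)]
            arm_n = n * (p if treated else 1.0 - p)
            w = arm_n / (e if treated else 1.0 - e)
            events += w * risk(*cell, treated)
            weight += w
        arm_risk[treated] = events / weight
    return arm_risk[True] / arm_risk[False]


crude = weighted_risk_ratio(lambda cell: None)             # no covariates
measured_only = weighted_risk_ratio(lambda cell: cell[0])  # age only
full = weighted_risk_ratio(lambda cell: cell)              # age and severity

print(f"crude: {crude:.2f}")
print(f"propensity score from measured covariate only: {measured_only:.2f}")
print(f"propensity score including severity: {full:.2f}")
```

Under these assumptions the crude and measured-covariate estimates both remain above 1, and only the score that includes severity recovers the assumed risk ratio of 0.5. With a single binary measured covariate the weighted estimate is also identical to direct standardization over that covariate, which is consistent with the similarity between propensity score and standard multivariable results noted above.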

The validity of analyses using instrumental variables is contingent upon the identification of a satisfactory instrumental variable [39]. The strong assumptions underlying the instrument’s relationship are largely unverifiable, and modest deviations from the assumptions can bias the results unpredictably [43, 44].
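The sensitivity of instrumental variable estimates to modest violations can be sketched with the simple ratio (Wald) estimator for a binary instrument. The setup below is hypothetical throughout: an unmeasured binary severity, an additive risk model, a true treatment effect expressed as a risk difference of -0.03, and a binary instrument whose strength and direct effect on the outcome are parameters. A nonzero direct effect represents a violation of the assumption that the instrument affects the outcome solely through the intervention.

```python
# Hypothetical inputs for illustration only.
P_SEVERE = 0.3            # prevalence of the unmeasured severe indication
BASE_RISK = 0.05
SEVERITY_EFFECT = 0.20    # added outcome risk when the indication is severe
TREATMENT_EFFECT = -0.03  # assumed true risk difference due to treatment


def p_treated(z, severe, strength):
    """Probability of treatment given instrument z and severity."""
    return 0.1 + strength * z + 0.5 * severe


def expectations(z, strength, direct_effect):
    """Return (E[T | Z=z], E[Y | Z=z]); severity is independent of Z."""
    mean_t = mean_y = 0.0
    for severe, p_u in ((0, 1.0 - P_SEVERE), (1, P_SEVERE)):
        pt = p_treated(z, severe, strength)
        outcome_risk = (
            BASE_RISK
            + SEVERITY_EFFECT * severe
            + TREATMENT_EFFECT * pt
            + direct_effect * z  # zero when the exclusion assumption holds
        )
        mean_t += p_u * pt
        mean_y += p_u * outcome_risk
    return mean_t, mean_y


def wald_estimate(strength, direct_effect=0.0):
    """Ratio estimator: instrument-outcome over instrument-treatment contrast."""
    t1, y1 = expectations(1, strength, direct_effect)
    t0, y0 = expectations(0, strength, direct_effect)
    return (y1 - y0) / (t1 - t0)


valid = wald_estimate(strength=0.3)
violated = wald_estimate(strength=0.3, direct_effect=0.005)
weak_and_violated = wald_estimate(strength=0.05, direct_effect=0.005)

print(f"valid instrument: {valid:.3f}")
print(f"small direct effect, strong instrument: {violated:.3f}")
print(f"small direct effect, weak instrument: {weak_and_violated:.3f}")
```

In this sketch the valid instrument recovers the assumed effect despite the unmeasured severity, whereas a direct effect of only 0.005 shifts the estimate by 0.005 divided by the instrument strength, which is enough to reverse its sign when the instrument is weak.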

Conclusion

Non-experimental studies on the intended effects of intervention are confounded by indication, which is an intractable bias, particularly when the indication is subtle and graded. Non-experimental studies on the unintended effects of intervention that are known toxicities or known side effects can be confounded by contraindication, although this presents less of a challenge insofar as restriction to a domain free of the contraindication resolves the bias. Finally, non-experimental assessment of unintended treatment effects can be confounded by extraneous aspects of the indication and, occasionally, by indication. Various non-experimental methods have been proposed to address confounding by indication, and the success of these methods is contingent upon the specific subtype of confounding at issue. Non-experimental assessment of intended effects when the indication is subtle and unquantifiable is generally not feasible and is likely to be affected by residual confounding.