Introduction

Recent epidemiological studies suggesting that some glucose-lowering agents increase the risk of cancer have received widespread publicity. The four epidemiological studies were conducted in Germany (using a health insurance fund database), the UK (using a database of anonymised medical records of general practitioners [GPs]), Sweden (using nationwide registries) and Scotland (using a national diabetes registry linked to a cancer registry) [1–4]. An accompanying editorial stated that the focus of these studies had been on insulin glargine (A21Gly, B31Arg, B32Arg human insulin) and that it had not been possible to place insulin detemir (B29Lys(ε-tetradecanoyl), desB30 human insulin) under similar scrutiny. The editorial also stated that further research was needed in relation to insulin glargine [5]. These studies had several methodological limitations [6], including short duration of follow-up and lack of cumulative exposure analyses.

One of the main challenges in observational research is confounding by indication. This is a problem in most diseases but is a specific problem with diabetes, since exposure can be defined by diabetes severity, making it very difficult to separate the effect of disease severity from treatment. Another challenge in research that uses healthcare databases is the complexity of the recorded information and the large number of choices that an investigator needs to make when extracting the information from the database. Studies from different investigators have found discrepant results and reached opposite conclusions when testing similar hypotheses within the same database [7]. Given these potential issues, the first objective of this study was to evaluate whether users of different classes of glucose-lowering agents were comparable with respect to their underlying cancer risk. The second objective was to analyse the association between exposure to glucose-lowering agents and risk of cancer, with an emphasis on analyses of the patterns of cancer risk over time after starting treatment.

Methods

Data source

This study used data from the General Practice Research Database (GPRD) in the UK. The GPRD comprises the computerised medical records maintained by GPs. GPs play a key role in the UK healthcare system, as they are responsible for primary healthcare and specialist referrals. Patients are affiliated with a practice, which centralises the medical information from the GPs, specialist referrals and hospitalisations. In the UK, the GP typically manages the prescribing for chronic diseases such as diabetes. The data recorded in the GPRD since 1987 include demographic information, prescription details, clinical events, preventive care provided, specialist referrals, hospital admissions and their major outcomes [8]. A recent review of all validation studies found that medical data in the GPRD were generally of high quality [9]. Patients in the GPRD have now been linked individually and anonymously to the national registry of hospital admission (Hospital Episode Statistics [HES]), death certificates and cancer registries in England. The linkages are performed using unique NHS numbers, dates of birth, sex and postcodes of residence of patients. The HES collects the dates of hospital admission and discharge and main diagnoses, as extracted from the medical records by coding staff in England. Cancer registries attempt to record the occurrence of all cases of cancer in a geographically defined population. The principal data sources in cancer registries are pathology reports, clinical records, hospital discharge summaries and death certificates. GP records are not used routinely in the primary case ascertainment of the cancer registries. Death certificates list the date and causes of death. At the time of the study, HES and cancer registry data from England were available continuously from April 1997 to 31 December 2006. 
Linked data were available for 40% of GPRD patients as, at the time of the study, this only included practices in England that were willing to provide unique patient identifiers to the trusted third party. Different coding dictionaries were used for the various datasets (Read for GPRD [www.connectingforhealth.nhs.uk/systemsandservices/data/uktc/readcodes]; ICD-10 [www.who.int/classifications/icd/en/] for HES and cancer registries). Furthermore, the methods for data collection varied between the datasets. Diagnoses in the HES were related to hospital admissions only and not to outpatient activity. As an example, non-melanoma skin cancer (NMSC) may rarely be recorded in the HES or cancer registries but more frequently in the GPRD. The protocol of this study was approved by the GPRD Independent Scientific Advisory Committee.

Study populations

The exposed study cohort consisted of adults aged 40 years and older with a prescription for insulin or oral glucose-lowering drugs at least 1 year after the start of data collection. Patients with a record of type 1 diabetes or a history of cancer were excluded. Within this overall exposed cohort, we identified (unmatched) inception cohorts for each class of glucose-lowering agents; a patient was included in an inception cohort if they received a first ever prescription for a class of glucose-lowering agents at least 1 year after the start of GPRD data collection. The inception cohort approach only includes new users of a medication, following patients after the start of medication. This approach allows for an evaluation of medication risks that change over time [10]. It is likely that any effects of medication on the risk of diagnosed cancer will vary over time, with small changes initially after first taking the medication followed by larger relative rates with continued use. It was hypothesised that any acceleration in the growth rate of cancer cells due to medication would be likely to present with an increasing relative rate over time (because of delays in developing symptoms and/or diagnosis). The medications of interest in this study were thiazolidinediones, insulins, metformin and sulfonylureas. Patients prescribed multi-constituent preparations were included in multiple classes of glucose-lowering agents.

Each patient in the insulin/oral glucose-lowering drug cohort was matched by age (stepwise within 5 years), sex and practice to one control patient (without history of diabetes mellitus or use of glucose-lowering drug or insulin); the index date of the control patient was that of the matched patient (taken as the date of the first ever prescription). Furthermore, the inception cohorts for thiazolidinediones, insulins and sulfonylureas were matched by age, sex and calendar year (both stepwise within 5 years) to an inception cohort of metformin users (metformin may reduce the risks of cancer).
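The matching of exposed patients to controls can be illustrated with a minimal sketch. It assumes that "stepwise within 5 years" means trying an exact age match first and then widening the age window one year at a time, and that each control is used only once; neither detail is stated in the text. The `Patient` record and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    age: int
    sex: str
    practice: str


def match_controls(exposed, candidates, max_age_gap=5):
    """Greedy 1:1 matching on sex and practice, widening the age window stepwise."""
    available = {c.patient_id: c for c in candidates}
    matches = {}
    for case in exposed:
        chosen = None
        # Assumption: try an exact age match first, then widen by one year at a time.
        for gap in range(max_age_gap + 1):
            pool = [
                c for c in available.values()
                if c.sex == case.sex
                and c.practice == case.practice
                and abs(c.age - case.age) <= gap
            ]
            if pool:
                # Deterministic tie-break: closest age, then lowest identifier.
                chosen = min(pool, key=lambda c: (abs(c.age - case.age), c.patient_id))
                break
        if chosen is not None:
            matches[case.patient_id] = chosen.patient_id
            del available[chosen.patient_id]  # assumption: each control is used once
    return matches
```

Exposed patients for whom no control is found within the widest age window are simply left out of the returned mapping in this sketch.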

Patients were followed from the index date up to the occurrence of the cancer of interest or the end of data collection (i.e. last GPRD data collection, transfer out of the practice or date of death, whichever date came first). Patients could belong to multiple inception cohorts if they initiated different classes of glucose-lowering agents over time. In comparisons between two classes of medication, patients were censored at the start of the comparator medication (ensuring that there was no overlapping follow-up time).
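The follow-up rule can be sketched as a small function that returns the end of follow-up and whether it ended with the cancer of interest. This is an illustrative sketch only: the function name and argument layout are hypothetical, and counting a cancer recorded on the censoring date itself as an event is an assumption.

```python
from datetime import date


def follow_up_end(index_date, cancer_date, censor_dates):
    """Return (end_date, had_event) for one patient.

    censor_dates may hold the last data collection, transfer out of the
    practice, date of death and, for between-class comparisons, the start
    of the comparator medication. None entries are ignored.
    """
    known = [d for d in censor_dates if d is not None]
    if not known:
        raise ValueError("at least one censoring date is required")
    censor = min(known)  # whichever date came first
    # Assumption: a cancer recorded on the censoring date still counts as an event.
    if cancer_date is not None and index_date <= cancer_date <= censor:
        return cancer_date, True
    return censor, False
```

For example, a patient with a cancer recorded after the start of the comparator medication would be censored at the comparator start and contribute no event to that comparison.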

Outcomes of interest

The incident outcomes were cancer overall and individual cancer types (i.e. first ever mention of cancer irrespective of stage of cancer). Three sources were used for the cancer outcomes: (1) GPRD; (2) HES; and (3) cancer registries.

Given the different coding dictionaries used by these three datasets and different methods for data collection, analyses were conducted separately for each source of cancer outcomes. The analyses requiring HES or cancer registry data were restricted to patients from practices participating in the linkage and to those with data during the HES/cancer registry data collection period. Analyses that used GPRD for cancer outcomes used the complete study population. None of these data sources was regarded as a ‘gold standard’ free of imperfections. However, we considered an outcome to be validated when its relative rate was consistent across the different data sources.

Statistical analyses

Three types of analysis were conducted, as follows.

Association between diabetes mellitus and cancer risk

The first analysis concerned the association between diabetes mellitus and cancer by comparing the incidence in the overall diabetes cohort to the control cohort. Poisson regression was used to estimate relative rates. These models also included age, sex, calendar year and the following risk factors: small-area socioeconomic status (for linked practices), smoking status, use of alcohol, BMI, any previous medical history of coronary heart disease, coronary revascularisation, hyperlipidaemia, hypertension, peripheral vascular disease, renal impairment, stable angina, and any prescription in the previous 6 months for angiotensin II receptor blockers, antiplatelets, beta blockers, calcium channel blockers, diuretics, nitrates, non-steroidal anti-inflammatory drugs, aspirin or statins. Small-area socioeconomic status, smoking status, use of alcohol and BMI were handled as categorical variables, with a separate category for missing data.
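As a simplified illustration of what a relative rate is, the sketch below computes a crude rate ratio with a Wald 95% CI on the log scale. This corresponds to a Poisson model containing only the exposure indicator and a person-time offset; the adjusted models described above would need a regression package and are not reproduced here. The function name and the counts in the usage line are hypothetical, not study data.

```python
import math


def rate_ratio(events_exposed, pt_exposed, events_reference, pt_reference, z=1.96):
    """Crude rate ratio with a Wald confidence interval on the log scale."""
    rr = (events_exposed / pt_exposed) / (events_reference / pt_reference)
    # Standard error of log(rate ratio) for two independent Poisson counts.
    se = math.sqrt(1 / events_exposed + 1 / events_reference)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 150 cancers in 10,000 person-years vs 100 in 10,000.
rr, lower, upper = rate_ratio(150, 10_000, 100, 10_000)
```

With these made-up counts the rate ratio is 1.5; the interval widens as either event count gets smaller.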

Bias and confounding in the comparisons between different glucose-lowering agents

The second analysis evaluated whether relative rates of cancer varied between different glucose-lowering agents during the first 3 months after starting treatment. We hypothesised that the onset of effect on the risk of clinically diagnosed cancer would not be rapid (within days) but rather take several months from the start of treatment to clinical diagnosis. If this hypothesis of delayed onset is indeed true, the rates of cancer shortly after starting treatment could provide information about the comparability of the different classes of glucose-lowering agents. An increased relative rate of cancer during the first months of treatment would suggest that unobserved patient characteristics and cancer risk factors vary between the different glucose-lowering agents and that comparisons between these agents may be confounded. Poisson regression was used to estimate relative rates, comparing cohorts of users of thiazolidinediones, insulin and sulfonylureas to the unmatched cohort of metformin users. The models estimating adjusted relative rates included the variables listed in the previous section as well as prescriptions in the previous year of other types of glucose-lowering agent (this was measured in a time-dependent manner at 3 month intervals). This analysis was repeated looking at the first 6 months of treatment.

Patterns of risk

The patterns of cancer incidence over time within each inception cohort were also evaluated, in three ways. First, Poisson regression analysis was used to compare the incidence in the first 6 months after starting the medication to that in the periods 6–24, 25–60 and >60 months. The end of follow-up was the end of data collection and the statistical adjustment was based on the variables listed above, including the prescribing of other glucose-lowering agents. The second pattern analysis further evaluated any changes in risk by estimating the cancer rates within small periods of time (rather than categorising follow-up into broad intervals, as was done in the previous analysis). The follow-up period (from start of medication until end of data collection) was divided into 100 periods. The incidence rates (hazard rates) at each point in time were then estimated, followed by a smoothing over time of these rates [11]. This analysis compared the cohorts of users of thiazolidinediones, insulin and sulfonylureas to the matched cohort of metformin users. It was expected that the relative rates would increase over time if the medication had an adverse effect on cancer risk. The third pattern analysis evaluated whether adjusted relative rates varied over time using the test for proportionality in Cox proportional hazards regression. This test evaluates whether a relative rate changes over time. The follow-up period was also divided by quintiles of time since first taking the medication and adjusted relative rates estimated within each quintile.
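The first pattern analysis requires each patient's follow-up to be allocated to bands of time since starting treatment. A minimal sketch of that bookkeeping follows, assuming cut points at 6, 24 and 60 months and follow-up measured in months; the function name and row layout are hypothetical.

```python
def split_person_time(follow_up_months, had_event, cut_points=(6, 24, 60)):
    """Allocate one patient's follow-up to the bands 0-6, 6-24, 24-60 and >60 months.

    Returns one (start, stop, months_at_risk, events) row per band entered.
    """
    bounds = [0, *cut_points, float("inf")]
    rows = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if follow_up_months <= start:
            break  # follow-up ended before this band began
        months_at_risk = min(follow_up_months, stop) - start
        # The event is counted in the band in which follow-up ends.
        event_here = had_event and follow_up_months <= stop
        rows.append((start, stop, months_at_risk, int(event_here)))
    return rows
```

Summing events and months at risk per band across patients gives the band-specific incidence rates that are then compared with the first 6 months.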

Results

Demographics

The overall study population included 206,940 patients prescribed insulin or oral glucose-lowering drugs and the same number of controls without diabetes. Table 1 shows the baseline characteristics of the four inception cohorts for metformin, sulfonylureas, thiazolidinediones and insulins, as identified within the overall exposed cohort. As expected, there were observed differences in risk factors between metformin, sulfonylureas and insulin. Metformin was more often the first diabetes treatment, while insulin users more frequently had a history of use of other diabetes treatments.

Table 1 Baseline characteristics at inception date of metformin, sulfonylureas, thiazolidinediones and insulin

Association between diabetes mellitus and cancer risk

As shown in Table 2, patients with diabetes did not have an increased risk of cancer overall (any type excluding NMSC) compared with patients without diabetes. The adjusted relative rates were 0.99 (95% CI 0.97, 1.02) for cancer recorded in the GPRD and 0.96 (95% CI 0.91, 1.01) for cancer recorded in the registries. Analysing the individual cancer types, decreased rates were observed for NMSC, prostate and lung cancer and increased rates for pancreatic, liver, uterus and cervical cancer. Results were broadly similar for outcomes recorded in the GPRD, cancer registries or the HES. In the subset of diabetes patients with a baseline HbA1c measurement, there was no association between overall cancer risk and quartile of HbA1c value. The adjusted relative rate was 1.08 (95% CI 0.99, 1.18) with HbA1c of 7.3–8.1% (56–65 mmol/mol), 0.99 (95% CI 0.90, 1.08) with HbA1c of 8.1–9.6% (65–81 mmol/mol) and 1.02 (95% CI 0.93, 1.11) with HbA1c of >9.6% (81 mmol/mol), compared with those with HbA1c of <7.3% (56 mmol/mol).

Table 2 Relative rates of different types of cancer (as recorded in the GPRD, HES or cancer registry) during the total follow-up period in patients with and without diabetes

Bias and confounding in the comparisons between different diabetic medications

Table 3 shows the results of the bias analyses. The rates of cancer were higher during the first 3 months of insulin treatment than during the first 3 months of metformin treatment (in metformin users without a history of insulin use). The crude relative rate was 1.87 (95% CI 1.58, 2.22). Statistical adjustment for risk factors (as measured in the GPRD) did not change the relative rate substantially (adjusted relative rate of 1.93 [95% CI 1.56, 2.39]). For the other glucose-lowering agents, there were similar results, indicating lack of comparability with metformin and residual confounding. The results were similar when evaluating the rates of cancer during the first 6 months of treatment.

Table 3 Rates of cancer (any type excluding NMSC as recorded in the GPRD) during the first 3 or 6 months of treatment only, comparing thiazolidinedione, insulin or sulfonylurea to metformin (bias analysis)

Patterns of risk

As shown in Table 4, insulin users showed a pattern of decreasing cancer incidence over time: adjusted relative rate of 0.58 (95% CI 0.50, 0.68) during months 6–24, relative rate of 0.50 (95% CI 0.42, 0.59) during months 25–60 and relative rate of 0.48 (95% CI 0.40, 0.59) during months 60+ compared with the first 6 months after starting treatment. Similar patterns were found with sulfonylureas and metformin. Given the relatively high incidence shortly after treatment, the rates of pancreatic cancer decreased substantially over time. Figure 1 shows the crude ratios of smoothed hazard rates in thiazolidinediones, insulins and sulfonylureas compared with matched metformin users. There was no pattern of increasing relative rates over time with insulin and sulfonylureas compared with metformin. When dividing the follow-up period by quintiles of time since starting medication, adjusted relative rates also did not increase over time (Table 5). There were no statistically significant interactions between time since first taking the medication and adjusted relative rate for sulfonylureas and insulin compared with metformin.

Table 4 Relative rates of different types of cancer (as recorded in the GPRD) over time within each inception cohort of thiazolidinedione, insulin, sulfonylurea or metformin treatment
Fig. 1 (a) Crude hazard relative rates and 95% CI for cancer (excluding NMSC) over time in thiazolidinedione users compared with matched metformin users. (b) Crude hazard relative rates and 95% CI for cancer (excluding NMSC) over time in sulfonylurea users compared with matched metformin users. (c) Crude hazard relative rates and 95% CI for cancer (excluding NMSC) over time in insulin users compared with matched metformin users

Table 5 Adjusted relative rates of any type of cancer excluding NMSC (as recorded in the GPRD) within different periods of time after starting medication comparing inception cohorts of thiazolidinedione, insulin and sulfonylurea to the matched inception cohort of metformin

None of the different insulin types showed a pattern of increasing incidence rates over time (Table 6). Patients starting with insulin glargine had statistically comparable cancer risk over time (adjusted relative rate of 0.70 [95% CI 0.52, 0.95] during months 6–24, relative rate of 0.77 [95% CI 0.56, 1.07] during months 25–60 and relative rate of 0.60 [95% CI 0.36, 1.02] during months 60+ compared with the first 6 months with insulin glargine).

Table 6 Relative rates of cancer (as recorded in the GPRD) over time within each inception cohort of different types of insulin

In a sensitivity analysis, patients were censored 3 months after the last prescription for the class of glucose-lowering agents of interest (i.e. at discontinuation of treatment). Compared with the first 6 months, the adjusted relative rates for metformin were 0.76 (95% CI 0.70, 0.83) during months 6–24, 0.78 (95% CI 0.72, 0.85) during months 25–60 and 0.82 (95% CI 0.75, 0.90) during months 60+. For thiazolidinediones, these were 0.98 (95% CI 0.82, 1.18), 1.04 (95% CI 0.86, 1.25) and 1.11 (95% CI 0.86, 1.44); for insulin, 0.59 (95% CI 0.51, 0.69), 0.52 (95% CI 0.44, 0.60) and 0.51 (95% CI 0.43, 0.61); for sulfonylureas, 0.60 (95% CI 0.54, 0.65), 0.62 (95% CI 0.56, 0.68) and 0.62 (95% CI 0.56, 0.69), respectively. Our definition of incident use (i.e. first ever prescription in the GPRD at least 12 months after the start of data collection) was also evaluated by analysing the percentage of patients with a gap of >12 months between subsequent prescriptions in the GPRD. Only a small percentage of patients had this gap in their prescription histories (metformin, 7.4%; insulins, 3.3%; sulfonylureas, 8.2%; thiazolidinediones, 5.5%).
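The check on prescription gaps can be sketched as follows. This is an illustration under stated assumptions: a gap of more than 12 months is approximated as more than 365 days between consecutive prescription dates, and the function names and input layout (a mapping from patient identifier to a list of dates) are hypothetical.

```python
from datetime import date


def has_long_gap(prescription_dates, max_gap_days=365):
    """True if any two consecutive prescriptions are more than max_gap_days apart."""
    ordered = sorted(prescription_dates)
    return any(
        (later - earlier).days > max_gap_days
        for earlier, later in zip(ordered, ordered[1:])
    )


def percent_with_long_gap(histories, max_gap_days=365):
    """Percentage of patients with at least one long gap in their history."""
    if not histories:
        return 0.0
    flagged = sum(has_long_gap(dates, max_gap_days) for dates in histories.values())
    return 100.0 * flagged / len(histories)
```

Patients with a single prescription have no consecutive pair and are therefore counted as having no gap in this sketch.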

Discussion

The results of this study are complex. The bias analyses found that there were substantive differences in cancer risk between the various classes of glucose-lowering agents during the first months of treatment (lowest risks for thiazolidinediones and highest for insulins). Statistical adjustment did not resolve these differences. This indicates that users of these agents may not have been comparable with respect to their underlying cancer risk. Higher risks of cancer shortly after starting insulin or sulfonylurea treatment may be explained by protopathic bias (i.e. early cancer leading to unstable diabetes and hyperglycaemia, with patients switching diabetes treatment). The bias analysis was followed by an analysis evaluating the changes in risk of cancer over time. No evidence was found of increasing rates of cancer within the cohorts of insulin and sulfonylurea users, which would be indicative of a side-effect with a delayed onset of effect. The next analysis compared the cancer risks over time with metformin users. If metformin decreases and/or insulin increases the risk of cancer (both have been proposed), it is expected that the rates of cancer would diverge between these groups, with increasing relative rates over time. Although we analysed this in several ways, no evidence was found to support any increase in rates of cancer over time with insulin compared with metformin. The present study used an inception cohort approach, which is particularly suited to evaluating medication with effects that vary over time. A study that includes both incident and prevalent users would be less able to detect time-dependent medication effects.

The present study and a recent large UK study linking hospital admission records and death certificates [12] found that diabetes was not associated with an increased risk of cancer overall, as both reported relative rates close to unity. With respect to the association between type of cancer and diabetes, pancreatic cancer is one of the types most strongly associated with diabetes, as found in the present study and reported in the literature [12, 13]. The most likely explanation for this increased risk of pancreatic cancer is protopathic bias; an alternative explanation is detection bias. The presence of bias is supported by the pattern of cancer incidence over time (i.e. high incidence shortly after starting treatment substantially dropping over time). It is unlikely that glucose-lowering agents can cause pancreatic cancer within days of starting medication. Decreased rates of prostate cancer and NMSC have been reported in the literature [12, 13] and were also found in the present study, but the evidence is inconsistent for colorectal and breast cancer. Our finding of an absence of association between baseline HbA1c and cancer risk is consistent with the results of a recent meta-analysis of major randomised controlled trials that reported that cancer risk was not associated with the level of glycaemic control [14]. A large randomised controlled trial found no difference in cancer risk between patients treated with intensive and standard glucose control [15]. Thus, diabetes mellitus is associated with an increased risk of some site-specific cancers and a reduced risk of others, but there is no strong evidence indicating an increased risk of cancer overall.

A large number of studies have evaluated the cancer effects of different classes of glucose-lowering agents, and most suggested that there are effects on cancer [16–31]. Almost all studies compared different glucose-lowering agents, or compared patients prescribed glucose-lowering agents with patients without diabetes, relying on regression analysis to deal with confounding. The present study found that there was a substantial residual bias; although differences were identified in cancer incidence between treatments during the first 3–6 months, this is unlikely to be a causal effect. Statistical adjustment alone did not minimise this confounding and bias; the effect estimates in the first 3–6 months did not change substantially with adjustment. Another important consideration in the assessment of a causal relationship is the biological plausibility of an association. For each of the glucose-lowering agents, biological mechanisms have been postulated for adverse or beneficial cancer effects of these medications. However, biological plausibility may not be sufficient to accept a causal effect, as exemplified by the lack of effects of hormone therapy on cardiovascular disease or statins on fracture, despite apparent biological plausibility [7, 32]. Metformin has been reported to decrease cancer risk while insulin may increase it. The hypothesis is that insulinaemia brought about by exposure to insulin or sulfonylureas will accelerate the growth rate and presentation of cancer, whereas exposure to metformin will prevent growth [3]. However, the present study did not find diverging cancer rates over time, as would be expected with these opposite effects. If this hypothesis of growth acceleration is correct, one would not expect relative rates of 1.8 within weeks after starting treatment. We are unaware of any evidence supporting such a large immediate increase. A cell growth hypothesis would be consistent with slowly increasing relative rates over time (i.e. a quasi-exponential increase). The most likely explanation for the findings in this study is bias due to lack of comparability and protopathic bias. As a consequence of treatment guidelines, glucose-lowering agents may be used differently, with the lowest underlying cancer risks for thiazolidinediones and the highest for insulins. Patients with unstable diabetes are likely to switch class of treatment, with insulin used in patients with diabetes not controlled by oral glucose-lowering drugs. The original Scottish study also reached the conclusion that the excess cancer cases found among insulin glargine users were more likely to reflect bias rather than a causal effect [4].

The findings that insulin glargine could be associated with an increased risk of cancer received considerable attention. The journal Diabetologia diligently commissioned a further three studies in Scotland, Sweden and the UK after receiving the results of the first German study and before publishing this signal of drug toxicity [1–4]. However, the editorial commented that, although the studies required further analysis and evaluation, the implications of these four studies ‘are likely to be very far-reaching’ [5]. With the benefit of hindsight and with the knowledge as presented in the present study, we do not believe that these studies warranted this level of publicity or concern. Although evaluating the same hypotheses, the study designs and analyses were very different between the four studies. As outlined by Pocock and Smeeth, the studies suffered from various methodological weaknesses [6]. Several of these studies excluded patients based on events happening after the index date, which is incorrect; censoring should be used instead. The UK study, which found that patients on insulin were more likely to develop cancer than those on metformin, also used a GP database (the commercial database THIN) [3]. We were unable to replicate the findings of this study. Replication of study findings is pivotal given the complexity of the databases and analyses. We propose that major safety signals of drug toxicity should first be replicated with a quantitative analysis of bias before publication in a major journal.

There are various strengths and limitations to this study. The present study included a large number of patients, and cancer outcomes were obtained through three independently collected databases, including prospectively collected cancer registry data. However, the information on confounders and underlying disease severity was limited in this study. Furthermore, our analyses provide only simplistic representations of the actual exposures to glucose-lowering agents. Drug exposure in actual clinical practice often varies greatly: many different drug combinations are used, patients switch between drugs over time, and patients do not always comply with treatment instructions. We did not evaluate this complexity in exposure and also relied on information on prescriptions rather than actual use. The present study used an inception cohort approach, following patients from the start of treatment in the GPRD. However, we did not have life-time exposure histories and some misclassification of time of exposure may thus have occurred. An analysis of gaps between prescriptions found that only a small proportion of patients had substantive gaps in their prescription histories, suggesting that this misclassification was not major. Furthermore, we are unable to provide solid causal explanations (rather than mere conjecture) for all findings in this study.

In conclusion, the present study does not support any beneficial or adverse effects of glucose-lowering agents on cancer risk. Our findings suggest that changes in diabetes treatment occur in the few months prior to the diagnosis of cancer, leading to increased cancer rates shortly after starting treatment (indicating protopathic bias). There were no longer-term differences in cancer risk between metformin and insulin or sulfonylureas. This finding of proportional rates over time does not support a beneficial effect of metformin or an adverse effect of insulin or sulfonylureas. Given the complexity of medication use in actual clinical practice, we strongly advocate that epidemiological studies are replicated independently and tested for methodological weaknesses before publication.