Cohort Profile: the Predictors of Breast Cancer Recurrence (ProBe CaRE) Premenopausal Breast Cancer Cohort Study in Denmark
  1. Lindsay J Collin1,
  2. Deirdre P Cronin-Fenton2,
  3. Thomas P Ahern3,
  4. Peer M Christiansen4,5,6,
  5. Per Damkier7,8,
  6. Bent Ejlertsen6,9,
  7. Stephen Hamilton-Dutoit10,
  8. Anders Kjærsgaard2,
  9. Rebecca A Silliman2,11,
  10. Henrik Toft Sørensen2,12,
  11. Timothy L Lash1,2
  1. 1 Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
  2. 2 Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, Denmark
  3. 3 Department of Surgery, The Robert Larner, M.D. College of Medicine at The University of Vermont, Burlington, Vermont, USA
  4. 4 Breast Unit, Surgical Department, Randers Regional Hospital, Randers, Denmark
  5. 5 Department of Oncology, Odense University Hospital, Odense, Denmark
  6. 6 Danish Breast Cancer Group, Copenhagen, Denmark
  7. 7 Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
  8. 8 Department of Clinical Research, University of Southern Denmark, Odense, Denmark
  9. 9 Rigshospitalet, Copenhagen, Denmark
  10. 10 Institute of Pathology, Aarhus University Hospital, Aarhus, Denmark
  11. 11 Boston University School of Medicine, Boston University, Boston, Massachusetts, USA
  12. 12 Department of Health Research & Policy (Epidemiology), Stanford University, Stanford, California, USA
  1. Correspondence to Lindsay J Collin; lindsay.jane.collin{at}


Purpose The Predictors of Breast Cancer Recurrence (ProBe CaRe) study was established to evaluate modification of tamoxifen (TAM) effectiveness in premenopausal women through reduced activity of TAM-metabolising enzymes. It comprehensively evaluates the effects of pharmacogenetic variants, use of concomitant medications and biomarkers involved in oestrogen metabolism on breast cancer recurrence risk.

Participants The ProBe CaRe study was established using resources from the Danish Breast Cancer Group (DBCG), including 5959 premenopausal women diagnosed with stage I–III primary breast cancer between 2002 and 2010 in Denmark. Eligible participants were divided into two groups based on oestrogen receptor alpha (ERα) expression and receipt of TAM therapy: 4600 are classified as ERα+/TAM+ and 1359 are classified as ERα−/TAM−. The ProBe CaRe study is a population-based cohort study nested in a nearly complete source population. Clinical, tumour and demographic data were abstracted from DBCG registry data. Linkage to Danish registries allows for abstraction of information regarding comorbid conditions, comedication use and mortality. Formalin-fixed paraffin-embedded tissue samples have been prepared for DNA extraction and immunohistochemical assay.

Findings to date To mitigate incorrect classification of patients into specific categories, we conducted a validation substudy. We compared data acquired from registry and from medical record review to calculate positive predictive values (PPVs) and negative predictive values. We observed PPVs near 100% for tumour size, lymph node involvement, receptor status, surgery type, receipt of radiotherapy, receipt of chemotherapy and TAM treatment. We found that the PPVs were 96% (95% CI 83% to 100%) for change in endocrine therapy and 61% (95% CI 42% to 77%) for menopausal transition.

Future plans The ProBe CaRe cohort study is well positioned to comprehensively examine pharmacogenetic variants. We will use a Bayesian pathway analysis to evaluate the complete TAM metabolic path to allow for gene–gene interactions, incorporating information on other important patient characteristics.

  • cohort study
  • breast tumours
  • pharmacogenetics
  • epidemiology

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • One potential limitation of the Predictors of Breast Cancer Recurrence study is the homogeneity of the study sample, as almost all are of European descent.

  • In addition to being the first large epidemiological study to examine reduced activity of tamoxifen metabolism in premenopausal women, this study is strengthened by completeness of high-quality data.

  • Our study includes a validation substudy to mitigate errors from the incorrect classification of patients into specific categories of key analytical variables.


Endocrine therapy improves survival in patients with breast cancer regardless of axillary lymph node status.1 The Predictors of Breast Cancer Recurrence (ProBe CaRE) cohort study was established to evaluate modification of tamoxifen (TAM) effectiveness in premenopausal women through reduced activity of TAM-metabolising enzymes. Candidates for adjuvant TAM therapy include patients with stage I–III breast cancer with oestrogen receptor (ER) positive tumours, who constitute about two-thirds2 of the approximately 1.7 million newly diagnosed patients with breast cancer each year worldwide.3 Current guidelines recommend that premenopausal patients with ER alpha positive (ERα+) cancers receive TAM for 5–10 years,4–6 which reduces recurrence risk by nearly half,6 and that TAM may be offered to postmenopausal women with ERα+ cancers as an alternative to aromatase inhibitors. TAM metabolism is complex, but is principally catalysed by cytochrome P450 enzymes. Some metabolites bind with the ER with significantly greater affinity than TAM itself, especially endoxifen, which has the highest ER-binding activity among TAM metabolites. Activity of the enzymes involved in TAM metabolism can vary between individuals due to inherited gene variants7–11 or use of comedications.7 8 Although many studies have explored the association between these gene variants or use of comedications and failure of TAM treatment,12 13 which manifests clinically as a recurrence, the interpretation of these studies remains controversial. Current clinical guidelines do not recommend genotyping these variant alleles to support treatment decisions,1 5 14 but do recommend avoiding inhibiting comedications.15

To date, little available evidence on this topic is specific to premenopausal patients with breast cancer. The competition between oestrogen and TAM for ER binding is highly important for these patients because TAM is a first-line guideline-recommended therapy1 5 for premenopausal patients and because premenopausal women have higher concentrations of oestrogens to compete with TAM for ER binding. Oestradiol (E2), the most active oestrogen metabolite, binds with the ER with approximately the same affinity as endoxifen.16 Premenopausal women have 10-fold higher concentrations of E2 than postmenopausal women17 and E2 concentrations tend to increase during TAM therapy.17 18 This suggests that inhibition of TAM-metabolising enzymes is more likely to decrease effectiveness in premenopausal women, yet they have been seldom studied in this topic area.

Research questions

We established a premenopausal cohort of patients with breast cancer to fill this important evidence gap, with the following primary study aims:

Assess pharmacogenetics of TAM metabolism and risk of breast cancer recurrence

We will assess the pharmacogenetics of TAM metabolism by genotyping 32 variants in 15 enzymes (table 1) thought to affect the concentration of the most active TAM metabolites, and will evaluate the association between these variants and breast cancer recurrence in TAM-treated premenopausal patients with breast cancer. Each of the selected enzymes is involved in at least one step in the TAM metabolic pathway (figure 1). Interactions with comedications that inhibit these metabolic enzymes also will be evaluated.

Table 1

Selected functional variants and inhibitor comedications in genes whose enzymes metabolise tamoxifen

Figure 1

Metabolic pathway of tamoxifen and related metabolites including enzymes that have been genotyped. ER, oestrogen receptor.

Assess the interaction between the pharmacogenetics of TAM metabolism and ER beta (ERβ) expression

We will assess the effect of interaction between the pharmacogenetics of TAM metabolism and ERβ expression on risk of breast cancer recurrence. Previous studies have shown that coexpression of ERβ is associated with improved survival among patients with ERα+ tumours who are treated with TAM.19 20 The ERβ receptor opposes ERα-mediated proliferation.19 Tumours that express both ERα and ERβ are less aggressive than tumours that express homodimers of ERα,19 21 due to the attenuated stimulation response from ERα/ERβ heterodimers. This suggests that metabolic inhibition may only affect ERβ− tumours. In vitro analyses have demonstrated that in ERα+/ERβ+ MCF7 cells, proliferation is inhibited by a wide range of endoxifen concentrations.22 In ERα+/ERβ− MCF7 cells, by contrast, only physiologically high endoxifen concentrations inhibit proliferation,22 suggesting that metabolic inhibition may affect risk of recurrence only when ERβ is absent.

Assess interaction between inhibition of TAM metabolism and oestrogen-regulating enzymes

Finally, we will assess the association between tumour expression of 17β-hydroxysteroid dehydrogenase 1 and 2 (17βHSD1 and 17βHSD2) and breast cancer recurrence. 17βHSD1 catalyses the conversion of oestrone to the most potent form of oestrogen, E2, and 17βHSD2 catalyses the reverse reaction.23 E2 has the highest binding affinity for ER, and endoxifen acts through competitive inhibition at the receptor-binding site.23 In breast tumour tissue, 17βHSD1 is more highly expressed than 17βHSD2. The opposite is usually observed in adjacent normal tissue.24 Tumours with higher capacity to produce E2 endogenously through increased expression of the 17βHSD1 enzyme are more likely to overwhelm the TAM metabolites in competition for ER binding, affecting TAM effectiveness. These enzymes are ideal therapeutic targets to modulate E2 concentrations in tumour cells, and candidate inhibitors have been developed.23 25 We will evaluate whether disequilibrium of the 17βHSD1 and 17βHSD2 enzymes (ratio >1) results in compromised TAM effectiveness.

Cohort description

The ProBe CaRe cohort was established using the resources of the Danish Breast Cancer Group (DBCG) registry. The DBCG registry was established in 1976 and began to register patients in 1977, with the goals of standardising treatment, facilitating clinical trials and monitoring outcomes among Danish patients with breast cancer.26 Since its inception, the DBCG has registered over 90% of women diagnosed with breast cancer in Denmark. Patients with breast cancer are registered in the DBCG via standardised forms. The registry has a standard protocol to collect information on tumour, treatment and patient characteristics. Using this information-rich resource, the ProBe CaRe cohort is nested in a nearly complete source population of premenopausal women diagnosed with stage I–III first primary breast cancer between 2002 and 2010 whose breast cancer was reported to the DBCG. In Denmark, all citizens and legal residents are assigned a Civil Personal Register (CPR) number, a unique 10-digit personal identifier assigned at birth or on immigration that is used for identification across all national registries (online supplementary figure 1).27

Supplementary file 1

Of the 8047 premenopausal women diagnosed with breast cancer between 2002 and 2010 and recorded in the DBCG registry, 5959 cancers were identified as eligible based on being a stage I–III first primary breast cancer and untreated with neoadjuvant therapy; all others (n=2088) were excluded. The 5959 eligible patients then were divided into two cohorts based on ERα expression and receipt of TAM therapy (figure 2). To address competing explanations (eg, if the biomarkers under study affect risk directly rather than mediating the TAM effect), we will also evaluate the risk of recurrence in the subset of women with ERα− tumours who did not receive TAM therapy (T−).

Figure 2

Selection of study sample and group based on the inclusion criteria. The source population consisted of 8047 premenopausal women diagnosed with a first primary stage I–III breast cancer and reported to the Danish Breast Cancer Group between 2002 and 2010. After exclusions (n=2088), the study population consists of 5959 patients in the ProBe CaRe study. ER, oestrogen receptor; ERα, oestrogen receptor alpha; ProBe CaRe, Predictors of Breast Cancer Recurrence; TAM, tamoxifen therapy.

Our final ProBe CaRe study population consists of these 5959 patients with breast cancer divided into a cohort of ERα+/T+ (4600 patients) and a cohort of ERα−/T− (1359 patients). The sociodemographic and clinical characteristics of the two cohorts are described in table 2. The distributions of the clinical and demographic characteristics in the two cohorts (ERα+/T+ vs ERα−/T−) are relatively similar. They differ meaningfully only with respect to progesterone receptor (PR) status (58% vs 1.4% PR+, respectively) and human epidermal growth factor receptor 2 (HER2) status (14% vs 26% HER2+, respectively). With respect to outcomes, the ERα−/T− cohort has a higher proportion of subjects who experienced recurrence (8.6% vs 16%, respectively) and who died by the end of follow-up (7.8% vs 18%, respectively). This pattern is to be expected, as ER− breast cancers generally have a worse prognosis than ER+ cancers, especially within the first 5 years following diagnosis.28 29
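The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the `Patient` fields, the cohort labels and the handling of patients whose ERα status and TAM receipt are discordant (returned as `None` here) are assumptions. Only the four counts are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Counts reported in the text.
SOURCE_POPULATION = 8047
EXCLUDED = 2088
ER_POS_TAM_POS = 4600
ER_NEG_TAM_NEG = 1359


@dataclass(frozen=True)
class Patient:
    """Hypothetical record layout, for illustration only."""
    stage: str                 # "I", "II", "III" or "IV"
    first_primary: bool        # first primary breast cancer
    neoadjuvant_therapy: bool  # treated with neoadjuvant therapy
    er_alpha_positive: bool    # ERalpha expression
    received_tam: bool         # receipt of tamoxifen therapy


def is_eligible(p: Patient) -> bool:
    # Eligibility as stated in the text: stage I-III first primary
    # breast cancer, untreated with neoadjuvant therapy.
    return (p.stage in {"I", "II", "III"}
            and p.first_primary
            and not p.neoadjuvant_therapy)


def assign_cohort(p: Patient) -> Optional[str]:
    # Assumption: discordant patients (for example ER+ without TAM)
    # belong to neither cohort.
    if not is_eligible(p):
        return None
    if p.er_alpha_positive and p.received_tam:
        return "ER+/T+"
    if not p.er_alpha_positive and not p.received_tam:
        return "ER-/T-"
    return None


# The reported counts are internally consistent.
eligible = SOURCE_POPULATION - EXCLUDED
assert eligible == 5959
assert ER_POS_TAM_POS + ER_NEG_TAM_NEG == eligible
```

For example, `assign_cohort(Patient("II", True, False, True, True))` returns `"ER+/T+"`, while a stage IV patient returns `None`.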

Table 2

Distribution of clinical and tumour characteristics by ER status and receipt of tamoxifen among the 5959 participants in a population-based cohort of premenopausal women diagnosed with first primary breast cancer, ProBe CaRe study

Cohort follow-up

Women diagnosed with breast cancer and subsequently enrolled in the DBCG registry undergo semiannual examinations during the first 5 years after diagnosis and annual examinations during years 6–10.30 Women undergoing treatment for breast cancer receive endocrine therapy through the Danish government and obtain their medicine at the hospital; this information will be used to estimate TAM adherence. Members of both the ERα+/T+ and the ERα−/T− ProBe CaRe cohorts have been followed from breast cancer diagnosis to the first of (1) recurrence, (2) death, (3) 10 years of follow-up, (4) loss to follow-up due to emigration, (5) another malignancy or (6) the end of the study follow-up period. Breast cancer recurrence was identified using the DBCG registry. We adopted the DBCG definition of breast cancer recurrence as any type of breast cancer diagnosed subsequent to the initial course of therapy.30 Recurrences are then further categorised as locoregional (in the scar or regional lymph nodes), contralateral (opposite breast), distant (all other sites) or unknown (site of recurrence not documented). The date of recurrence is recorded in the DBCG registry, including recurrences diagnosed between scheduled follow-up exams. Mortality and emigration were identified using the Danish Civil Registration System, which is updated daily.27 Emigration is the only expected source of loss to follow-up and has affected less than 1% of the study population.
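The follow-up rule above (time from diagnosis to the first of six possible end points) can be illustrated with a short Python sketch. The function and parameter names are hypothetical, and the tie-breaking order when two events share a date is an assumption that simply follows the order in which the end points are listed in the text.

```python
from datetime import date
from typing import Optional, Tuple


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up_end(
    diagnosis: date,
    study_end: date,
    recurrence: Optional[date] = None,
    death: Optional[date] = None,
    emigration: Optional[date] = None,
    other_malignancy: Optional[date] = None,
) -> Tuple[date, str]:
    """Return the end of follow-up and the reason follow-up ended."""
    # Listed in the order given in the text; min() keeps the first
    # candidate when two dates tie (an assumption, not a study rule).
    candidates = [
        (recurrence, "recurrence"),
        (death, "death"),
        (add_years(diagnosis, 10), "10 years of follow-up"),
        (emigration, "emigration"),
        (other_malignancy, "another malignancy"),
        (study_end, "end of study follow-up"),
    ]
    observed = [(d, reason) for d, reason in candidates if d is not None]
    return min(observed, key=lambda item: item[0])
```

A patient diagnosed on 1 March 2005 with a recurrence on 15 June 2008 would stop contributing follow-up time at the recurrence date; with no events and a later study end, follow-up would stop on 1 March 2015.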

Data collection

Registry data

Once participants eligible for ProBe CaRe were selected from the DBCG registry, we extracted their clinical and demographic information from the registry. This information included date and place of diagnosis, tumour characteristics, treatment received and patient characteristics, which are presented in online supplementary table 1. We also extracted information on comorbid diseases at time of breast cancer diagnosis, summarised using the Charlson Comorbidity Index.31 32 The registry information allows us to update the participants’ conditions during study follow-up and therefore to account for time-varying exposures and confounding factors. The CPR number for each patient was used to link cohort members to the Danish National Prescription Registry,33 which provided information on filled prescriptions of drugs that inhibit TAM-metabolising enzymes. This allowed us to assess the drug–drug interactions discussed above.
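Linkage on the CPR number amounts to a keyed join between the cohort and the prescription records. The sketch below shows the idea in plain Python. The record layout, the identifiers and the short inhibitor list are illustrative assumptions and do not reproduce the registry's schema or the study's list of inhibiting comedications.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Set, Tuple

# Illustrative only: two antidepressants commonly described as
# CYP2D6 inhibitors. The study's own list is given in its table 1.
INHIBITORS: Set[str] = {"paroxetine", "fluoxetine"}

# (cpr, drug name, fill date): a hypothetical prescription record.
Prescription = Tuple[str, str, date]


def link_inhibitor_fills(
    cohort_cprs: Set[str],
    prescriptions: Iterable[Prescription],
) -> Dict[str, List[Tuple[date, str]]]:
    """Collect inhibitor fills per cohort member, sorted by fill date."""
    fills: Dict[str, List[Tuple[date, str]]] = defaultdict(list)
    for cpr, drug, fill_date in prescriptions:
        # Keep only cohort members and only inhibiting drugs.
        if cpr in cohort_cprs and drug.lower() in INHIBITORS:
            fills[cpr].append((fill_date, drug.lower()))
    return {cpr: sorted(items) for cpr, items in fills.items()}


# Made-up identifiers and records, for demonstration.
cohort = {"0000000001", "0000000002"}
records = [
    ("0000000001", "Paroxetine", date(2006, 5, 1)),
    ("0000000001", "paroxetine", date(2006, 2, 1)),
    ("0000000002", "ibuprofen", date(2007, 1, 1)),
    ("0000000003", "fluoxetine", date(2005, 1, 1)),  # not in cohort
]
linked = link_inhibitor_fills(cohort, records)
```

Keeping the fill dates, rather than a single ever/never flag, is what makes it possible to treat comedication use as a time-varying exposure.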


The CPR number for each patient was used to identify the hospital at which the surgery was performed and to locate and retrieve the corresponding formalin-fixed paraffin-embedded (FFPE) tissue samples. The list of ProBe CaRe cohort members and their CPR numbers and hospitals of diagnosis were provided to a medical research technician at the Institute of Pathology, blinded to whether the CPR numbers corresponded to a patient with a recurrence. The technician reviewed a description of the available tissue blocks (routinely available in the Danish pathology registry34), identified the tumour-rich and non-neoplastic blocks for each patient, and specified which FFPE blocks should be requested from the hospitals. This list of blocks to be requested was then returned to the Department of Clinical Epidemiology, which prepared and mailed the request letters to the pathology archives at the respective Danish hospitals. Staff at the hospital pathology archives returned the blocks to the Department of Clinical Epidemiology, which assigned a project identification number to the block and then provided it to the Institute of Pathology. The project identification number maintained blinding of laboratory personnel to whether blocks corresponded to patients with a recurrence. The Department of Clinical Epidemiology maintains the key linking the project identification number for the blocks to the clinical data, including recurrence status.

Non-neoplastic tissue samples are taken routinely from normal adjacent tissue or cancer-free lymph nodes resected during breast cancer surgery and were used as controls in creation of tissue microarrays (TMAs). Of the 4600 patients included in the ERα+/T+ cohort, 4599 patients had samples evaluated by clinicians, and tumour samples were available for 3959 (86%). Among the ERα−/T− cohort, 1139 (84%) patients had tumour samples available. Distributions of clinical and demographic characteristics among patients with and without available tumour samples are described in table 3. Of patients included in the ERα+/T+ cohort, 2746 (82%) had a non-neoplastic tissue sample available, while 1082 (80%) patients in the ERα−/T− cohort had non-neoplastic tissue samples available. Distributions of demographic and clinical characteristics among patients with and without non-neoplastic tissue samples are summarised in online supplementary table 2.

Table 3

Summary of exposure, covariate and outcome variables collected in the ProBe CaRe study

Sections of collected tissue blocks have been prepared for DNA extraction and immunohistochemical (IHC) assay. In accordance with the study’s primary aims, we will genotype 32 variants across 15 genes known to be involved in TAM metabolism, in order to predict extent of metabolic inhibition. We will also reassay ERα expression to ensure correct classification of the two cohorts, as original ERα expression was measured in different pathology laboratories using different methods. In our previous case–control study of postmenopausal patients diagnosed during 1985–2001, we reported 95% concordance of positive ERα expression between initial assays and reassays and 74% concordance of negative ERα expression between initial assays and reassays.35 ERβ expression will be assayed using IHC in TMAs to assess its possible modification of TAM metabolic inhibition. Expression of the enzymes 17βHSD1 and 17βHSD2 also will be assayed using IHC to address the study aim examining whether the ratio of these two enzymes modulates TAM’s efficacy in preventing breast cancer recurrence. The aforementioned assays of biomarkers are the primary starting point. However, we anticipate that the study will yield a substantial tumour biobank and ultimately provide a valuable resource to researchers for further characterisation of prognostic and predictive biomarkers in premenopausal breast cancer.

Validation substudy

Registry data are not error-free.36 To mitigate incorrect classification of patients into specific categories, we conducted a validation substudy.37 By comparing data procured from the registry with data from medical record review, we calculated positive predictive values (PPVs) and negative predictive values and their corresponding CIs for key analytical variables. We observed near-perfect PPVs for tumour size, lymph node involvement, receptor status, surgery type, receipt of radiotherapy, receipt of chemotherapy and TAM treatment. The PPVs were 96% (95% CI 83% to 100%) for change in endocrine therapy and 61% (95% CI 42% to 77%) for menopausal transition. While the PPV for DBCG-recorded recurrence was 100%, there were more recurrences reported in the medical records than reported in the DBCG database.37 These parameters will allow us to adjust for measurement errors in our analyses, improving data quality and confidence in the resulting measures of association.38
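As an illustration of how predictive values and their CIs are computed from a registry-versus-record comparison, the sketch below uses the Wilson score interval. The text does not state which interval method was used, so the choice of Wilson is an assumption, and the counts in the example are hypothetical rather than taken from the validation substudy.

```python
import math
from typing import Tuple


def ppv(true_pos: int, false_pos: int) -> float:
    """Share of registry-positive records confirmed by medical record review."""
    return true_pos / (true_pos + false_pos)


def npv(true_neg: int, false_neg: int) -> float:
    """Share of registry-negative records confirmed by medical record review."""
    return true_neg / (true_neg + false_neg)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 45 of 50 registry-positive records confirmed.
estimate = ppv(true_pos=45, false_pos=5)
lower, upper = wilson_interval(successes=45, n=50)
print(f"PPV {estimate:.0%} (95% CI {lower:.0%} to {upper:.0%})")
```

Estimates of this kind can then serve as bias parameters when adjusting measures of association for misclassification.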

Patient and public involvement

Patients and public were not involved in the development of this study.

Findings to date

In our preceding ProBe CaRe nested case–control study, in which 94% of patients with breast cancer were postmenopausal, we compared rates of breast cancer recurrence for women with a polymorphism that impairs the function of CYP2D6 (an enzyme involved in TAM metabolism) to those in women without this polymorphism and found a null association (adjusted OR 0.99; 95% CI 0.76 to 1.3).39 We further evaluated functional variants in the phase II UDP-glucuronosyl transferases, which contribute to deactivation of TAM, and again found near-null associations.40 We have assessed drug–drug interactions with concomitant use of selective serotonin reuptake inhibitor antidepressants using the Danish National Prescription Registry, and reported an adjusted OR of 1.1 (95% CI 0.7 to 1.7).41

The current ProBe CaRe longitudinal cohort will build on our previous research to address gene–gene and gene–drug interactions that may compromise TAM effectiveness by focusing on premenopausal women and by more comprehensively evaluating variants in the metabolic path. We will use a Bayesian pathway analysis, the Algorithm for Learning Pathway Structure (ALPS),42 to evaluate the complete TAM metabolic pathway and to allow for identification of gene–gene interactions, while also estimating the net effect of the entire pathway.43 This analytical approach will allow for incorporation of time-varying information on TAM adherence, use of inhibiting comedications, comorbidity and transition to postmenopausal status, while modelling complex gene–gene interactions without issues of sparse data or a reduction in power.42 ALPS also permits incorporation of prior biological knowledge regarding the metabolic path of TAM, so that the search space for the algorithm is constrained to pathways consistent with currently understood biology.

The DBCG has a long history of contributions to the scientific community, informing clinical and treatment guidelines for breast cancer.26 30 44 It is thus an indispensable resource for addressing our study aims.

Strengths and limitations

The current ProBe CaRe study is a large prospective cohort nested within a nearly complete source population. The cohort has many strengths, including the completeness of high-quality data and a large study population that is representative of the Danish source population. Our study design allows for thorough assessment of competing explanations for our findings, both through inclusion of a cohort of ERα−/T− participants and through an internal validation study to address possible errors in classification of key variables. It is the first cohort with ample sample size to examine reduced activity of TAM metabolism in premenopausal women. Moreover, all data (except for new laboratory data) were collected from standardised reports submitted to population-based prospective registries. In addition to DBCG data, we can link patient records to drug prescription, morbidity and mortality data from independently maintained registries to ensure that relevant covariates are considered.

One potential limitation of the ProBe CaRe cohort is the homogeneity of the study sample, as almost all patients are of European descent. However, there is no comparable source with the same level of information quality to allow exploration of our aims in a more diverse study population. Lack of diversity is a potential limitation, but previous studies indicate that our findings may be extrapolated to external populations and can inform the future direction of research in more diverse populations.45–48


ProBe CaRe study data are held and managed by the Department of Clinical Epidemiology in Aarhus, Denmark. We welcome collaborations to enhance the utility of the data and biobank and will respond to all inquiries (




  • Contributors LJC prepared the original draft of the manuscript. AK conducted data analyses and put together the tables. TLL, DPC-F, HTS, SH-D and TPA were responsible for study development and planning. DPC-F and HTS were responsible for application for data access in Denmark. SH-D led the collection and preparation of the tumour samples for genotyping and immunohistochemistry assays. PD provided methodological input surrounding the pharmacogenetic aspects of the study. PMC, BE and RAS provided methodological insight into study design, operationalisation of the study aims and clinical insights. All authors provided critical review of the manuscript and approved the final version.

  • Funding The ProBe CaRe cohort study was established with funding from the National Cancer Institute at the US National Institutes of Health (R01 CA166825; PI: Lash).

  • Competing interests None declared.

  • Patient consent Not required.

  • Ethics approval All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The research is approved by the regional ethical board in Denmark and by the institutional review boards in the USA. The study does not contain any animal experiments performed by any of the authors.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Once the initial data analyses are complete, we will be open to collaborations with outside investigators as permitted by the IRBs of participating sites. In particular, we will encourage collaborations with researchers whose expertise is under-represented on our research team. To become a collaborator, a researcher will be required to submit an application, which will undergo both a scientific and an IRB review. In view of the complexity of the database and the requirements of Danish law, interested investigators will be asked to form a collaborative arrangement with the ProBe CaRe investigators rather than receiving data directly.
