Purpose The currently ongoing Epidemiological Strategy and Medical Economics (ESME) research programme aims at centralising real-life data on oncology care for epidemiological research purposes. We draw on results from the metastatic breast cancer (MBC) cohort to illustrate the methodology used for data collection in the ESME research programme.
Participants All consecutive patients aged ≥18 years whose MBC treatment was initiated between 2008 and 2014 in one of the 18 French Comprehensive Cancer Centres were selected. Diagnostic, therapeutic and follow-up data (demographics, primary tumour, metastatic disease, treatment patterns and vital status) were collected through the course of the disease. Data collection is updated annually.
Findings to date With a recruitment target of 30 000 patients with MBC by 2019, we have so far screened a total of 45 329 patients, and >16 700 patients with a metastatic disease treatment initiated after 2008 have been selected. Overall, 20.7% of patients had a hormone receptor (HR)-negative MBC, 73.7% had a HER2-negative MBC and 13.9% were classified as triple-negative BC (ie, HER2 and HR status both negative). Median follow-up duration from MBC diagnosis was 48.55 months for the whole cohort.
Future plans These real-world data will help standardise the management of MBC and improve patient care. A dozen ancillary research projects have been conducted, and some of them are already accepted for publication or ready to be issued. The ESME research programme is expanding to ovarian cancer and advanced/metastatic lung cancer. Our ultimate goal is to continuously link the cohort data to the French national Health Data System in order to centralise data on healthcare reimbursement (drugs, medical procedures), inpatient/outpatient stays and visits in primary/secondary care settings.
Trial registration number NCT03275311; Pre-results.
- real-world data
- data platform
- patient medical record
- quality control
- metastatic breast cancer
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
The Epidemiological Strategy and Medical Economics research programme aims at centralising real-life data on oncology care for epidemiological research purposes. The ongoing screening of the metastatic breast cancer (MBC) cohort has reached >16 700 patients with metastatic disease treatment initiated after 2008, contributing to the development of one of the most important cohorts of patients with treated MBC.
The patient screening process and the collection of diagnostic, therapeutic and follow-up data through the course of the disease provide a solid knowledge base for real-world survival. The significant resources deployed made it possible to achieve high-quality data validation, including systematic consultation of source records for data collection, effective quality control and regular audits.
The main limitations are (1) the lack of availability of the electronic medical record data required to describe overall MBC management, due to the low level of standardisation of current electronic medical records, and (2) the retrospective patient selection and data collection which, notwithstanding the prospective compilation of real-life follow-up, clinical and biological events, prevents the assessment of several endpoints classically defined for randomised clinical trials, such as disease progression at predefined time points.
Despite the current large-scale recruitment of patients, and although more than one-third of French patients with MBC are managed in French Comprehensive Cancer Centres, future studies should integrate the diversity of management options adopted across all types of health institutions.
Real-world evidence (RWE) studies and observational studies using real-world data (RWD) play a growing role in conducting comparative effectiveness research on pharmaceutical products and other healthcare interventions. They aim to bridge the gap between the highly controlled environment of randomised clinical trials (RCTs) and real-life clinical practice.1 In particular, health authorities are interested in gathering RWD for long-term benefit–risk assessment and, increasingly, health economic evaluations for reimbursement decisions.2 With their high internal validity, RCTs are considered the gold standard of evidence for establishing treatment efficacy, although the generalisability of findings to clinical practice may be limited.3 In fact, cancer survival endpoints in the real-life setting may differ from those measured in traditional RCTs.4 Furthermore, other limitations such as short follow-up and small sample size have created uncertainty in estimating survival criteria in RCTs, and thus surrogate endpoints such as progression-free survival (PFS) or time to progression are generally used.5 On the other hand, large population-based cohorts with longer follow-up periods can be particularly appropriate for assessing long-term clinical outcomes, such as overall survival (OS), outside of the RCT setting6 and for detecting changes in medical practice.
RWE and observational studies are non-experimental research where cancer management (treatments and disease evolution assessment) is left to the choice of care providers and patients.7 Over the past few years, a broad consensus arose around the requirement for high-quality data from large cohorts to strengthen and drive improvements in research methods and practices.8 The goal of these studies is to generate complementary information to RCTs based on larger samples and provide answers regarding particular populations in real-life clinical practice. These studies are more prone to biases, for example baseline differences between patients (selection bias) or confounding by indication. To minimise such sources of bias, statistical approaches including adjusted analyses, propensity score methods or instrumental variables may also be employed.9–12
Metastatic breast cancer (MBC) is one of the leading causes of cancer-related mortality among women in Western countries.13 A relatively high proportion (approximately 30%) of patients with breast cancer develop metastatic disease,14 and while significant treatment advances have been made, the overall prognosis is poor with a 5-year survival rate of 25%.15 In 2014, the national academic network of cancer centres in France (French Comprehensive Cancer Centres (FCCCs)), which together handle over one-third of all breast cancer cases nationally, decided to launch a programme dedicated to RWD in oncology, beginning with MBC: the Epidemiological Strategy and Medical Economics (ESME) research programme.
The ESME research programme aims to build a comprehensive database on oncology care for epidemiological research purposes to improve knowledge on medical practice in the real-life setting, on public health and healthcare use, and to provide information to health authorities and other associated bodies.
Several studies based on real-life data collection have been developed in this programme. Different cohorts of patients with ovarian cancer and patients with advanced/metastatic lung cancer are currently recruiting.
Some results from the ongoing ESME MBC project (ClinicalTrials.gov NCT03275311), whose aim is to describe the medical care of patients treated for MBC according to their disease characteristics, the evolution of their metastatic disease and their outcomes, have recently been published. Recent analyses explored OS in different subgroups of patients with MBC, and first-line therapy in patients with HER2-negative MBC and in patients with hormone receptor-positive HER2-negative MBC.16–18
In this paper, we describe the methodological principles that underpin the ESME research programme, and illustrate this innovative approach through this first ESME MBC cohort. Design, brief description of the selected population and current status are reported.
The ESME MBC cohort is a population-based registry in 18 FCCCs (http://www.unicancer.fr/en/rd-unicancer/esme), which collected data on all consecutive patients treated for MBC from 2008. Annual data collection phases are planned to add newly diagnosed cases and update patients’ follow-up data.
Patients eligible for the ESME MBC cohort were male or female patients aged ≥18 years with MBC whose first metastasis had been either completely or partially treated between 1 January 2008 and 31 December 2014 in one of the FCCCs. MBC treatment could include radiotherapy, chemotherapy, targeted therapy, immunotherapy or endocrine therapy.
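The eligibility rule above can be sketched as a simple predicate. This is a minimal illustration and not the programme's actual selection code: the `ScreenedCase` fields are hypothetical names, and evaluating age at the start of MBC treatment is an assumption, since the text does not state the reference date for the age criterion.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Selection window and minimum age as stated in the eligibility criteria.
WINDOW_START = date(2008, 1, 1)
WINDOW_END = date(2014, 12, 31)
MIN_AGE_YEARS = 18


@dataclass
class ScreenedCase:
    """Hypothetical record layout; field names are illustrative only."""
    age_at_first_mbc_treatment: int            # assumption: age taken at treatment start
    first_mbc_treatment_date: Optional[date]   # None if the metastasis was never treated
    treated_in_fccc: bool                      # first metastasis treated in an FCCC


def is_eligible(case: ScreenedCase) -> bool:
    """Return True when the case meets the stated ESME MBC criteria."""
    if case.first_mbc_treatment_date is None:
        return False
    in_window = WINDOW_START <= case.first_mbc_treatment_date <= WINDOW_END
    return (
        case.age_at_first_mbc_treatment >= MIN_AGE_YEARS
        and in_window
        and case.treated_in_fccc
    )
```

Both window bounds are treated as inclusive here, which matches the "between 1 January 2008 and 31 December 2014" wording.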
Patient screening process
The patient screening process involved two steps: an automated case screening followed by validation of the selection for each screened case. The automated case screening was based on information retrieved from multiple data sources available within each FCCC: administrative records from the French National Computerised Medical Information System (via MBC-specific International Classification of Diseases (ICD) codes associated with inpatient stays), pharmacy records, patient medical records (PMRs), including multidisciplinary team meeting records and searches using relevant keywords, and MBC-specific registries.
The objective of the first step (automated screening) was to identify all cases with inpatient stays or therapeutic management for MBC in one of the FCCCs during the selection period and to generate the patient screening list. The ICD codes used were C50 (Malignant neoplasm of breast), C77.- (except C77.3) (Secondary and unspecified malignant neoplasm of lymph nodes), C78.- (Secondary malignant neoplasm of respiratory and digestive organs) and C79.- (Secondary malignant neoplasm of other and unspecified sites). Once the patient screening list was finalised in each centre, data were anonymised and each patient was assigned an ESME number. Because the first screening step did not allow precise identification of the date of the first MBC-specific treatment, the second step was performed to cross-check eligibility criteria for all screened cases and to specify the dates of initiation of the MBC first-line treatment, using data from the PMRs.
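The ICD-based part of the automated screening can be illustrated with a short sketch. The code families are those listed above; how the codes were combined within a stay is not specified in the text, so the rule in `flag_stay` (a C50 code together with at least one secondary-site code) is an assumption, and all function names are hypothetical.

```python
from typing import Iterable


def _normalise(code: str) -> str:
    """Upper-case the code and drop the dot, e.g. 'c78.0' -> 'C780'."""
    return code.strip().upper().replace(".", "")


def is_breast_code(code: str) -> bool:
    """C50: malignant neoplasm of breast."""
    return _normalise(code).startswith("C50")


def is_secondary_site_code(code: str) -> bool:
    """C77.- (except C77.3), C78.- and C79.-, as listed in the text."""
    normalised = _normalise(code)
    if normalised.startswith("C773"):
        return False  # C77.3 is explicitly excluded
    return normalised[:3] in {"C77", "C78", "C79"}


def flag_stay(codes: Iterable[str]) -> bool:
    """Assumed rule: flag a stay coded with C50 plus a secondary-site code."""
    codes = list(codes)
    has_breast = any(is_breast_code(c) for c in codes)
    has_secondary = any(is_secondary_site_code(c) for c in codes)
    return has_breast and has_secondary
```

A flag from such a filter only places a patient on the screening list; eligibility is still confirmed against the PMR in the second step.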
Patient data protection
The ESME research programme was managed by R&D Unicancer in accordance with guidelines for Good Pharmacoepidemiology Practices and Good Epidemiology Practices.19 20
Ancillary project analyses were notified to an independent ethics committee (Lyon Sud-Est II) on 17 December 2015.
The data of the selected patients were planned to be collected by trained research assistants and annually updated. The data collection was performed in two phases from October 2014 to October 2016. A first data collection phase conducted in 2014–2015 collected the data from patients with MBC treatment initiated between 1 January 2008 and 31 December 2013 in one of the FCCCs. A second phase of data collection performed in 2016 added to the ongoing database the data from patients with MBC treatment initiated between 1 January and 31 December 2014, and follow-up data for the global cohort were consequently updated. Hence, the ongoing database provides an overview of all the data from patients with an MBC treatment initiated from 2008; information is updated with last contact information available in the PMR at the date of the data collection.
Baseline and follow-up data
The ESME MBC cohort is composed of three types of data set (figure 1):
Patient-related data are obtained from systematic review of patient medical records (non-structured data) and provide information on patient demographics, cancer family history, characteristics of primary tumour, relapses, metastatic recurrences, pathological reports (tumour size, grade, histological type), hormone receptor status and HER2 status, therapeutic care (focusing on cancer-related treatment) and settings, reasons for treatment termination and clinical events.
Hospitalisation records are data integrated from a structured and automated database related to inpatient stays, primarily used to bill the French National Health Insurance Fund (Assurance Maladie). They provide patient admission and discharge information (date and destination code at discharge), ICD codes associated with each stay, and diagnostic, medical and therapeutic procedures including radiotherapy and surgery.
Pharmacy-dispensed treatment records include all data related to anticancer treatments obtained from each centre’s pharmacy database: drugs (International Non-proprietary Name), administration date, protocol name, patient’s height and body mass index, cycle ID, common unit of delivery (UCD) code (pharmaceutical form related to the drug dosage), line of treatment and administration in a clinical trial (yes/no). They exclusively include information on products that are prescribed by each FCCC.
The detailed raw data collected above and derived data are listed in table 1.
Any data integrated in the ESME research programme are subject to quality control (QC) procedures.
Patient-related data were registered via an electronic case report form (eCRF) in each centre by trained clinical research assistants (CRAs) between December 2014 and October 2016. Medical support was provided to assist CRAs and ensure data quality, and appropriate recoding was performed before the annual database lock when required. All ESME procedures are handled according to the Guidelines for Good Pharmacoepidemiology Practices.20 Importantly, all data are exclusively obtained retrospectively; no attempts are made to recover data unavailable from PMRs by contacting healthcare providers or patients.
The clinical data management system used was SAS software V.9.4. Both the ESME MBC database and the eCRF administration tool used an Oracle solution, and a certified personal data hosting system guarantees data security.
On-site quality review
On-site quality review of the patient screening process was carried out. This consisted of checking eligibility criteria on samples of selected and non-selected cases in each FCCC. For selected cases, key variables were cross-checked against the source data. For all QC procedures, accepted error limits were 10% for non-selected patients and 5% for selected patients. A central audit was subsequently performed by the Unicancer Quality Assurance Department, and an audit of data registration and of the generated screening list was also conducted at the local level.
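The accepted error limits lend themselves to a small acceptance check. The sketch below is illustrative only: the function and group names are hypothetical, and treating the limits as inclusive (an error rate exactly at the limit passes) is an assumption not stated in the text.

```python
# Accepted error limits from the text, in percent of checked cases.
ERROR_LIMIT_PERCENT = {"selected": 5, "non_selected": 10}


def sample_passes_qc(group: str, n_checked: int, n_errors: int) -> bool:
    """Return True when the observed error rate is within the accepted limit.

    Integer arithmetic avoids floating-point edge cases at the boundary.
    Assumption: the limit itself counts as acceptable.
    """
    if n_checked <= 0:
        raise ValueError("n_checked must be positive")
    if not 0 <= n_errors <= n_checked:
        raise ValueError("n_errors must be between 0 and n_checked")
    limit = ERROR_LIMIT_PERCENT[group]
    return n_errors * 100 <= limit * n_checked
```

For instance, 4 discrepancies in a sample of 50 selected cases (8%) would fail, whereas the same count in 50 non-selected cases would pass.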
Data quality assurance
Three boards monitor the ESME Research programme: Scientific Committee, Deontology Committee and International Advisory Board. The main role of the Scientific Committee is to (1) ensure that the applicable scientific rules are followed, (2) evaluate any ancillary projects in compliance with defined criteria and scientific pertinence, and (3) monitor all the validated ancillary projects. The Deontology Committee monitors any potential conflicts of interest related to experts involved in the programme, gives recommendations that may improve the prevention of conflicts, provides opinions on individual or particular situations, and potential collaborations with private partners. The ESME International Advisory Board has a consultative role with regard to coherence of the scientific programme and reviews key international communications, formulates recommendations for publication rules or methodology and reinforces international academic cooperation.
Data analysis principles
Academic research teams or private organisations could propose ancillary projects.
For each accepted ancillary project, statistical analyses are conducted according to a detailed statistical analysis plan that must be reviewed by the scientific committee. This article does not aim at providing a comprehensive and exhaustive review on appropriate statistical methods to reduce bias related to analysis based on RWD.
A first ancillary project reported the outcomes (OS and PFS) following first-line paclitaxel treatment with or without bevacizumab.16 Other ongoing sensitivity analyses will better address the biases potentially found in real-life settings. Two analyses reported, respectively, OS over time in different subgroups of patients with MBC and results for first-line therapy (endocrine therapy or chemotherapy) in the hormone receptor-positive HER2-negative subgroup.17 18 Other series accepted for communication (abstracts/posters) at major congresses in 2017 reported epidemiological analyses (eg, impact of age at diagnosis), therapeutic management (eg, impact of loco-regional treatment on OS) and specific analyses of drug use in routine practice (eg, use of vinorelbine, everolimus or etoposide).
Other subpopulations are currently being considered, such as patients with metastatic triple-negative breast cancer or HER2-positive MBC.
Patient and public involvement statement
Patients and/or public were not involved.
Results and current status
In total, 16 711 out of the 45 329 patients screened were selected in the ESME MBC cohort (see figure 2).
The sensitivity and specificity of the three main screening sources used were explored. Sensitivity was highest for administrative records (78% vs 43% for pharmacy records, and 28% for BC-specific local registries). On the other hand, specificity was highest for BC-specific local registries (87% vs 67% for pharmacy records, and 49% for administrative records).
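For readers less familiar with these measures, the standard definitions can be written out as follows. This sketch assumes that the validated selection status from the second screening step serves as the reference standard, which the text does not state explicitly; the counts in the usage note are invented for illustration and are not ESME data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly eligible patients that a source flagged."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of non-eligible patients that a source did not flag."""
    return true_negatives / (true_negatives + false_positives)
```

With hypothetical counts, a source that flags 780 of 1000 eligible patients has `sensitivity(780, 220)` equal to 0.78, and one that leaves 490 of 1000 non-eligible patients unflagged has `specificity(490, 510)` equal to 0.49. A source can therefore combine high sensitivity with low specificity, as reported for administrative records.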
The main reasons for non-selection of screened patients were presence of non-MBC or other metastatic cancers (n=14 104 patients), initial MBC treatment received before 1 January 2008 (n=7486), and first metastasis not initially treated in an FCCC (n=4239). Nine additional patients were excluded prior to the final database lock due to inconsistencies in the dates. A total of 16 702 patients were analysed (figure 2).
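The patient flow reported above can be checked with simple arithmetic. The sketch assumes that the three main reasons for non-selection are mutually exclusive, which the text does not state; under that assumption, the remainder corresponds to reasons not itemised here. Variable names are illustrative.

```python
# Counts reported in the text.
screened = 45_329
selected = 16_711
excluded_before_lock = 9

main_non_selection_reasons = {
    "non-MBC or other metastatic cancer": 14_104,
    "initial MBC treatment before 1 January 2008": 7_486,
    "first metastasis not initially treated in an FCCC": 4_239,
}

not_selected = screened - selected
itemised = sum(main_non_selection_reasons.values())

# Assumption: the main reasons do not overlap, so the rest are other reasons.
other_reasons = not_selected - itemised

analysed = selected - excluded_before_lock

print(f"Not selected: {not_selected}")
print(f"Itemised main reasons: {itemised}")
print(f"Other reasons (derived): {other_reasons}")
print(f"Analysed: {analysed}")
```

The derived analysed count matches the 16 702 patients reported in the text.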
Table 2 summarises the main demographic and disease characteristics at the time of initial metastatic diagnosis. Patients were nearly all women (99.1%) with a median age of 61 years. Over half (56.2%) had at least visceral metastases present, with 30.2% having at least bone and non-visceral metastases, and 13.6% with skin only, or node only, or at least skin plus node. 20.7% of patients had an HR-negative MBC, 73.7% had a HER2-negative MBC and 13.9% were classified as triple-negative BC (ie, HER2 and HR status both negative).
Median follow-up duration from initiation of MBC treatment was 48.55 months for the whole cohort (95% CI 47.7 to 49.38).
Retrospective analysis using RWD is likely to become increasingly important to ensure that medications are accepted by policymakers and adopted by patient practitioners. The ESME Research programme is a large-scale initiative to provide access to RWD in oncology. This ongoing ESME MBC cohort currently centralises data from 16 711 patients.
The ESME research programme provides a unique opportunity to study a diverse range of topics related to MBC care and management in real-life settings. Indeed, there are many potential applications, including study of the factors influencing patient care (eg, cancer and patient characteristics), description of therapeutic strategies (eg, treatment lines and sequences of therapies) and measurement of clinical events (disease progression, death, persistence of treatment effect). Therefore, this approach allows a better characterisation of patients enrolled in clinical trials and contributes to simulation of trials using appropriate statistical methodologies. Potentially, these data could be used for health economics evaluation of management strategies for patients (eg, rehospitalisation and related ambulatory care), as well as reconstruction of healthcare trajectories through data modelling.
The ESME research programme includes alternative approaches to generate cohorts that use different types of RWD (clinical data, therapeutic treatment data, long-term outcomes, health economics data) in the FCCCs versus existing registries in France, Europe and the USA (eg, SEER). It involves rigorous procedures for patient screening and data collection, ensuring both validity and reliability of data. It uses a fully retrospective approach, with no influence on treatment practice or interaction with oncologists. Unlike prospective interventional or observational research studies, data are not influenced by study design and reflect the real-life management of patients treated. While data recorded for the cohort are defined by experts in the field, the vast majority of data are collected by trained clinical research technicians, thereby minimising any potential risk of data misinterpretation. As discussed above, the ESME MBC cohort offers a unique opportunity to study a wide range of research questions in a large sample. With respect to evaluation of treatment strategies, the database enables reliable estimation of survival criteria such as OS and surrogate endpoints (eg, PFS). OS improvement in diseases with a long median post-progression survival time, such as MBC, is a critical endpoint.21–23
The ESME MBC cohort also has several limitations. For example, the database relies on the collection and restructuring of existing data only, that is, there is no creation of new data. Furthermore, apart from events reported in the PMR impacting therapeutic management, adverse effects are not routinely captured. Conceivably, further in-depth analysis of the data could highlight trends such as treatment interruption or discontinuation due to toxicity, which is important from a risk management perspective. The main potential sources of bias include selection bias, and information bias due to differences in patient monitoring and non-standardised data collection. Selection bias has been taken into account by using rigorous selection procedures across all 18 FCCCs, and the data management plan and QC programme described above have been designed to limit information bias. Nevertheless, because data collection is retrospective and based on real-life follow-up, clinical and biological events are not evaluated at predefined time points (unlike in RCTs). For example, objective response, a historical endpoint in RCTs, cannot be assessed retrospectively without a central review of existing imaging, as it is not systematically documented in routine practice. The information collected therefore depends on the frequency of follow-up visits and of the clinical and radiological exams prescribed by the patient’s doctor. As clinical signs are the only means by which disease metastasis can be identified, the number of disease progressions may be underestimated. With respect to the clinical event of death, all deaths are reported in the PMRs.
With respect to evaluation of treatment strategies, analysis of real-life data poses unique challenges, such as accounting for confounding factors between patient groups, although various statistical approaches can be used to address this, as discussed above.24
Concerning overall generalisability and applicability (external validity), it should be noted that the cohort centralises data from patients treated in specialised cancer centres only. FCCCs may use different clinical practices compared with public hospitals and private institutions, and thus patients from FCCCs may not be truly representative of all French patients with breast cancer. Potentially, data extrapolation from all French healthcare organisations could be developed with the Exhaustive National Health Reimbursement System (Système national d’information inter-régimes de l’Assurance maladie).
The ESME MBC cohort aims to collect data for up to 30 000 patients by 2019. As mentioned, future aims might include continuously linking our database to those of other institutions, such as the French national Health Data System (SNDS) database for exhaustive data on healthcare reimbursement, and the database of the French National Institute of Statistics and Economic Studies (INSEE) to provide vital status updates for patients lost to follow-up. The ESME research programme has further expanded to ovarian cancer and advanced/metastatic lung cancer. RWD from the ESME cohorts should help to provide medical recommendations and ultimately improve patient care.25
The authors thank the 18 French Comprehensive Cancer Centres for providing the data and each ESME contact for coordinating the project at the local level. They also thank the ESME Scientific Committee members for their ongoing support, and acknowledge Dr Sarah Hopwood (Scinopsis, France) for editorial assistance with this publication, paid by R&D UNICANCER. The 18 French Comprehensive Cancer Centres (FCCCs) are: Ensemble hospitalier de l’Institut Curie, Paris/Saint-Cloud; Gustave Roussy, Villejuif; Institut de Cancérologie de l’Ouest, Nantes/Angers; Centre François Baclesse, Caen; Institut du Cancer de Montpellier, Montpellier; Centre Léon Bérard, Lyon; Centre Georges-François Leclerc, Dijon; Centre Henri Becquerel, Rouen; Institut Claudius Régaud, Toulouse; Centre Antoine Lacassagne, Nice; Institut de Cancérologie de Lorraine, Nancy; Centre Eugène Marquis, Rennes; Institut Paoli-Calmettes, Marseille; Centre Jean Perrin, Clermont-Ferrand; Institut Bergonié, Bordeaux; Centre Paul Strauss, Strasbourg; Institut Jean Godinot, Reims; Centre Oscar Lambret, Lille. ESME central coordinating staff: Head of Research and Development: Christian Cailliot. Program director: Mathieu Robain. Data Managers: Irwin Piot and Olivier Payen. Project Managers: Coralie Courtinard, Tahar Guesmia and Gaëtane Simon. Project assistant: Esméralda Pereira. Software designer: Alexandre Vanni. ESME local coordinating staff: Patrick Arveux, Thomas Bachelot, Jean-Pierre Bleuse, Delphine Berchery, Mathias Breton, Stéphanie Clisant, Emmanuel Chamorey, Valérie Dejean, Véronique Diéras, Anne-Valérie Guizard, Anne Jaffré, Lilian Laborde, Agnès Loeb, Muriel Mons, Damien Parent, Geneviève Perrocheau, Marie-Ange Mouret-Reynier, Carine Laurent, Michel Velten.
Contributors DP and MR: conception and design, writing the first draft. PA, DB, A-VG and M-AM-R: conception and design, data acquisition. SM-P: conception and design, interpretation. EC and LC: data acquisition and statistical analysis. BA, SG and MC: statistical analysis. MB, SC, MM, VD, LL, CL, AL, DP and GP: data acquisition. MV: conception and design, data acquisition and statistical analysis. CC: conception and design. ME: statistical analysis and writing the first draft. GS: conception and design, statistical analysis and writing the first draft. All authors reviewed and approved the final version of the manuscript.
Funding This work was supported by R&D UNICANCER and initially funded by an industrial consortium (Roche, Pierre Fabre, Pfizer and AstraZeneca), and support was subsequently provided by MSD, Eisai and Daiichi Sankyo.
Disclaimer The funders did not have any input into maintenance of the database (project design, data collection, management) or analysis of data. They were not involved in the writing of the manuscript or decision to submit for publication.
Competing interests DP has received personal fees (honoraria and travel/accommodation expenses) from Laboratoire Roche, outside the submitted work. BA has received personal fees for board membership and for consultancy from Roche Pharma, outside the submitted work. SG has received personal fees for board membership from Celgene and for consultancy from Roche, outside the submitted work.
Patient consent Not required.
Ethics approval The ESME MBC database received approval from the French data protection authority (Commission Nationale de l’Informatique et des Libertés, authorisation no. 1704113).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement We reported the methodology developed to collect and control the data of the large ESME programme and illustrated the methodology with data from the cohort of patients with metastatic breast cancer. Data collected are listed in table 1. Neither the database of the ESME programme nor the database of the MBC cohort is currently accessible. For any specific request, please contact the corresponding author. Each request will be examined on a case-by-case basis by the scientific committee.