Original research
Use of external control arms in immune-mediated inflammatory diseases: a systematic review
  1. Alexa Zayadi1,
  2. Robert Edge1,
  3. Claire E Parker1,
  4. John K Macdonald1,
  5. Blue Neustifter1,
  6. Joshua Chang1,
  7. Guowei Zhong1,
  8. Siddharth Singh2,3,
  9. Brian G Feagan1,4,5,
  10. Christopher Ma1,6,7,
  11. Vipul Jairath1,4,5
  1. 1Alimentiv Inc, London, Ontario, Canada
  2. 2Division of Gastroenterology, University of California, San Diego, La Jolla, California, USA
  3. 3Division of Biomedical Informatics, University of California, San Diego, La Jolla, California, USA
  4. 4Department of Medicine, Division of Gastroenterology, Western University, London, Ontario, Canada
  5. 5Department of Epidemiology and Biostatistics, Western University, London, Ontario, Canada
  6. 6Division of Gastroenterology and Hepatology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  7. 7Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
Correspondence to Dr Vipul Jairath; vjairath{at}uwo.ca

Abstract

Objectives External control arms (ECAs) provide useful comparisons in clinical trials when randomised control arms are limited or not feasible. We conducted a systematic review to summarise applications of ECAs in trials of immune-mediated inflammatory diseases (IMIDs).

Design Systematic review with an appraisal of ECA source quality rated across five domains (data collection, study populations, outcome definitions, reliability and comprehensiveness of the dataset, and other potential limitations) as high, low or unclear quality.

Data sources Embase, Medline and the Cochrane Central Register of Controlled Trials were searched through to 12 September 2023.

Eligibility criteria Eligible studies were single-arm or randomised controlled trials (RCTs) of inflammatory bowel disease, pouchitis, rheumatoid arthritis, juvenile idiopathic arthritis, ankylosing spondylitis, psoriatic arthritis, psoriasis and atopic dermatitis in which an ECA was used as the comparator.

Data extraction and synthesis Two authors independently screened the search results in duplicate. The characteristics of included studies, external data source(s), outcomes and statistical methods were recorded, and the quality of the ECA data source was assessed by two independent authors.

Results Forty-three studies met the inclusion criteria (inflammatory bowel disease: 16, pouchitis: 1, rheumatoid arthritis: 12, juvenile idiopathic arthritis: 1, ankylosing spondylitis: 5, psoriasis: 3, multiple indications: 4). The majority of these trials were single-arm (33/43) and enrolled adult patients (34/43). All included studies used a historical control rather than a contemporaneous ECA. In RCTs, ECAs were most often derived from the placebo arm of another RCT (6/10). In single-arm trials, historical case series were the most common ECA source (19/33). Most studies (31/43) did not employ a statistical approach to generate the ECA from historical data.

Conclusions Standardised ECA methodology and reporting conventions are lacking for IMIDs trials. The establishment of ECA reporting guidelines may enhance the rigour and transparency of future research.

  • clinical trials
  • general medicine (see internal medicine)
  • rheumatology
  • psoriasis
  • inflammatory bowel disease

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

  • This systematic review used a comprehensive predefined approach to identify published, interventional immune-mediated inflammatory disease studies in which external data have been used as the comparator.

  • Although we devised a sensitive search strategy comprising keywords and controlled vocabulary, the language used to describe external control arm (ECA) methodology lacks standardisation and ECA-specific MeSH terms do not exist.

  • Our assessment of ECA data source quality was conducted with an instrument that has not been validated to appraise ECA methodology, as no such instrument has been developed for use in the context of systematic reviews.

Background

Randomised controlled trials (RCTs) are the gold standard for evaluating the efficacy and safety of medical interventions.1 The strengths of randomisation include minimisation of differences in known and unknown confounders among experimental groups, reduced patient selection bias and facilitation of robust statistical analyses. The randomised controlled design also permits blinding, which mitigates potential bias introduced by study personnel, participants and outcome assessors. Despite these advantages, RCTs can present practical, logistical and ethical challenges.1 For example, it can be difficult to enrol a sufficient number of participants when the disease is rare, when there are multiple competing drug development programmes, or when patients avoid trials out of concern about being allocated to placebo. Furthermore, placebo-controlled RCTs may be unethical if there is a lack of clinical equipoise, or if a well-established standard of care therapy is available.

As noted by the US Food and Drug Administration (FDA)2 and the European Medicines Agency (EMA),3 there are instances where it may be acceptable to use external data to form a control arm, instead of a concurrently randomised internal control group. External control arms (ECAs) comprise participants who are not part of the same study as the group receiving the investigational agent.4 Both regulatory agencies have indicated that a well-characterised disease course, large anticipated treatment effect, sufficiently similar treatment and external populations, and the use of objective outcome measures are important factors in externally controlled designs.2 3 Patient data used to form the ECA can be collected during the same or similar time period (contemporaneous control) or derived from a previously treated patient population where retrospective or retrospectively analysed data are used as a comparator (historical control).2–7 These data are typically sourced from prior clinical trials, disease registries, cohort studies, electronic health records and claims databases. External data can also be used to augment the sample size of an internal control arm. The use of an ECA can reduce sample size requirements, avoid the ethical pitfalls of using placebo and provide an opportunity to use historical control arm data, which may be both valid and large.2 4–6 Yet, potential bias introduced by differences in study populations, trial design and outcome assessment, along with the inability to control for unknown confounders and the lack of blinding, necessitates careful consideration of data source selection, analytical methods and missing data handling.

To date, ECAs have most frequently been used in clinical trials of oncology and rare diseases.8 Both the FDA and EMA have published guidance documents outlining recommendations for using real-world data and real-world evidence in regulatory decision-making,9–13 and key recommendations for the design and conduct of externally controlled clinical trials.2 Study designs involving comparisons with ECAs are increasingly being submitted to regulatory agencies, and explored in novel therapeutic areas.14 15 Between 2005 and 2017, 43 product applications for indications in haematology and rare diseases were submitted for regulatory approval, of which 98% and 79% were approved by the FDA and EMA, respectively.14 In an analysis of 45 FDA approvals of non-oncology products that relied on external controls, 20% of approved products were for the treatment of a common disease (ie, affecting more than 200 000 patients in the USA) and 91% were recognised as addressing an unmet medical need.15 Given the willingness of regulators to consider ECA data in clinical development programmes and practical issues inherent to RCT design, there is growing interest in exploring how to best adopt this methodology within novel contexts.

Conducting RCTs for immune-mediated inflammatory diseases (IMIDs), such as inflammatory bowel disease, can be challenging due to slow and competitive patient recruitment. Contributing factors include a large clinical development pipeline, the availability of several approved treatment options in routine clinical care, and increasingly stringent eligibility criteria, particularly with respect to the use of prior and concomitant therapies.16 In addition, the chance of exposure to placebo can be a deterrent for participation in clinical trials for both patients and investigators. The incorporation of ECA data into IMID trial design may provide an efficient alternative or supplement to a traditional randomised control arm. Currently, it is unclear how often and in what setting ECAs have been incorporated into IMID trials, and what data sources and statistical techniques have been used.

This systematic review aimed to identify and describe trials of IMIDs (ie, Crohn’s disease, ulcerative colitis, chronic refractory pouchitis, rheumatoid arthritis, juvenile idiopathic arthritis, ankylosing spondylitis, psoriatic arthritis, psoriasis and atopic dermatitis) that have used an ECA, and assess the methodological quality of these studies.

Methods

This systematic review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Statement17 and was conducted following an a priori developed protocol (available on request).

Search strategy

Embase, Medline and the Cochrane Central Register of Controlled Trials were searched without language restriction from database inception to 12 September 2023, using predefined strategies (online supplemental appendix 1). A semiautomated recursive search of the bibliographies of relevant publications, including review articles and meta-analyses, was also performed using the Systematic Review Accelerator SpiderCite tool (Bond University, Gold Coast, Qld, Australia).

Eligibility criteria

Eligible studies: (1) compared data from a randomised or single-arm clinical trial with a control arm from an external source (ie, previous clinical trials, disease registries, commercial insurance or national health insurance claims databases, or electronic health records); (2) enrolled adults or children with Crohn’s disease, ulcerative colitis, chronic refractory pouchitis, rheumatoid arthritis/juvenile idiopathic arthritis, ankylosing spondylitis, psoriatic arthritis, psoriasis, or atopic dermatitis; and (3) assessed a medical intervention. The ECA comprised patients from placebo, standard of care, or active comparator groups. Trials were eligible for inclusion regardless of whether the ECA was prospectively or retrospectively integrated into the study design. Protocols for upcoming trials were considered, in addition to studies published in full text or abstract form. Studies were excluded if they were non-interventional or if the active arm data were not collected as part of a clinical trial. Additionally, we excluded studies that did not report the ECA dataset source.

Screening and data extraction

Search results were independently screened in duplicate by two authors (AZ and JC) and data extraction was completed by two independent authors (AZ and RE) using a standardised spreadsheet. Data collected from each study included: the publication year, study design, trial phase, indication, study arms, number of participants, inclusion/exclusion criteria, interventions, source of ECA data, statistical methods, outcomes assessed, outcome type (dichotomous vs continuous) and rationale for using an ECA. Disagreements encountered during screening and data extraction were resolved by consensus or, if consensus could not be reached, through arbitration by a third author.

Quality assessment

The methodological quality of the ECA data source was assessed by two authors (AZ and RE) using a checklist adapted from Thorlund et al.18 The checklist comprised 5 key domains: (1) quality of the data source; (2) similarity of the study populations; (3) similarity of the outcome definitions; (4) the reliability and comprehensiveness of the dataset; and (5) other potential limitations. Key judgement criteria are reported in online supplemental table 1. Each domain was classified as being of high (all criteria are met), low (no criteria are met) or unclear quality (some criteria are met and/or insufficient information to ascertain whether all criteria are met).
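The three-level rating rule described above can be written as a short function. This is a minimal sketch only: the function name and the encoding of each criterion as True (met), False (not met) or None (insufficient information) are assumptions made for illustration, and the actual judgement criteria are those in online supplemental table 1.

```python
from typing import Optional, Sequence


def rate_domain(criteria_met: Sequence[Optional[bool]]) -> str:
    """Classify one checklist domain as 'high', 'low' or 'unclear'.

    Each entry is True (criterion met), False (criterion not met) or
    None (too little information reported to judge). This encoding is
    hypothetical and is not taken from the review itself.
    """
    if not criteria_met:
        raise ValueError("a domain needs at least one criterion")
    if all(c is True for c in criteria_met):
        return "high"      # all criteria are met
    if all(c is False for c in criteria_met):
        return "low"       # no criteria are met
    return "unclear"       # mixed results and/or missing information


# Hypothetical ratings for a single study across three criteria.
print(rate_domain([True, True, True]))    # high
print(rate_domain([False, False, False])) # low
print(rate_domain([True, None, False]))   # unclear
```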

Patient involvement

No patients were involved in the study concept and design, acquisition and interpretation of data, or drafting of the manuscript.

Results

Search results

A total of 2192 records were identified from database and recursive searching, from which 610 duplicates were removed. Based on the information provided in the titles and abstracts of the remaining 1582 records, 1442 were excluded as non-applicable. Full-text review was performed for 140 records, of which 97 were excluded with reasons (online supplemental table 2). Forty-three studies met the eligibility criteria and were included (figure 1).
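The record counts reported above follow from simple subtraction at each screening stage. The sketch below reproduces that arithmetic; the variable names are illustrative and the numbers are those stated in the text.

```python
# Counts as reported in the search results paragraph.
identified = 2192               # records from database and recursive searching
duplicates = 610                # duplicates removed
excluded_title_abstract = 1442  # excluded at title and abstract screening
excluded_full_text = 97         # excluded at full-text review

screened = identified - duplicates
full_text_reviewed = screened - excluded_title_abstract
included = full_text_reviewed - excluded_full_text

print(screened, full_text_reviewed, included)  # 1582 140 43
```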

Figure 1

Flow diagram of study inclusion. Adapted from Page et al.17

Randomised controlled trials

Characteristics of included studies

Ten of the 43 included studies (23%) compared the active arm(s) of an RCT to an ECA (table 1).19–28 Five of the included RCTs (5/10; 50%) prospectively leveraged external control data as part of the study design.19–23 Four of these studies (4/5; 80%) randomised participants to 1 of 2 active arms and used an ECA to completely replace the function of a placebo arm, while 1 (1/5; 20%) was a placebo-controlled RCT that incorporated external data to augment the sample size of the placebo group. Among the five RCTs that prospectively used external data were three phase 3 trials20 22 23 (60%) and one phase 2 trial19 (20%). The remaining five RCTs (5/10; 50%) retrospectively compared data from the active arm(s) of prior (4/5; 80%) or ongoing (1/5; 20%) phase 1–3 trials with an ECA.24–28

Table 1

Characteristics of included randomised controlled trials

Most of the included RCTs enrolled adults (7/10; 70%), with 3 of the 10 trials (30%) conducted in paediatric populations. Study populations comprised patients with inflammatory bowel disease (4/10; 40% (ulcerative colitis: 2/10; 20%, Crohn’s disease: 2/10; 20%)), rheumatoid arthritis (4/10; 40%), psoriasis (1/10; 10%), and ankylosing spondylitis (1/10; 10%). Among these studies, biologic therapies (6/10; 60%) were the most common intervention under investigation, including secukinumab (3/10; 30%), infliximab (1/10; 10%), adalimumab (1/10; 10%) and tesnatilimab (1/10; 10%). Two out of 10 studies (20%) evaluated small molecules (tofacitinib (1/10; 10%) and upadacitinib (1/10; 10%)). Treatment strategies were assessed in two RCTs (tight control strategy (1/10; 10%) and stringent treat-to-target strategy (1/10; 10%)).

The majority of studies (6/10; 60%) made comparisons with placebo data derived from 1 or more previously conducted RCTs. An external active control group was used by 2 out of 10 studies (20%), with data sourced from a disease registry (1/10; 10%) or electronic health records (1/10; 10%). Lastly, 2 studies (2/10; 20%) formed a standard of care control group using data from a historical case series (1/10; 10%) or a prior prospective cohort study (1/10; 10%).

External control methodology

All 10 studies used a historical ECA; we did not identify any studies that employed contemporaneously collected external data. Various methods were used to generate ECAs among the 10 studies that compared the active arm(s) of an RCT to external control data (table 2). Statistical methods used to adjust the external dataset included propensity score matching (2/10; 20%), propensity score weighting (1/10; 10%) and Bayesian dynamic borrowing (2/10; 20%). One study did not specify the weighting method employed (1/10; 10%). Four studies (4/10; 40%) did not make any statistical adjustments to ECA data to balance characteristics between active and external arms. Nine out of 10 studies (90%) made comparisons between the active arm and ECA for at least one efficacy outcome. In all cases, efficacy outcomes were defined by an indication-specific disease activity index and were assessed as dichotomous (9/9; 100%) or continuous (1/9; 11.1%) endpoints. Two studies (2/10; 20%) used external data to contextualise safety outcomes (ie, adverse events and serious adverse events).

Table 2

External control arm methodology in randomised controlled trials

Indications for use of an ECA

Three of the included RCTs (3/10; 30%) reported that comparisons to external data were made due to ethical issues associated with randomising participants to a placebo arm (table 2).20 22 23 All were phase 3 trials in paediatric populations, with 2 of these studies (2/3; 66.7%) adopting a study design that included an ECA following consultations with regulatory agencies.22 23 One study (1/10; 10%) additionally cited poor study recruitment due to non-acceptance of an internal control in the paediatric setting as a reason for incorporating external data.22 23 Augmenting the small sample size of the internal placebo group was provided as the rationale for introducing external placebo data into the primary analysis in 1 of the included RCTs (1/10; 10%).19

Single-arm studies

Characteristics of included studies

The majority of the included studies (33/43; 77%) were single-arm trials that relied on external data to provide context to treatment outcomes (table 3).29–61 Among these were one phase 2a trial (1/33; 3%), one phase 2b trial (1/33; 3%), one phase 2–3 trial (1/33; 3%), four phase 3 long-term RCT extensions (4/33; 12%) and 4 post-marketing surveillance studies (4/33; 12%).

Table 3

Characteristics of included single-arm studies

Most of the included single-arm trials enrolled adult participants (27/33; 81%), with some studies conducted in paediatric (4/33; 12%) and mixed-age (2/33; 6%) populations. Therapies for the treatment of inflammatory bowel disease (13/33; 40% (ulcerative colitis: 5/33; 15%, Crohn’s disease: 1/33; 3%)), rheumatoid arthritis and juvenile idiopathic arthritis (9/33; 27%), ankylosing spondylitis (4/33; 12%), psoriasis (2/33; 6%), and chronic refractory pouchitis (1/33; 3%) were evaluated. Four of the 33 single-arm trials (12%) administered treatment to a study population comprising participants with differing IMIDs. Pharmacological interventions were most frequently assessed (21/33; 64%). These included biologics (9/33; 31%), biosimilars (4/33; 12%), corticosteroids (2/33; 6%), immunosuppressives (3/33; 9%), antibiotics (1/33; 3%) and 5-aminosalicylic acid (1/33; 3%). The remaining 12 single-arm trials investigated a dosing strategy (4/33; 12%), treatment algorithm (2/33; 6%), surgical procedure (3/33; 9%), rapid infusion protocol (2/33; 6%), or faecal microbiota transplantation (1/33; 3%).

Unpublished historical case series of patients previously treated at the same site(s) as active arm participants was the most common ECA source (17/33; 52%) (table 3). External data were also derived from 1 or more prior phase 3–4 RCTs (6/33; 18%), product package inserts (2/33; 6%), published case series (2/33; 6%), disease registries (2/33; 6%), administrative claims databases (2/33; 6%), and previously conducted cohort studies (2/33; 6%). Two studies (2/33; 6%) pooled participant data from several different types of sources. The ECAs formed in single-arm trials comprised participants who had received standard of care (20/33; 61%), an active control therapy (10/33; 30%), or placebo (3/33; 9%).

External control methodology

All 33 included single-arm trials used a historical control, rather than a contemporaneous ECA. In most studies, external datasets were not statistically adjusted for known covariates to improve the comparability of external and target study populations (26/33; 79%) (table 4). Five of the 33 studies applied a matching methodology (15%), including propensity score matching (2/33; 6%), and 1 study employed a propensity score weighting (3%) approach to improve the comparability of active and external groups. Twenty-four of the included single-arm trials (73%) evaluated one or more efficacy outcomes. These studies assessed dichotomous (20/33; 60%) and continuous (7/33; 21%) efficacy endpoints based on an indication-specific disease activity index (18/33; 54.5%). Safety outcomes were compared with external data in 15 studies (45.4%). These included the incidence of adverse events (10/33; 30%), adverse infusion reactions (3/33; 9%), postsurgical complications (1/33; 3%) and serious adverse events (1/33; 3%).

Table 4

External control arm methodology in single-arm studies

Indication for use of an ECA

In 9 out of 33 single-arm trials (27%),32 36 37 43–46 51 52 it was considered unethical to randomise participants to an internal control arm due to the availability of approved therapies with established efficacy and safety (7/33; 21%), long-term assessment of treatment outcomes (6/33; 18%), or paediatric study population (2/33; 6%) (table 4). Six trials (6/33; 18%)33 35 40 42 55 57 cited practical challenges as the rationale for using an ECA. In five of these cases (5/33; 15%), an adequately powered RCT would reportedly not have been feasible due to the relatively small patient population available for study participation.

Quality assessment

The results of the methodological quality assessment are reported in online supplemental file 1. Of the 215 (43 studies × 5 domains) ratings, 99 (46%) were high quality, 113 (52%) were unclear quality and 3 (1%) were low quality with respect to the source used to generate the ECA. The domains that were rated for each study included the quality of the data source (40% high quality, 60% unclear quality), similarity of study populations (43% high quality, 57% unclear quality), similarity of outcome definitions (72% high quality, 21% unclear quality, 7% low quality), and reliability and comprehensiveness of the external dataset (21% high quality, 79% unclear quality).

Discussion

In recent years, there has been increased interest among researchers, regulators and industry partners in using external control data to address the practical limitations associated with traditional RCT designs. This systematic review aimed to summarise how ECAs are currently being integrated into studies of Crohn’s disease, ulcerative colitis, chronic refractory pouchitis, rheumatoid arthritis, juvenile idiopathic arthritis, ankylosing spondylitis, psoriatic arthritis, psoriasis and atopic dermatitis, and to examine the methodological quality of this research. We identified 43 trials in which an ECA was used to contextualise the efficacy or safety of a medical therapy. No trials of atopic dermatitis were identified. Most of the included clinical trials were single-arm studies, and all used historical control data to form an ECA. Among included RCTs, external data were most often sourced from multiple previously conducted RCTs. Conversely, when the clinical trial was a single-arm design, the most frequent source of the ECA was participants who had previously received treatment at the same site as the active arm population. In most cases, no adjustment methods were used to balance baseline characteristics between the treatment arm and the historical data used to form the ECA. We did not identify any specific characteristics of data sources, outcome measure selection, analysis method or ECA type that were associated with a positive or negative study result.

All primary phase 3 trials of IMIDs identified in this systematic review that were submitted to or relied on by regulatory agencies for decision-making were in the paediatric setting (4/43; 9%), where an internal placebo group may have been deemed unethical by stakeholders. In a phase 3 trial that supported FDA and EMA approval of infliximab in paediatric ulcerative colitis, Hyams et al20 created a historical control group using pooled placebo data from prior infliximab RCTs in adults. A positive result was determined by using the upper limit of the 95% CI for the primary endpoint in the adult trials to define the lower 95% CI threshold in the paediatric trial. Croft et al22 assessed the efficacy and safety of adalimumab in a phase 3 study of paediatric ulcerative colitis. This study encountered recruitment delays due to poor acceptance of the placebo group and, following consultation with the FDA and EMA, a meta-analysis of historical adult placebo groups was used to replace the internal control. Adalimumab subsequently received regulatory approval in this patient population.22 In indications where historical paediatric data may be limited, leveraging prior adult placebo data may be acceptable and advantageous to shorten the time between regulatory approval of advanced therapies in adult and paediatric patients.
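The threshold-style decision rule described for Hyams et al above, in which the lower 95% confidence limit in the paediatric trial must exceed a value fixed in advance from adult data, can be illustrated with a short calculation. This is a sketch under stated assumptions: the Wilson score interval is chosen here for convenience and is not necessarily the interval used in that trial, and the responder counts and the 0.40 threshold are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def exceeds_threshold(successes: int, n: int, threshold: float) -> bool:
    """True if the lower confidence limit lies above the prespecified threshold."""
    lower, _ = wilson_interval(successes, n)
    return lower > threshold


# Hypothetical single-arm result: 44 responders out of 60 participants,
# compared with a hypothetical threshold of 0.40 derived from historical data.
lower, upper = wilson_interval(44, 60)
print(round(lower, 3), round(upper, 3))
print(exceeds_threshold(44, 60, threshold=0.40))  # True
```

A rule of this kind replaces a between-arm comparison with a comparison against a fixed benchmark, so its validity rests on how well the historical benchmark reflects the response expected without treatment in the new population.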

In a phase 3 study that supported FDA and EMA approval of secukinumab in paediatric psoriasis, Magnolo et al23 adopted a Bayesian dynamic borrowing approach using historical adult and paediatric placebo data from previous RCTs. Low-dose and high-dose secukinumab were found to be superior to historical placebo with respect to the coprimary and key secondary outcomes, with an estimated probability of a positive treatment effect of 100% compared with external placebo for both dosing regimens. Finally, Horneff et al44 conducted a single-arm phase 3b study to expand EMA approval to three under-studied categories of juvenile idiopathic arthritis (extended oligoarticular juvenile idiopathic arthritis, enthesitis-related arthritis, and psoriatic arthritis). Given the paediatric population, and well-established efficacy and safety of etanercept in polyarticular-course juvenile idiopathic arthritis, it was deemed unethical to randomise participants to receive placebo. Although etanercept response rates were statistically superior compared with the ECA, external placebo data were primarily sourced from a meta-analysis of paediatric RCTs that had been published nearly a decade prior.
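The idea of borrowing historical placebo data, and of reporting a posterior probability of a positive treatment effect as in the study by Magnolo et al above, can be sketched with a beta-binomial model. The example is deliberately simplified: it uses a power prior with a fixed, analyst-chosen weight, whereas dynamic borrowing lets the weight depend on the observed discrepancy between historical and current data. It is not the model used in that trial, and all function names and counts are hypothetical.

```python
import random
from typing import Tuple


def borrowed_posterior(x_cur: int, n_cur: int, x_hist: int, n_hist: int,
                       weight: float, a0: float = 1.0, b0: float = 1.0
                       ) -> Tuple[float, float]:
    """Beta posterior (a, b) for a response rate.

    Historical responders and non-responders are discounted by `weight`
    (0 = ignore historical data, 1 = pool fully) before being added to
    the current data and a Beta(a0, b0) initial prior.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    a = a0 + weight * x_hist + x_cur
    b = b0 + weight * (n_hist - x_hist) + (n_cur - x_cur)
    return a, b


def prob_treatment_better(treat: Tuple[float, float],
                          control: Tuple[float, float],
                          draws: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(treatment rate > control rate)."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(*treat) > rng.betavariate(*control)
               for _ in range(draws))
    return wins / draws


# Hypothetical data: 2/10 responders on concurrent placebo, 30/200 on
# historical placebo (down-weighted by half), 30/40 on active treatment.
control = borrowed_posterior(2, 10, 30, 200, weight=0.5)
treat = borrowed_posterior(30, 40, 0, 0, weight=0.0)
print(control, treat)  # (18.0, 94.0) (31.0, 11.0)
print(prob_treatment_better(treat, control))
```

Setting the weight to 0 recovers an analysis of the current data alone, which is a natural sensitivity analysis for any borrowing scheme.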

If the historical ECA and target study sample substantially differ due to population drift, changes in eligibility criteria, or evolving standards of care, treatment effects may be misestimated. The primary method for mitigating estimation bias is careful selection of the studies from which the historical ECA is drawn. Statistical methods can also be used to adjust for imbalances between historical and target study samples. Propensity score matching involves performing regression on a set of prespecified baseline covariates to determine a propensity score for each participant. This enables the creation of matched sets of historical and target study participants with similar propensity score values.55 62 However, propensity score matching and other methods of balancing that use prespecified covariates have important limitations. First, confounders must be correctly identified and accurately measured. When matching is based on observed covariates, the historical ECA may not be representative of the parent population and therefore not suitable for comparison with the target study.62 Second, propensity score matching may result in ‘pruning’, where individuals with extreme covariate values have no suitable matches and are systematically excluded from the ECA, which can increase imbalance and bias.63 With Bayesian dynamic borrowing, data are weighted based on the level of discrepancy between the historical control and target study.23 64 When correctly performed, dynamic borrowing can provide additional information and augment sample size with a sufficient degree of scepticism to reduce bias. As with propensity score matching, dynamic borrowing requires correct specification of potential confounders to determine whether the ECA and target study data are closely matched. Evidence suggests that dynamic borrowing may carry an inflated chance of type I error; however, the benefits of additional power and precision may outweigh this risk.65
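The propensity score matching procedure outlined above, including the 'pruning' of participants without a suitable match, can be sketched in a few lines. This is an illustrative sketch rather than a recommended implementation: it assumes a logistic regression fitted by plain gradient descent, greedy 1:1 nearest-neighbour matching without replacement, and a caliper on the propensity score. The covariate values, function names and caliper are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def propensity(w: Sequence[float], x: Sequence[float]) -> float:
    """Estimated probability of being a trial (not external) participant."""
    return _sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Fit [intercept, coefficients...] by batch gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = propensity(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def match(trial_scores: Sequence[float], external_scores: Sequence[float],
          caliper: float = 0.05) -> List[Tuple[int, int]]:
    """Greedy 1:1 matching; trial participants with no external
    participant within the caliper are left unmatched (pruned)."""
    available = set(range(len(external_scores)))
    pairs = []
    for i, s in enumerate(trial_scores):
        if not available:
            break
        j = min(sorted(available), key=lambda c: abs(external_scores[c] - s))
        if abs(external_scores[j] - s) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical standardised baseline covariate; 1 = trial, 0 = external.
X = [[-1.2], [-0.8], [-0.3], [0.1], [0.4], [0.9], [1.3], [-0.5], [0.6], [1.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
w = fit_logistic(X, y)
scores = [propensity(w, x) for x in X]
trial = [s for s, g in zip(scores, y) if g == 1]
external = [s for s, g in zip(scores, y) if g == 0]
print(match(trial, external, caliper=0.1))
```

Only covariates that are measured in both datasets can enter the score, so the sketch inherits the limitation noted above: unmeasured or mismeasured confounders remain unbalanced however close the matched scores are.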

When evaluating the data collection process, similarity of the populations, similarity of the outcome definitions, the reliability and comprehensiveness of the datasets, and other limitations, only 3 of the 215 items (1%) assessed were rated as low quality. However, half of the items (113/215; 52.5%) could not be assigned a definitive low-quality or high-quality rating due to insufficient detail reported in the study publication. The included studies frequently failed to describe in detail the eligibility criteria and baseline characteristics of the active arm and ECA, outcome definitions, covariates, whether there were missing data, and the analytical methods employed to balance the cohort.

Comprehensive reporting is therefore necessary to critically appraise the quality of ECA studies, particularly when real-world data are used. This is underscored by the external control quality checklist proposed by Thorlund et al18 and in draft guidance by the FDA and EMA which state that adequate documentation during data curation and transformation is essential to increase confidence in the resultant data.11 13 The FDA has additionally issued draft guidance outlining key considerations for designing externally-controlled clinical trials, selecting fit-for-use ECA data sources, analysing comparisons and accounting for potential bias.2 While this document does not recommend specific statistical methodologies for making comparisons to external data, it is noted that underlying assumptions should be identified and further examined using sensitivity analyses and model diagnostics. The need to assess the impact of missing and misclassified data in the ECA, which may be more common when using real-world data sources, is emphasised.66

Moreover, objective and reliable measurements for the data of interest are recommended to mitigate bias due to misclassified data and lack of blinding. Within the context of IMIDs, this may involve restricting outcome measures to the most objective item(s) included in composite disease activity indices. For example, centrally read endoscopic improvement, as defined by the Mayo Endoscopic subscore, was selected as the primary outcome in a phase 2 ulcerative colitis trial of a novel biologic by Danese et al,55 which reported a statistically significant result. Trial patients were matched with historical placebo patients from prior RCTs using propensity score analysis, and placebo endoscopic response rates were used to determine the null hypothesis.

Regulatory submissions are evaluated on a case-by-case basis, and approvals are often conditional on additional evidence from postapproval studies.14 Our results suggest that comprehensive ECA guidelines may be particularly beneficial for IMID research. The patient populations, standards of care, outcome definitions and study timepoints have substantially evolved over time in IMIDs, and all studies identified in the current systematic review included historical controls. Ensuring that appropriate source data and optimal statistical methodologies are selected may therefore be of particular importance within this context.

Our study has some important strengths. We examined how external controls have been implemented in IMID trials using a predefined systematic approach. To our knowledge, this is the first systematic review on the use of ECAs in these populations. This qualitative summary may help inform ECA data source selection in future externally controlled trials of IMIDs and help in trial design where a treatment arm might be useful in the future as an ECA. However, the limitations of this study should also be acknowledged. First, while we aimed to devise a highly sensitive search strategy using text words and controlled vocabulary, the language used to describe ECA methodology is varied, and ECA-specific MeSH terms do not exist. In an effort to ensure that all eligible studies were identified, bibliographies of relevant review articles were also searched. Nine of the 2192 records screened were identified using this approach. Furthermore, only published studies were searched and included in this systematic review. Second, the quality of included studies was rated using an instrument that was not developed and validated to appraise ECA methodology, as no such instrument currently exists. Our quality assessment exclusively focused on the source of the external control data. We did not evaluate the methods used to generate external controls given that trials were eligible for inclusion regardless of whether they used trial-level or patient-level data.

In conclusion, we found that external control data have been applied in a variety of IMID settings to contextualise efficacy and safety outcomes. The reporting of the methodology used to generate and analyse ECAs has been incomplete and heterogeneous. The establishment of authoritative reporting guidelines may serve as a catalyst for transparent reporting and rigorous study design for ECA studies in IMIDs.

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

Footnotes

  • Contributors Study concept and design: VJ. Acquisition, analysis and interpretation of data: AZ, CP, JC, JKM, and RE. Draft of initial manuscript: AZ, BN, CP and RE. Critical revision of the manuscript for important intellectual content: AZ, BGF, BN, CP, CM, GZ, JC, JKM, RE, SS and VJ. Guarantor of article: VJ. All authors have made substantial contributions to the manuscript and approved the final version of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests AZ is an employee of Alimentiv. RE has received consulting fees from Alimentiv, CADTH, SpringWorks Therapeutics, and Turnstone Biologics. CEP is an employee of Alimentiv. JKM is an employee of Alimentiv. JC is a former employee of Alimentiv. BN is an employee of Alimentiv. GZ is an employee of Alimentiv. SS has received personal fees from Pfizer for ad hoc grant review, and his institute has received research grants from AbbVie and Janssen. BGF has received grant/research support from AbbVie, Amgen, AstraZeneca/MedImmune, Atlantic Pharmaceuticals, Boehringer-Ingelheim, Celgene Corporation, Celltech, Genentech Inc./Hoffmann-La Roche, Gilead Sciences, GSK, Janssen Research & Development, Pfizer, Receptos/Celgene International, Sanofi, Santarus, Takeda Development Center Americas, Tillotts, and UCB; consulting fees from AbbVie, Akebia Therapeutics, Allergan, Amgen, AMT, Aptevo Therapeutics, Astra Zeneca, Atlantic Pharma, Avir Pharma, Biogen Idec, BioMx Israel, Boehringer-Ingelheim, Bristol-Myers Squibb, Calypso Biotech, Celgene, Elan/Biogen, EnGene, Ferring Pharma, Roche/Genentech, Galapagos, GiCare Pharma, Gilead, Gossamer Pharma, GSK, Inception IBD, Johnson & Johnson/Janssen, Kyowa Hakko Kirin Co, Lexicon, Lilly, Lycera BioTech, Merck, Mesoblast Pharma, Millennium, Nestle, Nextbiotix, Novo Nordisk, Pfizer, Prometheus Therapeutics and Diagnostics, Progenity, Protagonist, Receptos, Salix Pharma, Shire, Sienna Biologics, Sigmoid Pharma, Sterna Biologicals, Synergy Pharma, Takeda, Teva Pharma, TiGenix, Tillotts, UCB, Vertex Pharma, Vivelix Pharma, VHsquared, and Zyngenia; speaker’s bureau fees from Abbott/AbbVie, JnJ/Janssen, Lilly, Takeda, Tillotts, and UCB Pharma; is a scientific advisory board member for Abbott/AbbVie, Allergan, Amgen, Astra Zeneca, Atlantic Pharma, Avaxia Biologics, Boehringer-Ingelheim, Bristol-Myers Squibb, Celgene, Centocor, Elan/Biogen, Galapagos, Genentech/Roche, JnJ/Janssen, Merck, Nestle, Novartis, Novo Nordisk, Pfizer, Prometheus Laboratories, Protagonist, Salix Pharma, Sterna Biologicals, Takeda, Teva, TiGenix, Tillotts, and UCB; and is the Senior Scientific Officer of Alimentiv. CM has received consulting fees from AbbVie, Alimentiv, Amgen, AVIR Pharma, BioJAMP, Bristol Myers Squibb, Celltrion, Ferring, Fresenius Kabi, Janssen, McKesson, Mylan, Takeda, Pendopharm, Pfizer, Prometheus Biosciences, Roche, Sanofi; speaker's fees from AbbVie, Amgen, AVIR Pharma, Alimentiv, Bristol Myers Squibb, Ferring, Fresenius Kabi, Janssen, Organon, Pendopharm, Pfizer, Takeda; royalties from Springer Publishing; research support from Ferring, Pfizer. VJ has received consulting/advisory board fees from AbbVie, Alimentiv, Arena pharmaceuticals, Asahi Kasei Pharma, Asieris, Astra Zeneca, Bristol Myers Squibb, Celltrion, Eli Lilly, Ferring, Flagship Pioneering, Fresenius Kabi, Galapagos, GlaxoSmithKline, Genentech, Gilead, Janssen, Merck, Metacrine, Mylan, Pandion, Pendopharm, Pfizer, Protagonist, Prometheus, Reistone Biopharma, Roche, Sandoz, Second Genome, Sorriso pharmaceuticals, Takeda, Teva, Topivert, Ventyx, Vividion; speaker’s fees from AbbVie, Ferring, Bristol Myers Squibb, Galapagos, Janssen, Pfizer, Shire, Takeda, Fresenius Kabi.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.
