Original research
Systematic review of international Delphi surveys for core outcome set development: representation of international patients
  1. Alice Lee1,
  2. Anna Davies2,
  3. Amber E Young3,4
  1. 1 Academic Foundation Doctor, Department of Surgery and Cancer, Imperial College London, London, UK
  2. 2 Senior Research Fellow, Centre for Academic Child Health, University of Bristol, Bristol, UK
  3. 3 Consultant Paediatric Anaesthetist and Lead Children's Burns Research Centre, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
  4. 4 Senior Research Fellow, Bristol Centre for Surgical Research, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  1. Correspondence to Dr Amber E Young; amber.young1{at}


Objectives A core outcome set (COS) describes a minimum set of outcomes to be reported by all clinical trials of one healthcare condition. Delphi surveys are frequently used to achieve consensus on core outcomes. International input is important to achieve global COS uptake. We aimed to investigate participant representation in international Delphi surveys, with reference to the inclusion of patients and of participants from low and middle income countries (LMICs) as stakeholders.

Design Systematic review.

Data sources EMBASE, Medline, Web of Science, COMET database and hand-searching.

Eligibility criteria Protocols and studies describing Delphi surveys used to develop an international COS for trial reporting, published between 1 January 2017 and 6 June 2019.

Data extraction and synthesis Delphi participants were grouped as patients or healthcare professionals (HCPs). Participants were considered international if their country of origin was different to that of the first or senior author. Data extraction included participant numbers, country of origin, country income group and whether Delphi surveys were translated. We analysed the impact these factors had on outcome prioritisation.

Results Of 90 included studies, 69% (n=62) were completed and 31% (n=28) were protocols. Studies recruited more HCPs than patients (median 60 (IQR 30–113) vs 30 (IQR 14–66) participants, respectively). A higher percentage of HCPs than patients was international (57% (IQR 37–78) vs 20% (IQR 0–68)). Only 31% (n=28) of studies recruited participants from LMICs. Regarding recruitment from LMICs, patients were under-represented (16% of studies; n=8) compared with HCPs (22%; n=28). Few (7%; n=6) studies translated Delphi surveys. Only 3% of studies (n=3) analysed Delphi responses by geographical location; all found differences in outcome prioritisation.

Conclusions There is a disproportionately lower inclusion of international patients, compared with HCPs, in COS-development Delphi surveys, particularly within LMICs. Future international Delphi surveys should consider exploring for geographical and income-based differences in outcome prioritisation.

PROSPERO registration number CRD42019138519.

  • Delphi
  • core outcome set
  • patient and public involvement
  • low and middle income countries
  • international

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • We conducted a comprehensive systematic search of the literature including three search engines, the Core Outcome Measures in Effectiveness Trials database and hand-searching.

  • This is the first review to present stakeholder-level data on the country of origin and income status of Delphi survey participants for core outcome set studies.

  • Study reporting on participant demographics was inconsistent and some data were not in a compatible format for data analysis.

  • The literature search was time limited (January 2017 to June 2019) and restricted to English language articles.

Introduction

A core outcome set (COS) is a minimum group of outcomes to be reported in all trials of a specific health condition. Development is undertaken to reduce the heterogeneity of outcome reporting across trials and to enable study results to be compared and combined to inform best medical practice. Core outcomes are identified scientifically by stakeholders as being the most important in determining the effects of an intervention or treatment in one healthcare condition.1–3 Consensus in outcome prioritisation among stakeholders in COS development is often achieved using Delphi surveys.2 4 Recruitment of a diverse group of stakeholders is recommended, including patients, healthcare professionals (HCPs), trialists, regulators, industry representatives, policymakers, researchers and the public.2 The Delphi process comprises iterative rounds of surveys in which the importance of outcomes is rated.2 After each round, the participants’ individual responses, and those of other stakeholders, are fed back in an anonymised manner, so that they can be reconsidered before the next round with the aim of achieving consensus. The Delphi method is advantageous because it incorporates the views of various stakeholder groups and can be conducted electronically (‘e-Delphi’) to facilitate international participation.1 2 4

International participation in Delphi surveys is important for the COS to be applicable in global healthcare settings, and because widespread uptake of the COS will facilitate the future synthesis of trial evidence on an international scale.2 An increasing number of COS developers are including international participants.5–93 A recent survey reported that approximately 50% of published COS projects from the last 5 years included participants from two or more countries.92 Despite this increase in international stakeholder participation, there is no agreement on how study methodology should be adapted to facilitate such input. The Core Outcome Measures in Effectiveness Trials (COMET) Handbook highlights the logistical and organisational challenges of international COS development projects as well as issues regarding generalisability of small international participant numbers.2 The effect of income group and/or geographical location of Delphi participants on core outcome prioritisation has not been systematically investigated, despite international variation in healthcare resources, biomedical beliefs and burden of disease.5 73 81

Using a systematic review of the literature, this study aims to explore participant representation in international Delphi surveys used for COS development. Part of our analysis will explore how COS projects undertaking international Delphi surveys evaluate the impact of participants from countries from different World Bank income groups on prioritising the importance of outcomes.

Methods

This systematic review adheres to a prespecified protocol and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement.94 The protocol for this review was registered on PROSPERO (registration number CRD42019138519).

Identification of studies

Study eligibility

COS development studies/protocols were included where international participants took part in a Delphi survey to prioritise outcomes for a COS, published between 1 January 2017 and 6 June 2019. The inclusion and exclusion criteria are shown in table 1. This time period was chosen to reflect the 2017 publication of the COMET Handbook2 to maximise study homogeneity. Protocol studies were included for methodological detail if equivalent full-length studies were not available, because the review focused on evaluating methodology, not study outcomes. Full-text and protocol studies describing the same international Delphi survey were merged as one reference.

Table 1

Study inclusion and exclusion criteria

Types of participants

For inclusion in the review, at least one participant in the Delphi survey had to be international, defined as a different country of origin to the first or senior author. Participants were included if they completed at least round 1 of the Delphi survey. Those who registered but did not participate were not included.

Types of interventions

Included studies used an international Delphi survey for prioritisation of outcomes at some point in the consensus process (including modified Delphi surveys and mixed methods approaches). Any study intervention and comparator were acceptable.

Types of outcome

Included studies used an international Delphi survey to prioritise any outcomes for a health condition or intervention to form a COS.

Search strategy

A search string was developed to identify relevant papers. These included key search terms and their synonyms (eg, COS, international, Delphi) and relevant medical subject headings. The search string for Ovid MEDLINE is shown in online supplemental appendix 1 and was adapted for different databases (Ovid EMBASE, Web of Science).

Study selection process

Search results were compiled using Mendeley (V.1.19.4). Citations were deduplicated using Mendeley software and manually. Article screening was undertaken by one researcher (AL) against prespecified inclusion and exclusion criteria (table 1) in two stages (title and abstract and full text). For both stages, a second researcher (AD) independently assessed 20% of the screening results. Inter-rater reliability between researchers was assessed with Cohen’s kappa.95 If discrepancies in article selection could not be resolved, a third researcher (AY) was consulted.
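The agreement statistic used for double screening can be sketched as follows. This is a minimal illustration of Cohen's kappa for two raters, not the authors' own calculation; the function name `cohens_kappa` and the ten screening decisions are hypothetical and are not data from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical screening decisions for ten citations (illustrative only).
first_screener = ["include", "include", "exclude", "exclude", "exclude",
                  "include", "exclude", "exclude", "include", "exclude"]
second_screener = ["include", "exclude", "exclude", "exclude", "exclude",
                   "include", "exclude", "exclude", "include", "exclude"]
kappa = cohens_kappa(first_screener, second_screener)
print(round(kappa, 2))
```

Here the two screeners agree on nine of ten citations, but kappa is lower than 0.9 because some of that agreement would be expected by chance alone.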

Quality assessment

A risk of bias assessment was not undertaken because the review aimed to assess study methodology and not the effect of study interventions. There is currently no risk of bias assessment tool for COS development or Delphi surveys, and tools for assessing risk of bias in trials or observational studies are not applicable to these reviewed studies.

Data extraction

Data were extracted using a piloted data extraction form (Microsoft Excel) developed for the purpose of the review. Data were extracted under the following domains: study details (year of publication, full text/protocol study and COMET disease category), participant numbers, international status of participants, participant geographical location and income group, the effect of these on prioritisation and whether the Delphi survey was translated.

Number of Delphi participants overall and per stakeholder group

Participants were grouped into two stakeholder groups: patients and representatives (carers and representatives from patient organisations) and HCPs (medical professionals, trialists, regulators, industry representatives, policymakers and researchers). Data were extracted on number of participants (total and per stakeholder group). We recorded number of participants per study based first on the number of participants from the Delphi round which included both patients and HCPs. If both stakeholder groups were included throughout or the study only included either patients or HCPs from the outset, we extracted number of participants from the Delphi round with the largest number of participants.

International status of participants and effect on outcome prioritisation

Participants were categorised as international if their country of origin was not the same as either the first or senior author of the study. Other demographic data included participants’ countries of origin and the World Bank world regions and World Bank income groups represented by these countries. All study texts were scrutinised for any description of analysis of Delphi responses by geographical location or income status, and if so, the outcome of this analysis.
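The classification rule described above might be expressed as in the sketch below. It assumes one interpretation of the definition, namely that a participant is international only when their country differs from both the first and the senior author's country, and it assumes one country per author. The function names and the five-person panel are hypothetical.

```python
def _normalise(country):
    """Compare country names without regard to case or surrounding spaces."""
    return country.strip().casefold()


def is_international(participant_country, first_author_country, senior_author_country):
    """True if the participant's country matches neither author's country (assumed reading)."""
    author_countries = {_normalise(first_author_country), _normalise(senior_author_country)}
    return _normalise(participant_country) not in author_countries


def percent_international(participant_countries, first_author_country, senior_author_country):
    """Percentage of participants classified as international for one study."""
    if not participant_countries:
        raise ValueError("at least one participant is required")
    flagged = sum(
        is_international(country, first_author_country, senior_author_country)
        for country in participant_countries
    )
    return 100 * flagged / len(participant_countries)


# Hypothetical panel: first author based in the UK, senior author in the USA.
hypothetical_panel = ["UK", "UK", "USA", "India", "Canada"]
share = percent_international(hypothetical_panel, "UK", "USA")
print(share)
```

Under this reading, authors with several affiliations would need a set of countries each, which would tend to lower the apparent percentage of international participants.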

Delphi survey translation

Studies that recruited participants from non-English-speaking countries were scrutinised for any description of survey translation, and if so, details of the translated languages and method of translation.

Data synthesis

Data from individual studies were tabulated by one author (AL). A second researcher (AD) independently extracted data from 20% of included studies. Inter-rater reliability was assessed with Cohen’s kappa.95 If discrepancies in data extraction could not be resolved, a third researcher (AY) was consulted. Quantitative, non-parametric data were analysed using Microsoft Excel to calculate medians and IQRs for number of participants, percentage of international participants and number of countries of origin, World Bank world regions and World Bank income groups. Categorical data were described narratively, including participant countries of origin, survey translation and the outcomes of analysis of responses by geographical location or income status.
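The summary statistics described above can be reproduced outside a spreadsheet. The sketch below uses Python's standard library and assumes the inclusive quartile definition (the one behind Excel's `QUARTILE.INC`); the review does not state which Excel function was used, so that choice is an assumption. The participant counts are hypothetical.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, lower quartile, upper quartile) using inclusive quartiles."""
    # method="inclusive" interpolates between the observed minimum and maximum.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical participant counts from seven studies (illustrative only).
participants_per_study = [12, 30, 45, 60, 75, 113, 200]
summary = median_iqr(participants_per_study)
print(summary)
```

Different quartile conventions can shift the IQR bounds slightly in small samples, so the convention is worth reporting alongside the results.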

Patient and public involvement

No patients were involved in this review of previously published data.

Results

Identification of studies

The electronic search identified 529 non-duplicate citations, of which 90 were included in the final data set (figure 1). Cohen’s kappa demonstrated very good (0.81) and good (0.71) agreement between researchers performing screening at the abstract and full-text stages, respectively.96 Of the 90 included studies (online supplemental appendix 3), 69% (n=62) were completed and 31% (n=28) were study protocols. The greatest number of Delphi studies were published from the UK (42%; n=38; online supplemental appendix 3) and the three most frequent COMET disease categories were pregnancy and childbirth (14%, n=13), gastroenterology (9%, n=8) and orthopaedics and trauma (9%, n=8); online supplemental appendix 4.

Figure 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram. COS, core outcome set.

Number of Delphi participants overall and per stakeholder group

The median number of participants per study was 100 (table 2). Most studies (77%; n=69) included both patients and HCPs. Of the 23% of studies (n=21) with only one stakeholder group, 95% (n=20) recruited only international HCPs. Of the 69 studies that included both stakeholder groups, 70% (n=48) recruited international participants in both groups, 19% (n=13) recruited international participants in only one group (92% of which recruited only international HCPs) and in 12% (n=8), the international status of stakeholder groups was unclear. Studies recruited two times as many HCPs as patients (median 60 vs 30 participants).

Table 2

Demographic data for the overall cohort, patients and HCPs

International status of participants and effect on outcome prioritisation

The median percentage of international participants per study was 52%. The median percentage of international participants was almost three times higher for HCPs than for patients (57% vs 20%). The total number of countries represented across the included studies was 95 for HCPs and 46 for patients. Within these studies, the median number of countries represented in each Delphi was 11 for HCPs and 2 for patients.

Participants were recruited from every World Bank world region for both stakeholder groups. HCPs represented two times as many World Bank world regions as patients (four vs two regions). The most frequent countries of participant origin were the USA and UK for both HCPs and patients (figure 2). The most frequent World Bank world regions reported for both HCPs and patients were Europe and Central Asia, followed by North America and East Asia and the Pacific (figure 3). Compared with HCPs, fewer studies recruited patients from certain world regions (Sub-Saharan Africa, Latin America and the Caribbean and East Asia and the Pacific) (figure 3). Only 4% of studies (n=2) recruited patients from Sub-Saharan Africa, compared with 13% (n=10) that recruited HCPs from this region.

Figure 2

World map showing participant countries of origin for (A) patients and (B) HCPs. The colour gradient represents the percentage of studies which recruited patients and HCPs from each country. For patients, this ranged from 2% (light blue) to 28% (dark blue) and for HCPs this ranged from 1% (light blue) to 39% (dark blue). HCPs, healthcare professionals.

Figure 3

Distribution of (A) World Bank world regions and (B) World Bank income groups of participants by overall cohort and stakeholder group. HCPs, healthcare professionals.

Studies most frequently recruited participants from high-income countries (48%; n=43), followed by upper middle income (28%; n=25), lower middle income (16%; n=15) and low-income countries (9%; n=8; figure 3). HCPs were recruited from two times as many World Bank income groups as patients (two vs one groups). Less than half as many studies recruited patients from low-income (2%; n=1) and lower middle income countries (6%; n=3) when compared with studies recruiting HCPs (5%; n=4 and 14%; n=11, respectively).

A minority of studies (4%, n=4) analysed, or stated an intention to analyse, the Delphi survey responses by participants’ geographical location, either by country54 97 or continent.73 Some differences in outcome prioritisation were minor, not affecting the final COS.54 97 Park et al presented results from round 1 of their Delphi survey on patient-reported outcomes (PROs) for adult myositis.54 They found that, unlike participants from the USA and South Korea, Swedish participants rated ‘impact on household activity’ less favourably, although this outcome was still retained for the second Delphi round. Sautenet et al reported consensus outcomes for kidney transplantation among patients, caregivers and HCPs.97 They found that patients/caregivers from certain countries ranked ‘depression’ or ‘cognition’ as less important and that ‘skin cancer’ received greater prioritisation in countries with public campaigns for prevention. Since none of the aforementioned outcomes were ranked in the ‘top eight’ of either patients or HCPs, these differences did not affect the final COS. Van Rijssen et al 73 reported consensus PROs for pancreatic cancer among European, North American and Asian participants. In this study, the outcomes in the final COS would have been different if the responses were analysed by continent rather than as a whole cohort. In comparison to the whole cohort, European participants reached consensus ‘in’ on three additional outcomes, American participants reached consensus ‘in’ for one additional outcome, but did not reach consensus for two outcomes included in the final COS, and Asian participants did not reach consensus on any PROs included in the final COS. A fourth study (protocol) stated an intention to analyse Delphi survey results by ‘language and cultural variation’.17 None of the included studies analysed Delphi responses by income status.

Delphi survey translation

Delphi surveys were translated in a minority of studies that recruited participants from countries in which English was not the first language (16%, n=6/38). This included five studies with both stakeholder groups and one with patients only. Three studies translated the Delphi survey into all languages spoken by the patients54 73 83 and a protocol study mentioned translating the Delphi survey ‘as required’.98 One study translated the Delphi survey from English into Dutch only, despite recruiting patients from many other countries.27 One study translated the Delphi survey from English to Italian but did not list countries of origin for patients.99 Some studies (n=14) excluded non-English-speaking participants, despite recruiting patients and representatives from many countries where English is not the first language.8 13 18 19 21 22 25 36 55 56 59 60 69 79

Discussion

The findings of this systematic review demonstrate that most international Delphi surveys for COS development recruited both HCPs and patients. Patients were recruited in fewer numbers and were less likely to be international and especially from LMICs. A minority of studies altered Delphi survey language, despite recruiting participants from several countries with non-English first languages. Importantly, few studies analysed Delphi responses by geographical location of participants, but those that did found differences in outcome prioritisation.

These findings echo those of annual systematic reviews of COS development studies, demonstrating lower overall recruitment of Delphi participants from LMICs.5 6 Our review adds new, stakeholder-level information, which has identified a disproportionately lower recruitment of patients (overall and particularly within LMICs). There are various possible explanations for this, including lower English language proficiency, reduced internet access (required for e-Delphi surveys and online recruitment), differing biomedical beliefs and lack of resources and/or time for voluntary research participation.100–102 Development of contextual methods to effectively engage international patients from LMICs is required.

Lack of Delphi survey translation is an important barrier to participation, particularly affecting patients from LMICs. Most included studies did not adapt survey language for participants and many excluded non-English speakers. This could have introduced recruitment bias, particularly within LMICs. Researchers have expressed concern that Delphi survey translation could result in loss of comprehension or meaning.2 Some of the included studies have approached this problem by translating outcomes by native-speaking professionals only,98 using an online multilanguage interface103 or by discussing translated outcomes with patient research partners.54 Van Rijssen et al 73 used a method of forward and backward translation, that is, translated the survey from English to the required participant languages and then back into English. Discrepancies between the forward and backward translations were resolved by consensus between the translators and if necessary, the project team. This is the method recommended by the COMET handbook,2 WHO104 and MAPI Institute105 for providing equivalent conceptual, cultural and semantic meaning and should be encouraged as the first-line method after consideration of the costs involved.

Increasing the international participation of Delphi surveys for COS development should be encouraged but raises considerations for study design. These include how to ensure Delphi studies have appropriate and adequate international representation, the criteria to define this (geographically and/or using income status) and how to analyse international data. None of the included studies analysed Delphi responses by income status and few analysed Delphi responses by geographical location of participants. Those that did found differences in prioritisation between participants of different countries of origin54 60 and continents of origin.73 Importantly, in the study by Van Rijssen et al,73 which reported a core set of PROs for pancreatic cancer, participants from Europe, the USA and Asia did not reach the same consensus on the final COS, and Asian participants did not reach consensus on any PROs included in the final set. There is no clear explanation for these discrepancies. For COSs to have global uptake, it is important that the selected outcomes are representative of all international stakeholders. Performing subanalyses of Delphi responses by participant location and/or income status should be encouraged as a useful indication of applicability across different populations, and significant differences in prioritisation should be explored.
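A subanalysis of this kind could be sketched as below. The consensus rule used here (at least 70% of respondents scoring an outcome 7 to 9 on a 1 to 9 scale, and fewer than 15% scoring it 1 to 3) is an assumption chosen for illustration; the included studies may have used other definitions. The function names, region labels and scores are all hypothetical.

```python
from collections import defaultdict


def consensus_in(scores, critical_share=0.70, low_share=0.15):
    """Assumed rule: enough scores of 7-9 and few enough scores of 1-3."""
    critical = sum(7 <= s <= 9 for s in scores) / len(scores)
    low = sum(1 <= s <= 3 for s in scores) / len(scores)
    return critical >= critical_share and low < low_share


def consensus_by_group(responses):
    """responses: iterable of (group, outcome, score); returns consensus per group and overall."""
    grouped = defaultdict(list)
    overall = defaultdict(list)
    for group, outcome, score in responses:
        grouped[(group, outcome)].append(score)
        overall[outcome].append(score)
    # The key ("all", outcome) holds the whole-cohort result for comparison.
    result = {("all", outcome): consensus_in(s) for outcome, s in overall.items()}
    result.update({key: consensus_in(s) for key, s in grouped.items()})
    return result


# Hypothetical ratings of one outcome by participants from two regions.
responses = (
    [("Region A", "pain", s) for s in [8, 9, 7, 8, 9, 8]]
    + [("Region B", "pain", s) for s in [7, 5, 6, 8]]
)
result = consensus_by_group(responses)
for key in sorted(result):
    print(key, result[key])
```

In this made-up example the whole cohort reaches consensus on the outcome while Region B alone does not, which is the type of discrepancy a pooled analysis would hide.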

Another important finding of this systematic review was inconsistent study reporting. In some studies, basic demographic information was unclear or not specified including which stakeholder groups were international and from which countries and world regions they were recruited. Many studies only listed the most frequent countries of origin, provided data in non-standard formats, for example, reporting participant numbers for the USA and Canada combined, or only provided geographic detail for the whole cohort instead of per stakeholder group. Some studies provided demographic data on invited participants, rather than those who had completed at least one round of the Delphi survey. As a result, we were unable to use many studies’ data in our analyses. Without adequate demographic information, it is difficult to interpret the applicability of COSs across different populations. Current reporting checklists for COS development projects, such as Core Outcome Set–STAndards for Reporting (COS-STAR),106 could be adapted to reflect international participation.

The findings of this review should be interpreted in the context of its limitations. The HCP group consisted of various stakeholders with potentially differing views, including healthcare workers, trialists, regulators, industry representatives, policymakers and researchers. Not all studies included each of these constituent groups. For ease of comparison, and because the largest distinction in opinion was likely to be between HCPs and patients, all professionals were combined into one group. This approach has been used previously.107 We described participants as international if their country of origin was different to the affiliated country of either the first or last author of the study. Some authors had multiple affiliations, which may have reduced the apparent percentage of international participants. We used the World Bank world regions and income groups to categorise demographic information from participants, but some studies did not present their demographic data in a compatible format. Furthermore, we restricted our search to articles available in English and published within a recent time frame.

Conclusion

In conclusion, we have demonstrated that Delphi surveys used to develop COSs for clinical trials recruit fewer international patients than HCPs, particularly from LMICs. A lack of Delphi survey translation for non-English speakers may contribute to this. Few studies explored any geographical variation in responses; those that did found differences in outcome prioritisation. This review highlights complex issues that need further discussion, for example, how to define adequate international participation and how to analyse international data, including geographical discrepancies in outcome selection. Future studies should consider exploratory analyses of Delphi responses by geographical location or income status to assess applicability of the COS across populations. Study reporting was inconsistent and could be improved with a standardised checklist for international Delphi surveys.

  • Twitter @AliceEALee

  • Contributors AY devised the initial review question and provided critical review of the search strategy, data collection process and manuscript. AL designed the search strategy and was the primary researcher involved in screening search results, data extraction and drafting of the manuscript. AD provided critical review of the search strategy and acted as a second researcher for purposes of screening and data extraction, as well as critically reviewing the manuscript.

  • Funding This research (AL) was funded by The Scar Free Foundation. The Scar Free Foundation is the only medical research charity focused on scarring with the mission to achieve scar free healing within a generation. This study was also supported by the NIHR Biomedical Research Centre at University Hospitals Bristol and Weston NHS Foundation Trust and the University of Bristol. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

  • Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available in the supplementary material. Requests for data not included in the article or supplementary material can be made to the corresponding author.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.