Identification and categorisation of relevant outcomes for symptomatic uncomplicated gallstone disease: in-depth analysis to inform the development of a core outcome set

ABSTRACT
Background Many completed trials of interventions for uncomplicated gallstone disease are not as helpful as they could be due to a lack of standardisation across studies in outcome definition, collection and reporting. This heterogeneity of outcomes across studies hampers useful synthesis of primary studies and ultimately negatively impacts on decision making by all stakeholders. Core outcome sets offer a potential solution to this problem of heterogeneity and concerns over whether the ‘right’ outcomes are being measured. One of the first steps in core outcome set generation is to identify the range of outcomes reported (in the literature or by patients directly) that are considered important.
Objectives To develop a systematic map that examines the variation in outcome reporting of interventions for uncomplicated symptomatic gallstone disease, and to identify other outcomes of importance to patients with gallstones not previously measured or reported in interventional studies.
Results The literature search identified 794 potentially relevant titles and abstracts of which 137 were deemed eligible for inclusion. A total of 129 randomised controlled trials, 4 gallstone disease specific patient-reported outcome measures (PROMs) and 8 qualitative studies were included. This was supplemented with data from 6 individual interviews, 1 focus group (n=5 participants) and analysis of 20 consultations. A total of 386 individual recorded outcomes were identified across the combined evidence: 330 outcomes (which were reported 1147 times) from trials evaluating interventions, 22 outcomes from PROMs, 17 outcomes from existing qualitative studies and 17 outcomes from primary qualitative research. Areas of overlap existed between the evidence sources, but the primary research also contributed outcomes not previously reported in this context.
Conclusions This study took a rigorous approach to catalogue and map the outcomes of importance in gallstone disease to enhance the development of the COS ‘long’ list. A COS for uncomplicated gallstone disease that considers the views of all relevant stakeholders is needed.


Open access

Strengths and limitations of this study
► This outcome map review is the first to describe the heterogeneity in outcome reporting within uncomplicated symptomatic gallstone disease.
► There is a detailed analysis of all reported outcomes from a range of study designs (both primary and secondary) reporting outcomes of clinical and/or patient relevance.
► A mixed-methods approach was used in both collection and analysis of data.
► Only studies reported in the English language were included in the analysis.
► Quality assessment of included studies was not conducted, as the main purpose of the review was to extract clinical outcomes, not to assess intervention effectiveness.

BACKGROUND
Gallstone disease (cholelithiasis) is one of the most common gastrointestinal disorders worldwide. The prevalence of gallstones is approximately 10%-15% in adult populations and they are more common in women and people over the age of 40. 1 Approximately 80% of those affected by gallstone disease are asymptomatic and can remain so for many years without requiring treatment. However, around 20% of patients with gallstones become symptomatic and may develop gallstone-related complications. These patients are offered symptom control and/or surgical/endoscopic intervention. 2 A significant number of these patients remain symptomatic only (ie, experiencing pain) without developing gallstone-related complications. 2 Recommendations from the recent National Institute for Health and Care Excellence guideline on gallstone disease clearly highlight that patients with gallstones have insufficient information on the effect of cholecystectomy on patient outcomes. 3 The guideline recommends 'research is needed to establish the long-term patient benefits and harms, so that appropriate information can be provided to patients to aid decision-making and long-term management of their condition.' 3 However, many completed trials are not as helpful as they could be due to a lack of standardisation across studies in outcome definition, collection and reporting. This heterogeneity of outcomes across studies also hampers useful synthesis of primary studies in meta-analyses and ultimately negatively impacts on decision making by all stakeholders. In addition to the heterogeneity of outcomes currently reported and the problems this causes, measuring the wrong outcomes (ie, those that are not valued by clinicians or, more importantly, patients) could also be a real risk for many studies if stakeholders are not consulted during the trial design process. One way that these problems with heterogeneity and relevance to stakeholders can be addressed is through the development and use of core outcome sets. 4 5 There is currently no agreed published core outcome set for evaluating interventions to treat symptomatic uncomplicated gallstone disease.

Core outcome sets (COS) aim to define a minimum set of outcomes that should be considered essential for the evaluation and reporting of specific interventions or conditions (ie, the set of outcomes that should always be considered and ideally measured in any evaluation). 4 5 There is a growing body of literature to support the development of core outcome sets. 5 Specifically, they are developed using consensus methods involving stakeholder groups, such as health professionals and patients, so as to ensure that the outcomes being defined are both clinically and personally relevant for the individuals involved. 4 5 Measuring a core outcome set does not preclude the measurement of other outcomes. However, a core set will foster greater consistency in outcome reporting between studies and lead to more meaningful data being available to contribute to meta-analysis. 4 5 Moreover, core outcome sets can minimise the threat of outcome reporting bias by ensuring consistency between what is measured and what is reported. 4 5 Ultimately, they should improve the overall efficiency and quality of the evidence on which healthcare decisions can be made.
A core outcome set for uncomplicated gallstone disease is currently being developed. Details of this project have been registered and included in the Core Outcome Measures in Effectiveness Trials (COMET) Initiative database. 6 Outcome mapping is an important step in the development of core outcome sets: it presents and catalogues the outcomes reported to date, and links the literature review to the subsequent process of consultation and consensus. 7 Therefore, the objective of this paper is to document the outcome mapping process in the development of the core outcome set for symptomatic uncomplicated gallstone disease.

METHODS
The protocol for development of the COS is available on the COMET website: http://www.comet-initiative.org/studies/details/927?result=true.
Identification of outcomes relevant for symptomatic uncomplicated gallstone disease
The identification of outcomes was informed by two sources: existing evidence and new primary research. The specifics of these are detailed below.

Identification of outcomes from existing literature: outcomes reported in trials of interventions for symptomatic uncomplicated gallstone disease and identification of disease specific PROMs
Reported outcomes of interventions for symptomatic uncomplicated gallstone disease were identified by updating the search strategy for a recent systematic review (Brazzelli et al 8 ; this review included two randomised controlled trials, RCTs), by conducting a search for relevant PROMs and by screening the reference lists of relevant Cochrane reviews. In addition, reference lists of systematic reviews identified by the search strategy were checked for relevant RCTs.

Search methods for identification of studies
Extensive electronic searches were undertaken to identify trials for a project on the clinical effectiveness of cholecystectomy and these are reported in full elsewhere. 8 In addition, a specific search of MEDLINE and EMBASE was undertaken to identify studies that report PROM outcome data for cholecystitis, with records retrieved by the main search for trials excluded to avoid duplication of the results. This search was undertaken in May 2016 (1980 to May 2016). The search strategies for MEDLINE and EMBASE are reported in online supplemental appendix 1.

Inclusion criteria for eligible studies were as follows:
Participants: adults aged over 18 years with symptomatic uncomplicated gallstone disease.
Intervention and comparator: any intervention (surgical or non-surgical management, ie, expectant management, dietary advice or medical therapy) used to manage symptomatic uncomplicated gallstone disease in adults.
Outcomes: all reported outcomes were eligible for inclusion.

Studies focusing on asymptomatic gallstone disease or on acute severe cholecystitis, cholangitis or pancreatitis were excluded. In addition, studies including 'complex' gallstone cases (ie, empyema, ascending cholangitis and gallstone ileus) were excluded. Reports published in non-English languages for which a translation could not be organised were also excluded. In addition, lists of included and excluded studies reported in several relevant Cochrane reviews were checked by one reviewer (MC) for potentially relevant studies. 9-14

Study selection and data extraction
One reviewer (MC) screened all titles and abstracts identified by the two search strategies and a second reviewer (RN) checked a 10% random sample. All full-text papers considered potentially eligible were screened by one reviewer (MC) and checked by a second reviewer (RN).
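As an illustration only, the 10% random check sample described above could be drawn reproducibly as in the following Python sketch. The function name, the fixed seed and the choice to round the sample size up are assumptions made for the example and are not taken from the study; the figure of 794 records is the number of titles and abstracts reported in the abstract and is used here simply to make the example concrete.

```python
import math
import random


def draw_check_sample(record_ids, fraction=0.10, seed=2016):
    """Return a reproducible random subset of records for a second reviewer.

    The seed and the rounding rule (round up, at least one record) are
    illustrative assumptions, not details reported by the study.
    """
    if not record_ids:
        return []
    size = max(1, math.ceil(len(record_ids) * fraction))
    rng = random.Random(seed)  # local generator, so the draw is repeatable
    return sorted(rng.sample(list(record_ids), size))


# Hypothetical record identifiers for 794 titles and abstracts.
record_ids = list(range(1, 795))
check_sample = draw_check_sample(record_ids)
print(f"{len(check_sample)} of {len(record_ids)} records selected for checking")
```

Fixing the seed means the same subset can be regenerated later, which helps when disagreements between reviewers need to be traced back to specific records.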
One reviewer (MC) extracted details of all outcomes reported (verbatim) and any definition of outcomes provided by the authors (eg, operating time may have been defined and reported by some studies as 'interval between initial skin incision and skin closure' and by others as 'duration of surgery'). The definition reported by study authors was used when deduplicating items into a shorter list. Data were recorded in a Microsoft Excel file. A 10% sample was checked by a second reviewer (RN). Other relevant data (ie, study and participant characteristics) were extracted by one reviewer (MBe) and checked by a second reviewer (MC). At all stages, disagreement between reviewers was resolved by discussion.
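The deduplication step described above, in which differently worded verbatim outcomes are grouped under one label, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the study recorded data in Microsoft Excel and resolved duplicates through reviewer judgement, whereas the trial identifiers, verbatim labels and mapping table below are hypothetical.

```python
from collections import defaultdict

# Hypothetical extraction records: (trial identifier, verbatim outcome label).
extracted = [
    ("trial_A", "Operating time"),
    ("trial_B", "Duration of surgery"),
    ("trial_C", "operating  time"),
    ("trial_C", "Postoperative pain"),
]

# Hypothetical reviewer-agreed mapping from normalised verbatim labels
# to a single deduplicated outcome.
canonical = {
    "operating time": "Operating time",
    "duration of surgery": "Operating time",
    "postoperative pain": "Postoperative pain",
}


def normalise(label):
    """Lower-case a label and collapse repeated whitespace."""
    return " ".join(label.lower().split())


def deduplicate(records, mapping):
    """Group trials under each deduplicated outcome.

    Labels missing from the mapping are kept as their own outcome so that
    nothing is silently dropped before a reviewer has looked at it.
    """
    grouped = defaultdict(set)
    for trial, label in records:
        outcome = mapping.get(normalise(label), label.strip())
        grouped[outcome].add(trial)
    return dict(grouped)


outcome_map = deduplicate(extracted, canonical)
for outcome, trials in sorted(outcome_map.items()):
    print(f"{outcome}: reported by {len(trials)} trial(s)")
```

Keeping the mapping as an explicit table mirrors the manual process: each grouping decision stays visible and can be checked by a second reviewer.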

Data extraction from PROMs
From the list of outcomes reported in trials of interventions for symptomatic uncomplicated gallstone disease described above, disease specific PROMs were identified and supplemented with the studies identified in the search. Data were extracted by one reviewer (KG) who recorded the name of the PROM(s), the reported PRO scales and individual verbatim items. The individual verbatim items from each PROM were analysed using an inductive content analysis approach informed by previous PROM coding work. 15 All PROM items were systematically categorised into conceptual health domains according to the aspect which they aim to capture. Health domains were generated inductively from the identified individual items. Domain mapping was conducted by two authors (KG and JB) independently, with any conflicts resolved through discussion.
Identification of outcomes from existing literature: outcomes reported as important by patients with a lived experience of uncomplicated gallstone disease

Search methods for identification of studies
A search for relevant qualitative studies was undertaken in August 2016 in the Ovid versions of MEDLINE (from 1966 to 2016). The search strategies combined search terms for cholecystitis and cholecystectomy with terms for qualitative research (online supplemental appendix 1). Studies were included if they explored (using observations, interviews, focus groups or other qualitative methods) participants' lived experiences of gallstones with specific reference to outcomes of importance. Exclusion criteria were (1) studies that did not use qualitative methods and (2) review articles, conference abstracts, reports with no published full-text article and non-English language articles.

Study selection and data extraction
One researcher (RN) screened all abstracts and another (KG) screened a random 10% sample. Full-text articles were obtained for those that were potentially relevant. Two researchers (RN and KG) reviewed all potentially relevant articles to ensure they met the inclusion criteria. To identify additional relevant studies, the reference lists of the included studies were also examined.
Data on study characteristics such as author, publication date, country, focus of investigation, data collection methods, number of participants and details on sample size were extracted. Additionally, two authors (KG and RN) independently extracted data from the two main sources reporting study findings: (1) direct quotes from participants and (2) authors' interpretations of participants' quotes. These data were recorded verbatim and analysed to identify 'descriptive' thematic codes. The constant comparison method was used to compare findings across studies, and an inductive thematic synthesis was undertaken to generate a list of themes and subthemes (focused on outcomes) from the data to map across the presurgery and postsurgery timeline. 16 17 Throughout this process, the description and wording of the themes were continually revised, and notes were made as to how themes and/or subthemes related and how some could be merged. These findings were discussed further with the research team to finalise the themes across the studies and these were considered, where appropriate, as domains relevant for inclusion in the development of the COS.

Identification of outcomes of relevance to patients from new primary research
In addition to outcomes reported in existing literature, we conducted primary qualitative research to further inform the identification of outcomes of relevance to patients. Three activities were conducted:
1. Interviews with patients with a diagnosis of symptomatic uncomplicated gallstone disease.
2. A focus group with patients who had undergone cholecystectomy.
3. Analysis of audiorecorded consultations from a trial comparing surgical versus medical management of symptomatic uncomplicated gallstone disease.

Participant identification and invitation
Potential participants for the interviews were identified from an ongoing trial comparing laparoscopic cholecystectomy with observation/conservative management for preventing recurrent symptoms in adults with uncomplicated symptomatic gallstones (CGALL trial). Participants were provided with a study participant information leaflet (PIL) either in the clinic or by post if a decision about CGALL trial entry was made later. 18 The PIL contained a detachable reply slip for participants to complete and return to the researcher (in a reply paid envelope) if they wished to discuss participating in the interview study.
Patients being approached to participate in the CGALL trial were asked (for trial purposes) if they would consent to their consultation being audiorecorded. If consent was obtained, these audiorecordings were then analysed for the identification of outcomes. Focus group participants were identified through the Scottish Health Research Register (SHARE; https://www.registerforshare.org/) and sent an invitation letter asking them to contact the research team if interested in participating. Following initial contact, a researcher phoned the interested participants, ensured they were clear about what the study entailed and arranged a suitable time for the focus group.

Data collection, management and analysis
One author (RN) conducted the interviews over the telephone between April and August 2017. The focus group was conducted by two members of the trial team (KG and the patient and public involvement (PPI) partner BC) on 20 July 2017. Trial consultations were conducted as standard and four sites across the trial were sampled to inform outcome identification. Informed (written and recorded) consent was obtained from all participants prior to data collection, and confidentiality of the participants was assured. Participants were encouraged to consider what aspects of their disease or treatment impacted them most, both in terms of physical and psychological functioning, and what improvements they would wish to see in terms of outcomes. The interviews, focus group and consultations were audiorecorded and transcribed verbatim by a professional transcription service. All transcripts were imported into NVivo (V.10, 2013: QSR International) and analysed using conventional content analysis (ie, coding categories are derived directly from the text data and are used to interpret meaning from the content). 19 Various themes and subthemes were generated by one researcher (RN) based on the contents of the transcripts to identify the outcomes stated by the participants, and these were then further discussed (with KG) to finalise the list of outcomes identified across the primary qualitative data. The analysis was oriented to address the aim of identifying the range of outcomes that might be considered important and the reasons used to justify assessment of them as important.

Categorisation of identified outcomes into outcome domains
The list of potential outcomes generated from the systematic evidence search and primary qualitative research formed the basis of a 'long' list of outcomes, which was refined into a final 'short' list for inclusion in the Delphi stage of core outcome set development. Outcomes were first grouped and reduced according to original source, that is, the initial long list from the evidence review was reduced for duplication by two members of the research team (MC and IA). A similar process was conducted to deduplicate the outcomes identified from the PROM coding, qualitative evidence synthesis and primary qualitative research (KG and RN). These outcome lists were then merged to identify areas of overlap and to reduce further duplication through iterative group discussions (further addressing duplications and relevance of outcomes for effectiveness trials) to produce a final short list of individual outcomes of relevance in this context.
Individual outcome items were further grouped into broader concept level headings to categorise outcome domains. These concept-level headings were informed by other outcome categorisation work in the area of COS and supplemented through study management group discussion. The categorisation was performed by one member of the team (KG) and refined through iterative discussions. 15 20 21

Patient and public involvement
The outcome mapping work reported in this manuscript has involved the input of patients from inception, through design, conduct and reporting. Coauthor BC is a patient partner and has been involved in all phases of project design and delivery and specifically helped to facilitate the patient focus group to identify outcomes of importance to patients.

RESULTS
Existing literature
Seven studies were conducted in the UK. A further 44 were conducted in countries in Europe other than the UK, 12 in the USA and 66 in a range of other countries. A total of over 10 000 participants were included in the studies, predominantly women (sex ratio around 2.5:1) in their mid-40s (40-53, median 46.1). Studies were mainly single-centre (n=113) and ranged in size from small (14 participants) to large (618 participants) trials (median n=75). The vast majority of trials compared one type of surgery with another, with single incision laparoscopic cholecystectomy versus conventional laparoscopic cholecystectomy being the most common trial configuration. Other common configurations were early laparoscopic cholecystectomy versus delayed laparoscopic cholecystectomy, 72-87 mini laparoscopic cholecystectomy versus laparoscopic cholecystectomy, 88-97 mini laparotomy versus laparoscopic cholecystectomy 98-103 and day-case laparoscopic cholecystectomy versus overnight stay laparoscopic cholecystectomy. 104-107 There were few non-surgical interventions; one study compared shock wave lithotripsy and laparoscopic cholecystectomy, 108 while several compared observation with cholecystectomy. 109 110

The four disease specific PROMs for uncomplicated gallstone disease were developed in Canada (n=2), New Zealand (n=1) and Germany (n=1). 111-114 Two of the PROMs focused on gallstone disease, 111 113 one on gastrointestinal diseases, 114 and the other on quality of life after abdominal surgery. 112 All of the PROMs were reported to measure multidimensional constructs, for example, quality of life. The PROMs varied in the number of constructs they aimed to assess (ranging from 4 to 8) and the number of items they asked participants to report on (ranging from 5 to 41, median=27).

Eight qualitative studies from seven different countries were identified: USA (n=2), 115 116 UK, 117 Brazil, 118 Canada, 119 Sweden, 120 the Netherlands 121 and Spain. 122 Together they included 324 participants (range 12 to 162, median=19.5). Participants were predominantly women and their age ranged from 19 to 81 years. Seven studies used interviews, either face to face (n=5) or by telephone (n=2), and one used focus groups for data collection. All of the treatments explored in the studies were surgical, but the types of surgery varied. Five studies investigated patients' experiences after surgery, two investigated experiences of cholecystitis (ie, inflammation of the gall bladder) and surgery, and one investigated experiences of cholelithiasis (ie, presence of gallstones). 115-122 Table 1 provides a summary of the characteristics of participants included in the quantitative and qualitative literature review.

Primary research
Six individual interviews, one focus group (n=5 participants) and analysis of 20 consultations (5 each from 4 different hospital sites) were conducted, providing data from a total sample of 31 patients. A brief description of the participants is provided in online supplemental appendix 2. They included 26 women and 5 men, 26 of whom had been approached to take part in the CGALL trial (12 trial consenters and 14 trial non-consenters), and 5 of whom had not been approached about the trial but had all had a cholecystectomy. The CGALL trial is an RCT comparing laparoscopic cholecystectomy with conservative management for preventing recurrent symptoms and complications in adults with uncomplicated symptomatic gallstones. 18

Outcomes identified from existing literature and new primary research
A total of 386 individual recorded outcomes were identified across the combined evidence from existing literature and new primary research: 330 outcomes (which were reported 1147 times) from trials evaluating interventions, 22 outcomes from PROMs, 17 outcomes from existing qualitative studies, and 17 outcomes from primary qualitative research.
Of the 330 individual outcomes reported in trials, 97 (29.4%) were reported as 'primary outcomes', only 64 (19.4%) were formally defined and 227 (68.8%) were reported by one study only. The three 'verbatim' outcomes which were reported most frequently were:  Table 2 provides a list, and frequency, of all outcomes reported in the included trials. Some 106 individual items were identified from the four gallstone disease specific PROMs, covering 22 individual outcomes, with the frequency of each outcome varying across the individual PROMs (table 3). Included PROMs covered between 8 and 14 domains. None of the included PROMs reported whether patients were involved in the measure's development. Pain and emotional outcomes were the most frequently covered, with 17 items each (making up 32% of total items), and were reported across all four PROMs. Seven outcomes were identified only once across the 106 items (four of which were identified in one PROM): thirst/dehydration, cognitive, service use, body image, sexual function, regurgitation and swallowing.
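The proportions quoted in this paragraph can be reproduced from the reported counts with a short Python check. This is only a sketch for readers who wish to verify the arithmetic; the helper function and variable names are illustrative and every count is taken directly from the text above.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


trial_outcomes = 330  # individual outcomes reported in trials
prom_items = 106      # individual items across the four PROMs

primary = share(97, trial_outcomes)        # reported as primary outcomes
defined = share(64, trial_outcomes)        # formally defined
single_study = share(227, trial_outcomes)  # reported by one study only
pain_and_emotional = share(17 + 17, prom_items)  # pain plus emotional items

print(primary, defined, single_study, pain_and_emotional)
```

Pain and emotional items together account for 34 of the 106 PROM items, which the text reports as 32% after rounding to the nearest whole number.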

Seventeen individual outcomes were identified from the existing qualitative literature (see table 4). Twelve overlapped with existing outcomes identified in the literature and the PROMs, leaving five additional outcomes for consideration in the long list, namely: dizziness, fainting, trust, weight and prevention of additional disease.
The primary qualitative research identified 17 individual outcomes, with the majority (n=14) overlapping with those reported in the previously reviewed evidence (see table 4). However, three additional outcomes (breathing problems, cough and mortality) were identified that were not in the previous patient-focused evidence (PROMs and qualitative literature), with two of these (breathing problems and cough) making unique contributions to the overall outcome list.
The 390 individual reported outcomes across the four data sources were reduced into a 'short' list of outcomes which could be measured in comparative effectiveness trials (ie, phase III pragmatic effectiveness trials) of interventions to treat uncomplicated symptomatic gallstone disease (see table 5). This resulted in several outcomes being dropped from the long list because they were deemed not eligible as clinical endpoint outcomes for use in trials of this type (eg, system and process outcomes such as duration of surgery, which might be important in earlier phase trials). The final list therefore covered 27 broad outcome domains that contained 41 distinct outcomes. The domains of pain, intra-operative complications, and post-operative complications contained the most outcomes (n=4 each).

DISCUSSION
Currently, there is a lack of consistency in the selection, measurement and reporting of outcomes for uncomplicated gallstone disease. This leads to challenges in evidence synthesis and decision-making. A core outcome set would be an important step to improve this situation. This paper, which describes an outcome mapping exercise, is the first comprehensive step in the development of a core outcome set. It catalogues and reports the outcomes that have been measured in trials of interventions to treat uncomplicated gallstone disease. It extends this initial phase of outcome identification to include outcomes from PROMs, published qualitative evidence and empirical qualitative research. Over 1000 verbatim outcomes were identified and reduced through deduplication to 390. This was further reduced to 41 outcomes spanning 27 domains. The next step in this work is to reach consensus on the COS for uncomplicated symptomatic gallstone disease. As with many other outcome mapping exercises, the first stage of this study highlights the significant heterogeneity that exists within clinical trials comparing treatments for gallstone disease. Of the 334 outcomes, which were reported multiple times across the 129 RCTs, almost 70% were reported by only one study, a finding comparable with other outcome mapping studies. 21 123 124 Much of the effort put into collecting and reporting these outcomes is likely wasted, as it is doubtful that they could be combined with others to make a more confident assessment of the effectiveness of treatments. This outcome heterogeneity in existing trials is further emphasised when considering the outcome of pain, which is reported in 72 trials as postoperative pain but is also reported in a number of other trials using 15 different outcomes. The four disease specific PROMs identified further extend the problem of outcome heterogeneity.
While all of these measures purport to capture quality of life, there is variability in both the inclusion and emphasis of domains. In addition, the relevance of these outcomes to patients must be called into question given the lack of reporting of input from patients in the item inception phase of PROM design across these measures. Two reviews published after completion of this work conducted methodological assessments of both disease specific and generic PROMs for laparoscopic cholecystectomy, and both report considerable variation and a lack of patient involvement. 125 126 There are now several reviews of PROMs in other clinical specialties whose findings further highlight the heterogeneity that exists across measures which, on the surface, purport to measure similar concepts. 20 127-129 The different evidence sources contributed to the final outcome short list in a variety of ways, with the outcomes reported in previous trials often capturing clinically focused endpoints and the PROMs and qualitative research providing more patient-focused outcomes. When considering what outcomes matter to patients and how this contributes to the outcome mapping in this area, this study used two approaches to ensure patient relevant outcomes were included. The qualitative evidence synthesis and primary research identified a combined total of 34 outcomes (of which 21 were mutually exclusive) that were important to patients in terms of their gallstone disease or their perceptions about treatments. These outcomes could be broadly grouped into physical and social functioning, with most reports from participants focusing on reduction in pain and a desire to 'return to normal'. When compared with the outcome domains reported in the PROMs, there was considerable overlap between the two sources. However, there were some areas of discordance between the different sources, with the qualitative data adding a further eight outcomes.
In addition, the qualitative evidence synthesis and the empirical research identified outcomes not previously measured or reported in comparative effectiveness trials for uncomplicated gallstone disease. The value of including evidence from existing literature exploring patients' perspectives and/or new primary research to identify patient relevant outcomes is gaining traction among COS developers. 5 While most have identified such outcomes through interviews, the use of qualitative evidence synthesis is growing, especially in areas where there is already a considerable volume of work to draw on. 130 These studies have shown that this work contributes previously unreported outcomes that are of importance to patients, which underlines the critical nature of this step in COS development. Whether the outcomes identified in these list development stages end up in the final COS is currently less well evidenced but will be important to know.

Strengths and limitations
This outcome mapping exercise used a systematic search to identify outcomes reported in both quantitative and qualitative studies in the literature. In addition to this rigorous systematic search, we supplemented the pool of outcomes already available with new primary qualitative research (through three methods) to further identify outcomes that matter to patients who are experiencing uncomplicated symptomatic gallstone disease. This complementary approach to identification of outcomes has ensured a broad catalogue of both clinically relevant and patient important outcomes. Limitations of this review are linked to the inclusion of only English language studies, a lack of quality appraisal of included studies, and no assessment of reporting bias. While these decisions were fit for purpose for the COS activity, they may have introduced potential reporting and selection bias within the outcome map. With regard to outcome reporting bias, other COS development papers have explored this and found that papers reporting surgical studies of oesophagectomy and colorectal cancer resection frequently did not report all the outcomes they intended to measure (at least 50% did not). 131 132 Future COS development studies should consider this approach to assess outcome reporting bias. It would also have been useful to collect the study teams' reported rationale for the selection of reported outcomes to understand how that selection was made.

CONCLUSIONS
This study took a rigorous approach to catalogue and map the outcomes of importance in gallstone disease to enhance the development of the COS 'long' list. The synthesis of data from the four different evidence sources further underpinned the need for a COS in this space due to the heterogeneity of outcome measurement and reporting. However, the extensive use of data sources to develop the list of outcomes for further consensus agreement did highlight 'new' outcomes that have not previously been reported in trials evaluating interventions for gallstone disease, and many of these 'new' outcomes were those reported by patients. This comprehensive approach to the development of the long list, and then ultimately the short list for scoring in a COS, gives confidence that both clinically relevant and patient-focused outcomes have been considered and have the potential to be represented in the agreed COS.