Systematic review of clinical outcome reporting in randomised controlled trials of burn care

Introduction Systematic reviews collate trial data to provide evidence to support clinical decision-making. For effective synthesis, there must be consistency in outcome reporting. There is no agreed set of outcomes for reporting the effect of burn care interventions. Issues with outcome reporting have been identified, although not systematically investigated. This study gathers empirical evidence on any variation in outcome reporting and assesses the need for a core outcome set for burn care research.
Methods Electronic searches of four databases, covering January 2012 to December 2016, were undertaken for randomised controlled trials (RCTs), using medical subject headings and free text terms including ‘burn’, ‘scald’, ‘thermal injury’ and ‘RCT’. Two authors independently screened papers, extracted outcomes verbatim and recorded the timing of outcome measurement. Duplicate outcomes (exact wording ± different spelling), similar outcomes (albumin in blood, serum albumin) and identical outcomes measured at different times were removed. Variation in outcome reporting was determined by assessing the number of unique outcomes reported across all included trials. Outcomes were classified into domains. Bias was reduced by having five researchers and a patient work both independently and together.
Results 147 trials were included, of which 127 (86.4%) were RCTs, 13 (8.8%) pilot studies and 7 (4.8%) RCT protocols. 1494 verbatim clinical outcomes were reported; 955 were unique. 76.8% of outcomes were measured within 6 months of injury. Commonly reported outcomes were defined differently. Numbers of unique outcomes per trial varied from one to 37 (median 9; IQR 5,13). No single outcome was reported across all studies, demonstrating inconsistency of reporting. Outcomes were classified into 54 domains. Numbers of outcomes per domain ranged from 1 to 166 (median 11; IQR 3,24).
Conclusions This review has demonstrated heterogeneity in outcome reporting in burn care research which will hinder amalgamation of study data. We recommend the development of a Core Outcome Set. PROSPERO registration number CRD42017060908.


Introduction
Each year an estimated 200,000 to 300,000 people die from burn injuries globally (1). Millions more suffer from burn-related disabilities and disfigurements (2). These injuries have functional, psychological, social and economic effects on survivors and their families. There are multiple strategies for managing burn wounds and their associated impact on patient physiology, with new care pathways and technology being introduced on a regular basis (3)(4)(5).
The choice of treatment should be made using up-to-date, high quality scientific evidence (6,7). Systematic reviews of randomised controlled trials (RCTs) are regarded as the highest quality evidence (8)(9)(10). Despite increasing numbers of published RCTs in burn care, systematic reviews have not provided evidence to support many commonly used interventions or management strategies (11)(12)(13).
A well-designed RCT requires that outcomes are pre-specified. Evidence synthesis requires that these outcomes are consistent across RCTs in the same healthcare area (14). In the context of clinical trials, Williamson et al, in the Core Outcome Measures in Effectiveness Trials (COMET) handbook, define an outcome as "a measurement or observation used to capture and assess the effect of treatment such as assessment of side effects (risk) or effectiveness (benefits)" (15). Chan adds a temporal element: "a variable measured at a specific time point to assess the efficacy or harm of an intervention" (16). If RCTs report outcomes that cannot be collated due to differences in choice, definition or timepoint of assessment, evidence synthesis will not be effective or efficient.
There is no agreed minimum set of outcomes important to patients and professionals for reporting in burn care trials and problems with outcome reporting in burn care research have previously been suggested (17)(18)(19).
Pre-specifying outcomes requires research to determine and agree the most important outcomes for a clinical condition. If this is not undertaken, the outcomes reported may not reflect patients' or other stakeholders' needs, outcomes will vary between studies (outcome reporting heterogeneity) and it will be difficult to determine if authors have reported all the outcomes they measured (outcome reporting bias) (20,21). Choosing the most important outcomes to measure in burn care is complex, as patients are a heterogeneous population, with variations in age, mechanism of injury, depth, site and size of burn (22,23). The time frame at which outcomes are measured may also determine the types of outcomes assessed. Outcomes reported in clinical trials during the acute treatment phase include healing time, skin-graft loss, infection rates and NHS costs (24)(25)(26)(27). Longer-term reported outcomes relate to functional, cosmetic and psychological issues (28).
To date, there has been no formal investigation into outcome reporting in trials of burn care. The purpose of this study is to examine clinical outcome reporting in burn care research, to consider the types, definitions and timing of outcomes measured and to consider the need for a Core Outcome Set in this field.

Methods
This review is focused on clinical observer-assessed outcomes reported in RCTs assessing the impact of interventions in burn care. It adhered to a pre-specified protocol and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (29). It was registered with the PROSPERO international prospective register of systematic reviews (ID CRD42017060908).

Study Eligibility
Studies were included if they met the following criteria.

Types of studies:
We included full text RCTs along with RCT protocols and pilot studies. The study design was limited to RCTs, as a final core outcome set will be used for RCT reporting (29). We excluded protocols and pilot studies if the full RCT had been published. We also excluded conference proceedings and abstracts, non-English language publications and studies not involving human subjects.

Types of participants:
We included studies recording outcomes from patients of any age with a cutaneous burn of any type or size, determined by either clinician evaluation or objective assessment, or both, which required treatment in any health care facility. Studies where the population consisted of patients with combined thermal and mechanical injuries were only included if it was possible to separate out the burn care outcomes. Trials studying patients with pure carbon monoxide poisoning, chemical ocular or caustic oesophageal burns were excluded, as the former does not involve a burn and the latter have different aetiology and management to cutaneous burns.

Type of interventions:
Any surgical or non-surgical burn care intervention with any appropriate comparator.

Types of outcomes:
Outcomes were defined as the exact terms used in a published trial Abstract, Methods or Results, including tables and figures, for any observer-reported clinical endpoint. These included physiological, metabolic, adverse or mortality events measured by researchers and relevant to patients' recovery and long-term wellbeing after burn care (30). Trials assessing quality of life were only included if the data were observer-reported.
Cohen's Kappa was calculated to report inter-rater reliability of selection at full text stage for both inclusion and reason for exclusion, corrected for chance agreement (28). Data were managed in a Microsoft Excel database.

Study selection process
Prior to both abstract and full-text screening, all review authors underwent training to ensure a comparable understanding of the purpose of the review and the eligibility criteria. The reference management software EndNote (Endnote 6, Clarivate Analytics, Boston; available at http://endnote.com/) was used to compile all titles derived from the initial searches, with duplicates removed, so that the review authors could screen titles and abstracts against the eligibility criteria. Screening of titles and abstracts was completed independently and in duplicate by two authors (AY, AD) with experience in systematic review methodology. All screening disagreements were discussed, with any outstanding disagreements resolved by an independent reviewer (JB). Any studies appearing to meet the inclusion criteria based on the abstract were retrieved as full-text articles. Two reviewers then read the full-text articles in their entirety to assess eligibility, with decisions on inclusion and exclusion recorded (see Figure 1 for flow diagram). Reasons for exclusion were ordered hierarchically (Table 1) and applied to each full text. The highest-ranked reason for exclusion met by a paper was recorded as its reason for exclusion. Any disagreements were discussed with another author (JB).
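The hierarchical recording of exclusion reasons can be illustrated with a short Python sketch. It is an illustration only: Table 1 is not reproduced in this text, so the reason labels and their order in `EXCLUSION_HIERARCHY`, and the function name, are hypothetical placeholders rather than the review's actual criteria.

```python
# Hypothetical hierarchy, highest-ranked reason first.
# These labels are placeholders and are NOT taken from Table 1.
EXCLUSION_HIERARCHY = [
    "not a randomised controlled trial",
    "not human participants",
    "not a cutaneous burn population",
    "no observer-reported clinical outcome",
]


def recorded_exclusion_reason(reasons_met):
    """Return the highest-ranked exclusion reason met by a paper.

    `reasons_met` is the collection of reasons a reviewer ticked for one
    full text. An empty collection means the paper is included, so the
    function returns None.
    """
    unknown = set(reasons_met) - set(EXCLUSION_HIERARCHY)
    if unknown:
        raise ValueError(f"reasons not in the hierarchy: {sorted(unknown)}")
    for reason in EXCLUSION_HIERARCHY:
        if reason in reasons_met:
            return reason
    return None
```

Recording a single reason per paper in this way keeps the exclusion counts in a flow diagram mutually exclusive, even when a paper fails several criteria at once.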

Quality Assessment
The aim of this study was to comprehensively document clinical outcomes recorded in burn care RCTs and not to synthesise data about the effect of interventions. An assessment of risk of bias of the included trials was therefore not undertaken.

Data Extraction
Data were extracted into a standardised data extraction sheet. This included study author, country or countries recruiting (categorised into the United Nations six regions (31)), publication year, number of sites and number of participants recruited per trial, design (full RCT, pilot, protocol) and intervention tested. For protocols, the planned participant inclusion criteria and sample size were extracted.
No distinction was made between primary or secondary outcomes, although this was noted and is part of a separate project. All outcomes were extracted verbatim, with 20% of the extracted data verified by a second reviewer. True duplicates, spelled and worded the same, were deleted. As a second process, two reviewers (a clinician and researcher) discussed all verbatim outcomes to identify duplicates in meaning that were spelled or worded in a slightly different manner, such as length of time in hospital and number of days in hospital, platelet level and levels, and serum IL-10 and IL-10 in blood. These were named as one outcome with wording chosen by the reviewers and the others deleted as duplicates. The remaining outcomes were therefore all different in meaning. Any discrepancies were discussed with a senior researcher (JB). The number of outcomes per trial and the variation of outcomes between trials were recorded.
The times after injury at which outcomes were measured were noted separately in order to a) assess the heterogeneity in outcome measurement timing and b) understand at what stage after injury the effects of the intervention were being assessed. If a single outcome was assessed at different timepoints, all assessment timings were recorded. Data extraction for the timing of outcome reporting from 10% of trials was undertaken independently by another researcher. Timings of outcome assessment were categorised into time periods: < 1 month, > 1 month and < 3 months, > 3 months and < 6 months, > 6 months and < 1 year, > 1 year and < 3 years, and > 3 years after injury. We reported two other outcome time periods, those assessed during acute hospitalisation and during burn wound healing, as these were commonly reported in the literature with no prescribed timepoint. However, it was clear from the reported length of stay and healing data that all these outcomes were assessed within six months of injury. The frequency of outcomes reported within each time period was recorded.
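As a rough illustration of this categorisation, the sketch below assigns a measurement time (in months after injury) to one of the time periods and tallies the frequencies. The function and variable names are hypothetical. The text does not say where a measurement taken exactly at a boundary (for example, exactly 3 months) was placed, so the sketch assumes it falls into the later period.

```python
from collections import Counter

# Upper bounds in months, paired with labels paraphrasing the time periods.
# Assumption: a time exactly on a boundary is assigned to the later period.
TIME_BANDS = [
    (1, "< 1 month"),
    (3, "1 to < 3 months"),
    (6, "3 to < 6 months"),
    (12, "6 months to < 1 year"),
    (36, "1 to < 3 years"),
]


def time_period(months_after_injury):
    """Map a measurement time in months after injury to a time-period label."""
    if months_after_injury < 0:
        raise ValueError("time after injury cannot be negative")
    for upper_bound, label in TIME_BANDS:
        if months_after_injury < upper_bound:
            return label
    return "3 years or more"


def period_frequencies(timings_in_months):
    """Count how many outcome measurements fall in each time period."""
    return Counter(time_period(m) for m in timings_in_months)
```

Outcomes described only as measured "during acute hospitalisation" or "during burn wound healing" have no numeric time and would need to be tallied as separate categories outside this function.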
The data were tabulated in a Microsoft Excel spreadsheet, so that each study was listed with study and population details along with the outcomes measured. Outcomes were extracted from this spreadsheet into another, with duplicates removed as described above. Outcomes measuring the same healthcare issue but at different timepoints were noted as one outcome for the final set. These final unique outcomes were then grouped into domains.
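The de-duplication steps can be sketched in Python as follows. This is a minimal illustration and not the spreadsheet process the reviewers used: exact duplicates are caught by normalising case and whitespace, and duplicates in meaning are merged through a lookup table standing in for the reviewers' agreed wording. The table `PREFERRED_WORDING` and both function names are hypothetical, and the entries are taken from the examples given in the text.

```python
import re
from collections import Counter

# Hypothetical stand-in for the wording agreed by the two reviewers.
# Keys are normalised variants; values are the single retained wording.
PREFERRED_WORDING = {
    "number of days in hospital": "length of time in hospital",
    "il-10 in blood": "serum il-10",
    "albumin in blood": "serum albumin",
}


def normalise(verbatim_outcome):
    """Lower-case, collapse whitespace, then apply the agreed wording."""
    text = re.sub(r"\s+", " ", verbatim_outcome.strip().lower())
    return PREFERRED_WORDING.get(text, text)


def unique_outcomes(outcomes_by_trial):
    """Return a Counter mapping each unique outcome to the number of trials reporting it.

    `outcomes_by_trial` maps a trial identifier to its list of verbatim outcomes.
    Each outcome is counted at most once per trial.
    """
    counts = Counter()
    for outcomes in outcomes_by_trial.values():
        counts.update({normalise(outcome) for outcome in outcomes})
    return counts
```

With such a tally, `len(counts)` gives the number of unique outcomes, and outcomes whose count is 1 are those reported by a single trial only.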

Classification of outcomes into domains
Outcome domains are groups of similar outcomes. This organisation is necessary because retaining a large set of outcomes, many of which are similar, would make any future classification of the outcomes by importance extremely challenging.
Outcomes were classified into domains in three stages. In stage one, three researchers (two clinically trained burns researchers, one lay burn researcher) independently reviewed the list of outcomes and attributed a domain to each one using their own terms. In stage two, the researchers met to review the domains and agreed i) an appropriate name for each domain, and ii) the appropriate domain for each outcome. Rules for attribution of outcomes to domains were recorded in a coding log to ensure consistency. In stage three, a fourth lay researcher and a patient representative (SB) reviewed the outcomes and their attributed domains to check for clarity of domain name, and that the outcomes under each domain were appropriately attributed. A final meeting with an experienced outcome researcher (JB) was held to finalise outcomes and domains.
The results described below indicate the characteristics of the reported studies and provide detail on heterogeneity of outcome reporting between studies, outcome timepoints and outcome domains.

Patient Involvement:
The need for a burn care Core Outcome Set project was conceived following discussions regarding clinical healthcare Key Performance Indicators with professionals and patients. The patients were vocal about outcomes important to them which they felt were overlooked by professionals, such as pain. The systematic review was discussed at regular project steering group meetings attended by three patients with burns and one parent of a child with burns. A patient with burns is a co-author and was involved with writing and editing of this article as well as with the naming of the outcome domains. Dissemination will be to the lay representatives of the steering group and will inform the Core Outcome Set study in which patients are actively involved.

Included studies and study protocols:
The initial search strategy identified 3,110 studies. Following de-duplication, a total of 2,070 studies remained.
Independent scrutiny of the titles and abstracts identified 306 potentially relevant articles for full text review. Of these, 158 studies did not meet our inclusion criteria and were excluded (PRISMA Flow Diagram Figure 1).
Therefore, a total of 147 studies formed the basis of this study (24,.

Reliability of study selection:
Cohen's Kappa was calculated to report inter-rater reliability of selection at full text stage for both inclusion and reason for exclusion. The results were 0.751 and 0.747 respectively. This is rated as substantial agreement.
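For readers unfamiliar with the statistic, Cohen's Kappa compares the observed proportion of agreement between two raters with the agreement expected by chance from each rater's marginal frequencies. The sketch below is a generic implementation with hypothetical names and invented example decisions; it is not the authors' calculation, and their screening data are not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each labelled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented decisions for ten full texts (I = include, E = exclude).
reviewer_1 = list("IIEEIEEEIE")
reviewer_2 = list("IIEEEEEEIE")
kappa = cohens_kappa(reviewer_1, reviewer_2)  # (0.9 - 0.54) / (1 - 0.54)
```

On the widely used Landis and Koch scale, values from 0.61 to 0.80 are labelled substantial agreement, which is consistent with the rating given above.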

Studies:
Of the 147 studies, 86.4% (127) were reports of full RCTs, 8.8% (13) were pilot studies and 4.8% (7) were study protocols.

In terms of timing, 76.8% (2,109) of outcomes were assessed at less than six months after injury, 16.6% (456) were measured after six months and before three years after injury, and only 5.1% (140) were measured at more than three years after injury (Figure 2). The timing of outcome measurement was not reported for 38 outcomes.

Outcome domains:
The 990 different clinical outcomes were organised into 54 domains (groups of similar outcomes) (Table 4 and Appendix B).
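The spread of outcomes across domains can be summarised with a few lines of Python, as sketched below. The function name and the example mapping are hypothetical, and the data used to exercise it would need to come from Table 4 and Appendix B, which are not reproduced here. The review does not state which quartile method was used, so the choice of `method="inclusive"` is an assumption, and other methods can give slightly different IQR values.

```python
from collections import Counter
from statistics import median, quantiles


def summarise_domains(outcome_to_domain):
    """Summarise how many unique outcomes fall in each domain.

    `outcome_to_domain` maps each unique outcome to its domain name.
    At least two domains are needed for the quartiles to be computed.
    """
    per_domain = Counter(outcome_to_domain.values())
    sizes = sorted(per_domain.values())
    # Assumption: inclusive quartiles; the source does not name a method.
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    return {
        "domains": len(sizes),
        "min": sizes[0],
        "max": sizes[-1],
        "median": median(sizes),
        "iqr": (q1, q3),
    }


# Invented mapping for illustration only.
example = {
    "wound infection": "infection",
    "sepsis": "infection",
    "bacterial colonisation": "infection",
    "time to healing": "wound healing",
    "graft loss": "wound healing",
    "death in hospital": "mortality",
}
summary = summarise_domains(example)
```

Reporting the median and IQR rather than the mean is a reasonable choice when domain sizes are highly skewed, since a few very large domains would otherwise dominate the summary.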

Discussion
This systematic review aimed to examine outcome reporting in RCTs in burn care. Across the 147 included studies, 1,494 outcomes were identified, with overlap in terminology, inconsistent definitions and variation in the time after injury at which the outcomes were measured. Only 30% of the outcomes reported were included in more than one study. There was no single outcome reported in all 147 trials. 77% of outcomes were measured at less than six months after injury. Such heterogeneity of outcome reporting across trials will limit evidence synthesis and mean that research is wasted.
The findings in this review have been seen elsewhere in the burns-specific and other clinical literature. A Cochrane review of 30 RCTs concluded that it was impossible to draw conclusions about burn dressing effectiveness, as the trials evaluated a variety of clinical outcomes (18,177). Over the same period as this review, nine Cochrane reviews have had direct relevance to the management of patients with burns (18,178-185). None could draw firm conclusions due to methodological issues including heterogeneity of outcome reporting.
Heterogeneity is found in the reporting of outcomes relating to critical care, neurological illness, breast reconstruction surgery, prostate cancer, hip and knee replacement, oesophagectomy surgery, low back pain and cardiac arrest trials, among others (186)(187)(188)(189)(190)(191)(192)(193). Variation in the definitions of outcomes has also been found within published studies of other healthcare areas. A systematic review of 90 papers reporting wound infection after general surgical procedures identified 41 definitions for wound infection itself, including three published by expert groups (194). Similarly, a total of 56 definitions were identified from 97 studies reporting anastomotic leak rates after gastrointestinal surgery, despite publication of a standard definition two years before the beginning of the review (195).
In this review we identified and agreed the grouping of 990 unique outcomes into 54 outcome domains. However, it is not always clear how best to categorise outcomes. Williamson published a taxonomy for categorising outcome domains (196) and others have suggested ways of doing this (210). The different approaches demonstrate a lack of agreement on how to optimally categorise outcomes. In burn care, the large number and type of clinical domains identified and the variation in timing of assessments further complicate optimal domain classification, underlining the need for further work in this area.
A solution to this variation in outcome reporting across trials is the development of a Core Outcome Set (COS) (21,197,198). A COS is a minimum set of the most important outcomes, agreed and recommended for measurement in all trials for a particular condition (31,32). While not limiting choice, a COS will pre-specify a set of outcomes to ensure consistency of reporting and the ability to collate evidence into systematic reviews by allowing researchers to compare 'like with like' (33). Trials can still select primary outcomes and other outcomes in addition to the minimum core set. This approach has been shown to improve the consistency of outcome reporting (199,200). Although there is no COS for burn care, work was undertaken in 2008 to agree a set of burn outcome domains (196). However, the work was undertaken by a small group of clinicians, lacked patient involvement and reported little methodological detail (201). Considerable work to develop COS methodology has also been undertaken since this publication (202,203). The COMET (Core Outcome Measures in Effectiveness Trials) Initiative disseminates resources for COS development and supports methodological developments in this area (204,205). COMET recommends a four-step process to develop a COS: a) agreement of the scope, b) assessment of the need, c) development of a protocol and finally d) agreement of the Core Outcome Set (15). This systematic review has satisfied the first two steps for the development of a burn care COS. The final step encompasses organising a comprehensive long-list of all potential outcomes into domains (of which the clinical domains for burn care are listed in Table 4) and prioritising these domains using a consensus process (206)(207)(208).
The strengths of this review are that the protocol and data extraction proforma were pre-specified and the literature search was systematic and comprehensive, including four major healthcare trial databases. To account for multi-disciplinary perspectives, two researchers, two clinicians and a patient were involved in the domain process. The review is also novel because it is the first to demonstrate, in detail and using systematic methodology, the scale of the heterogeneity of outcome reporting in burn care research. Limitations include the exclusion of publications in languages other than English. However, international publications were included to reduce the risk of selection bias. The search was also time-limited, which may have excluded outcomes from older studies. The reason for the time limitation was to identify research relevant to modern burn care. The search was also limited to trials reporting clinical outcomes. Other work is in progress to assess patient-reported outcomes in burn care research. This was a review undertaken systematically to a pre-specified protocol. However, a formal quality assessment of studies was not undertaken, as we were researching the reporting of outcomes and not attempting to analyse the effects of interventions.
A COS for burn care research would address the issue of heterogeneity of outcome reporting between trials, lead to research that is more likely to measure relevant outcomes, enhance the value of burn care systematic reviews and reduce research waste.

Conclusion
We have shown that multiple different outcomes are reported in trials of burn care interventions. Different definitions are used to assess the same outcome and outcomes are measured at different time points after injury. This heterogeneity and inconsistency in outcome reporting prevents effective evidence synthesis and limits the quality of evidence available for clinical decision-making. Our review demonstrates that until greater consistency is achieved in outcome reporting in trials, it is unlikely that clinicians will be able to synthesise evidence across studies to understand the effects of surgical and non-surgical treatments following burn injury.
It is recommended that a burn care Core Outcome Set is developed to support the effective synthesis of trial data and allow more informed clinical decision-making for the benefit of patients.


Strengths and limitations of this study:
• This review is a comprehensive and systematic search for all clinical outcomes reported in randomised controlled trials of burn care between and including 2012 and 2016.
• There is a detailed analysis of all reported outcomes and timing of outcome assessment.
• A multi-disciplinary team including a patient was involved in the study.
• Quality assessment of studies was not undertaken, as the purpose of the review was to extract clinical outcomes alone and not to assess the effect of an intervention.


Methods
This review is focused on clinical observer-reported outcomes in RCTs assessing the impact of interventions in burn care. It adhered to a pre-specified protocol and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (29). It was registered with the PROSPERO international prospective register of systematic reviews (ID CRD42017060908).

Study Eligibility
Studies were included if they met the following criteria.

Types of studies:
We included full text RCTs along with RCT protocols and pilot studies. The study design was limited to RCTs, as any final core outcome set will be used for RCT reporting (29). We excluded protocols and pilot studies if the full RCT had been published within the selected time period. We also excluded conference proceedings and abstracts, non-English language publications and studies not involving human subjects.

Types of outcomes:
Outcomes were defined as the exact terms used in a published trial Abstract, Methods or Results, including tables and figures, for any observer-reported clinical endpoint. These included physiological, metabolic, adverse or mortality events measured by researchers and relevant to patients' recovery and long-term wellbeing after burn care (30). Trials assessing quality of life were only included if the data were observer-reported.

Ovid MEDLINE, Ovid EMBASE, Web of Science and The Cochrane Library were searched from 1 January 2012 to 31 December 2016 for RCTs related to burn care, using medical subject headings and free text terms including 'burn', 'scald', 'thermal injury' and 'RCT'. This period was chosen so that the outcomes extracted reflected use in trials relating to modern burn care. Limiting the review to five years allowed us to balance workload against the likelihood of selecting enough trials fulfilling the inclusion criteria to demonstrate whether heterogeneity of outcome reporting was present in burn care research. The thesaurus vocabulary of each database was used to adapt the search terms. The search strategy for Ovid MEDLINE is included in a previous publication and in Appendix A (29).


Quality Assessment.
The aim of this study was to comprehensively document any variation in the clinical outcomes selected, defined, measured and reported in burn care RCTs, not to synthesise data about the effect of interventions. Inclusion of all trials was necessary to demonstrate whether variation in outcome reporting was present across trials, regardless of the methodological quality of each trial. We therefore decided not to undertake a quality assessment of studies, because it was not relevant to the data being recorded in this review, namely the nature and description of the unique outcomes reported in each study.

Data Extraction
Data were extracted into a standardised data extraction sheet (Microsoft Excel). This included study author, country or countries recruiting (categorised into the United Nations six regions (31)), publication year, number of sites and number of participants recruited per trial, design (full RCT, pilot, protocol) and intervention tested. For protocols, the planned participant inclusion criteria and sample size were extracted.
No distinction was made between primary and secondary outcomes, although this was noted and is part of a separate project. All outcomes were extracted verbatim, with 20% of the extracted data verified by a second reviewer. True duplicates, spelled and worded the same, were deleted. As a second process, two reviewers (a clinician and a researcher) discussed all verbatim outcomes to identify those that were duplicates in meaning but spelled or worded slightly differently, such as length of time in hospital and number of days in hospital, platelet level and levels, and serum IL-10 and IL-10 in blood. These were named as one outcome, with wording chosen by the reviewers, and the others were deleted as duplicates. The remaining outcomes were therefore all different in meaning.
Any discrepancies were discussed with a senior researcher (JB). The number of outcomes per trial and the variation of outcomes between trials were recorded.
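The two-stage de-duplication can be sketched in Python as follows. This is an illustration only: in the review the second stage was a discussion between two reviewers, which the hypothetical `AGREED_WORDING` mapping merely stands in for, and the choice of preferred wording shown here is assumed. Ignoring letter case and spacing when detecting true duplicates is also an assumption.

```python
def normalise(outcome):
    """Lower-case and collapse whitespace (an assumed notion of 'worded the same')."""
    return " ".join(outcome.lower().split())


def remove_true_duplicates(outcomes):
    """Stage 1: drop outcomes that are spelled and worded the same."""
    seen, unique = set(), []
    for outcome in outcomes:
        key = normalise(outcome)
        if key not in seen:
            seen.add(key)
            unique.append(outcome)
    return unique


# Stage 2 input: hypothetical record of reviewer decisions, mapping a
# variant wording to the wording the reviewers agreed to keep.
AGREED_WORDING = {
    "number of days in hospital": "length of time in hospital",
    "platelet levels": "platelet level",
    "il-10 in blood": "serum IL-10",
}


def merge_same_meaning(outcomes, agreed_wording):
    """Stage 2: collapse outcomes judged identical in meaning into one wording."""
    seen, merged = set(), []
    for outcome in outcomes:
        preferred = agreed_wording.get(normalise(outcome), outcome)
        key = normalise(preferred)
        if key not in seen:
            seen.add(key)
            merged.append(preferred)
    return merged


verbatim = [
    "length of time in hospital",
    "number of days in hospital",
    "platelet level",
    "platelet levels",
    "serum IL-10",
    "IL-10 in blood",
    "platelet level",  # true duplicate
]
unique_outcomes = merge_same_meaning(remove_true_duplicates(verbatim), AGREED_WORDING)
print(unique_outcomes)
```

Keeping the reviewer decisions in an explicit mapping, rather than editing the extracted list by hand, leaves an audit trail from every verbatim outcome to the unique outcome that replaced it.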
The times after injury at which outcomes were measured were noted separately in order to a) assess the heterogeneity in outcome measurement timing and b) understand at what stage after injury the effects of the intervention were being assessed. If a single outcome was assessed at different timepoints, all assessment timings were recorded. Data extraction for the timing of outcome reporting from 10% of trials was undertaken independently by another researcher. Timings of outcome assessment were categorised into the following time periods after injury: < 1 month; > 1 month and < 3 months; > 3 months and < 6 months; > 6 months and < 1 year; > 1 year and < 3 years; and > 3 years. We reported two other outcome time periods, those assessed during acute hospitalisation and during burn wound healing, as these were commonly reported in the literature with no prescribed timepoint. However, it was clear from the reported length of stay and healing data that all these outcomes were assessed within six months of injury. The frequency of outcomes reported within each time period was recorded.
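The categorisation of assessment timings can be sketched in Python as below. The text does not state which period an assessment falling exactly on a boundary (for example, exactly one month) belongs to, so assigning it to the later period is an assumption of this sketch. The function name, labels and example timings are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Period boundaries in months after injury: 1, 3 and 6 months, 1 year, 3 years.
BOUNDARIES_MONTHS = [1, 3, 6, 12, 36]
PERIOD_LABELS = [
    "< 1 month",
    "1 to < 3 months",
    "3 to < 6 months",
    "6 months to < 1 year",
    "1 to < 3 years",
    "3 years or more",
]


def time_period(months_after_injury):
    """Map a measurement time (months after injury) to its reporting period."""
    if months_after_injury < 0:
        raise ValueError("time after injury cannot be negative")
    # bisect_right places a value equal to a boundary in the later period.
    return PERIOD_LABELS[bisect_right(BOUNDARIES_MONTHS, months_after_injury)]


# Hypothetical assessment timings, in months, for illustration only.
timings = [0.25, 0.5, 2, 5, 9, 18, 48]
frequency = Counter(time_period(t) for t in timings)
for label in PERIOD_LABELS:
    print(label, frequency[label])
```

Outcomes assessed "during acute hospitalisation" or "during burn wound healing" have no numeric timepoint and would need to be tallied separately from this numeric binning.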
The data were tabulated so that each study was listed with study and population details along with outcomes measured. Outcomes were extracted from this spreadsheet into another, with duplicates removed as described above. Outcomes measuring the same healthcare issue but at different timepoints were noted as one outcome for the final set. These final unique outcomes were then grouped into domains.

Classification of outcomes into domains
Outcome domains are groups of similar outcomes. This organisation is necessary because maintaining a large set of outcomes, when a significant number are similar, would make any future classification of the outcomes in terms of importance extremely challenging.
Outcomes were classified into domains in a three-stage iterative approach. In stage one, four researchers (a clinically trained burn care researcher, a burn research associate and two senior research nurses experienced in burn care) independently reviewed the list of outcomes and attributed a potential domain to each one using their own terms. In stage two, the researchers met to review the domains and agreed i) appropriate groupings of outcomes into domains and ii) an appropriate name for each domain. Rules for attribution of outcomes to domains were recorded in a coding log to ensure consistency. In stage three, a patient representative reviewed the outcomes and their attributed domains to check the clarity of each domain name and that the outcomes under each domain were appropriately attributed. A final meeting with an experienced outcome researcher was held to finalise outcomes and domains. A published classification system was not used, as none appeared to allow the flexibility or fit to the types of outcomes reported in burn care trials (32,33).
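One way to represent the agreed attribution of outcomes to domains is a simple lookup, sketched in Python below. The coding log entries and domain names are hypothetical examples, not the domains agreed in the review; the sketch only shows how a log can enforce that every unique outcome is attributed to a domain before outcomes per domain are counted.

```python
from collections import Counter

# Hypothetical coding log: unique outcome -> agreed domain name.
CODING_LOG = {
    "incidence of complete wound healing": "burn wound healing",
    "time to 50% epithelialisation of burn wound": "burn wound healing",
    "incidence of positive wound cultures": "burn wound infection",
    "days of antibiotics": "burn wound infection",
    "serum albumin": "blood chemistry",
}


def outcomes_per_domain(outcomes, coding_log):
    """Count outcomes in each domain, refusing any outcome missing from the log."""
    unassigned = [o for o in outcomes if o not in coding_log]
    if unassigned:
        raise KeyError(f"outcomes without an agreed domain: {unassigned}")
    return Counter(coding_log[o] for o in outcomes)


domain_counts = outcomes_per_domain(list(CODING_LOG), CODING_LOG)
for domain, count in sorted(domain_counts.items()):
    print(domain, count)
```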
The results described below indicate the characteristics of the reported studies and provide detail on heterogeneity of outcome reporting between studies, outcome definitions, timepoints and outcome domains.

Patient Involvement:
The need for a burn care Core Outcome Set project was conceived following discussions regarding clinical healthcare Key Performance Indicators with professionals and patients. The patients were vocal about outcomes important to them which they felt were overlooked by professionals, such as pain. The systematic review was discussed at regular project steering group meetings attended by three patients with burns and one parent of a child with burns. A patient with burns is a co-author and was involved with writing and editing of this article as well as with the naming of the outcome domains. Dissemination will be to the lay representatives of the steering group and will inform the Core Outcome Set study in which patients are actively involved.

Included studies and study protocols:
The initial search strategy identified 3,110 studies. Following de-duplication, a total of 2,070 studies remained.
Independent scrutiny of the titles and abstracts identified 306 potentially relevant articles for full text review. Of these, 158 studies did not meet our inclusion criteria and were excluded (PRISMA Flow Diagram Figure 1). Therefore, a total of 147 studies formed the basis of this study (24,.

Outcomes:
A total of 1,494 clinical outcomes were reported, of which 955 unique outcomes remained after de-duplication.
Of the 1,494 outcomes reported, 27.7% (421) were common across two studies or more. Of these outcomes, 50.3% (78) appeared in only two trials and 84.5% appeared in five trials or fewer. The number of outcomes reported per trial varied from one to 37 (median 9; IQR 8) (Table 4). No single outcome was reported across all 147 studies.
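The summary measures used here (outcomes per trial, outcomes shared between trials and outcomes reported by every trial) can be computed as in the following Python sketch. The four trials and their outcomes are hypothetical and are not data from the review, and the quartile method (`method="inclusive"`) is an assumption, as the method used for the IQR is not stated.

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical trials and their unique outcomes, for illustration only.
trial_outcomes = {
    "trial_A": {"length of hospital stay", "time to wound healing", "pain score"},
    "trial_B": {"time to wound healing", "incidence of wound infection"},
    "trial_C": {"mortality"},
    "trial_D": {"pain score", "scar quality", "itch",
                "length of hospital stay", "mortality"},
}

# Number of unique outcomes per trial, with median and quartiles.
outcomes_per_trial = [len(outcomes) for outcomes in trial_outcomes.values()]
median_per_trial = median(outcomes_per_trial)
q1, _, q3 = quantiles(outcomes_per_trial, n=4, method="inclusive")

# Number of trials reporting each outcome.
trials_per_outcome = Counter(
    outcome for outcomes in trial_outcomes.values() for outcome in outcomes
)
unique_outcomes = len(trials_per_outcome)
shared_outcomes = sum(1 for n in trials_per_outcome.values() if n >= 2)
reported_by_all = [
    outcome for outcome, n in trials_per_outcome.items()
    if n == len(trial_outcomes)
]

print(f"outcomes per trial: median {median_per_trial}, quartiles {q1} to {q3}")
print(f"{shared_outcomes} of {unique_outcomes} outcomes appear in two or more trials")
print(f"outcomes reported by every trial: {reported_by_all}")
```

An empty `reported_by_all` list corresponds to the finding that no single outcome was reported across all studies.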
Outcome definition variation: Outcomes assessing the same healthcare issue were commonly defined differently.
An example is burn wound healing which was defined in 166 different ways. Examples include: healing percentage at specified timepoints, incidence of complete wound healing, incidence of 30% wound healing and length of time until 50% epithelialisation of burn wound. Similar variation in definition of burn wound infection existed with 79 unique outcome definitions including: bacterial colonisation of burn wound, days of antibiotics, incidence of local infection, incidence of positive wound cultures, peri-wound redness, rate of bacterial clearance from wound and number of inflammatory cells in the wound.
Outcome timing variation: When the same outcome measured at different timepoints was counted separately, 2,743 outcomes were measured across the 147 RCTs; for example, size of burn wound measured at one week and again at two weeks was recorded as two different outcomes for this exercise. Of these, 76.9% (2,109) were assessed at less than six months after injury, 16.6% (456) were measured after six months and before three years after injury, and only 5.1% (140) were measured at more than three years after injury (Figure 2). The timing of outcome measurement was not reported for 38 outcomes.
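As a worked check of this arithmetic, the short Python sketch below recomputes the percentages from the counts given in the text. The figure for outcomes with no reported timing is derived here from the counts and is not stated in the text.

```python
# Counts taken from the text: outcome measurements by time after injury.
total_measurements = 2743
counts = {
    "less than six months after injury": 2109,
    "six months to three years after injury": 456,
    "more than three years after injury": 140,
    "timing not reported": 38,
}

# The four categories account for every measurement.
accounted_for = sum(counts.values())

percentages = {
    label: round(100 * n / total_measurements, 1) for label, n in counts.items()
}

print(f"accounted for: {accounted_for} of {total_measurements}")
for label, pct in percentages.items():
    print(f"{label}: {counts[label]} ({pct}%)")
```

The four counts sum to 2,743, so the three reported percentages together with the outcomes lacking a reported timing cover all measurements.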

Discussion
This systematic review aimed to examine outcome reporting in RCTs in burn care. Across the 147 included studies, 1,494 outcomes were identified, with overlap in terminology, inconsistent definitions and variation in the time after injury at which the outcomes were measured. Only 30% of the outcomes reported were included in more than one study. There was no single outcome reported across all 147 trials. Commonly reported outcomes were defined differently between trials, such as burn wound healing, which was defined in 166 different ways. Such heterogeneity of outcome reporting across trials will limit evidence synthesis and result in research wastage.
The findings in this review have been seen elsewhere in the burns-specific and other clinical literature. A Cochrane review of 30 RCTs concluded that it was impossible to draw conclusions about burn dressing effectiveness, as the trials evaluated a variety of clinical outcomes (18,179). Over the same period as this review, nine Cochrane reviews have had direct relevance to the management of patients with burns (18,180-187). None could draw firm conclusions due to methodological issues including heterogeneity of outcome reporting.
Heterogeneity is found in the reporting of outcomes relating to critical care, neurological illness, breast reconstruction surgery, prostate cancer, hip and knee replacement, oesophagectomy surgery, low back pain and in cardiac arrest trials among others (188-195). Variation in the definitions of outcomes has also been found within published studies of other healthcare areas. A systematic review of 90 papers reporting wound infection after general surgical procedures identified 41 definitions for wound infection itself, including three published by expert groups (196). Similarly, a total of 56 definitions were identified from 97 studies reporting anastomotic leak rates after gastrointestinal surgery despite publication of a standard definition two years before the beginning of the review (197).
In this review we identified and agreed the grouping of the 955 unique outcomes into 54 outcome domains.
There is no agreement between COS reviewers about how best to classify outcomes into domains. Williamson published a taxonomy for categorising outcome domains (198). Other authors have suggested different ways of doing this, all addressing different needs (32,33,199). In the Williamson taxonomy, the authors state that of 99 COS studies, 21 applied their own approach to outcome classification and only six used an existing system. As we had identified a large number of different clinical burn outcomes and as the outcomes we extracted did not clearly fall within the Williamson taxonomy, we decided to use our own approach to domain classification. We used five multi-disciplinary researchers and a patient working independently, and subsequently together, to bring different views and as little bias as possible to the process.
A solution to the variation in outcome reporting across trials described above is the development of a Core Outcome Set (COS) (21,200,201). A COS is a minimum set of the most important outcomes, agreed and recommended for measurement in all trials for a particular condition (31,32). While not limiting choice, a COS will pre-specify a set of outcomes to ensure consistency of reporting and the ability to collate evidence into systematic reviews by allowing researchers to compare 'like with like' (33). Trials can still select additional outcomes and primary outcomes in addition to the minimum core set. This approach has been shown to improve the consistency of outcome reporting (202,203). Although there is no COS for burn care, work was undertaken in 2008 to agree a set of burn outcome domains (198). However, the work was undertaken by a small group of clinicians, lacked patient involvement and reported little methodological detail (204).
Considerable work to develop COS methodology has also been undertaken since this publication (205,206). The  Table 5) and prioritising these domains using a consensus process (209)(210)(211).
The strengths of this review are that the protocol and data extraction proforma were pre-specified and the literature search was systematic and comprehensive, including four major healthcare trial databases. To account for multi-disciplinary perspectives, two researchers, two clinicians and a patient were involved in the domain process. It is also novel because it is the first to demonstrate, in detail and using systematic methodology, the scale of the heterogeneity of outcome reporting in burn care research. Limitations include the exclusion of publications in languages other than English. However, international publications were included to reduce the risk of selection bias. The search was also time-limited which may have excluded outcomes from older studies.
The reason for the time limitation was to identify research relevant to modern burn care. The search was also limited to trials reporting clinical outcomes. Other work is in progress to assess patient-reported outcomes in burn care research. This was a review undertaken systematically to a pre-specified protocol. However, a formal quality assessment of studies was not undertaken, as we were researching the reporting of outcomes and not attempting to analyse the effects of interventions. A COS for burn care research would address the issue of heterogeneity of outcome reporting between trials, lead to research that is more likely to measure relevant outcomes, enhance the value of burn care systematic reviews and reduce research waste.

Conclusion
We have shown that multiple different unique outcomes are reported in trials of burn care interventions.

Information sources (item 7): Describe all information sources (e.g., databases with dates of coverage, contact with study authors to identify additional studies) in the search and date last searched. Reported on: 4,5 Appendix A
Search (item 8): Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated. Reported on: Appendix A
Study selection (item 9): State the process for selecting studies (i.e., screening, eligibility, included in systematic review, and, if applicable, included in the meta-analysis). Reported on: 5
Data collection process (item 10): Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators. Reported on: 5,6
Data items (item 11): List and define all variables for which data were sought (e.g., PICOS, funding sources) and any assumptions and simplifications made. Reported on: 5,6
Risk of bias in individual studies (item 12): Describe methods used for assessing risk of bias of individual studies (including specification of whether this was done at the study or outcome level), and how this information is to be used in any data synthesis. Reported on: Not done. Justification p5
Summary measures (item 13): State the principal summary measures (e.g., risk ratio, difference in means).
Study selection (item 17): Give numbers of studies screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage, ideally with a flow diagram. Reported on: Figure 1, Table 2, p7
Study characteristics (item 18): For each study, present characteristics for which data were extracted (e.g., study size, PICOS, follow-up period) and provide the citations. Reported on: 7,8
Risk of bias within studies (item 19): Present data on risk of bias of each study and, if available, any outcome level assessment (see item 12). Reported on: N/A
Results of individual studies (item 20): For all outcomes considered (benefits or harms), present, for each study: (a) simple summary data for each intervention group (b) effect estimates and confidence intervals, ideally with a forest plot. Reported on: Not done. Justification p5
Additional analysis (item 23): Give results of additional analyses, if done (e.g., sensitivity or subgroup analyses, meta-regression [see Item 16]).

DISCUSSION
Summary of evidence (item 24): Summarize the main findings including the strength of evidence for each main outcome; consider their relevance to key groups (e.g., healthcare providers, users, and policy makers).