Developing a core outcome set for interventions to improve discharge from mental health inpatient services: a survey, Delphi and consensus meeting with key stakeholder groups.

OBJECTIVE
To develop a core set of outcomes to be used in all future studies into discharge from acute mental health services to increase homogeneity of outcome reporting.


DESIGN
We used a cross-sectional online survey with qualitative responses to derive a comprehensive list of outcomes, followed by two online Delphi rounds and a face-to-face consensus meeting.


SETTING
The core outcome set applies to acute adult mental health services.


PARTICIPANTS
Participants were recruited from five stakeholder groups: service users, families and carers, researchers, healthcare professionals and policymakers.


INTERVENTIONS
The core outcome set is intended for all interventions that aim to improve discharge from acute mental health services to the community.


RESULTS
In total, 93 participants completed the initial questionnaire, 69 completed Delphi round 1 and 68 completed round 2, with relatively even representation of groups. Eleven participants attended the consensus meeting. Service users, healthcare professionals, researchers, carers/families and end-users of research agreed on a four-item core outcome set: readmission, suicide completed, service user-reported psychological distress and quality of life.


CONCLUSION
Implementation of the core outcome set in future trials research will provide a framework to achieve standardisation, facilitate selection of outcome measures, allow between-study comparisons and ultimately enhance the relevance of trial or research findings to healthcare professionals, researchers, policymakers and service users.


Competing Interests
The authors report no competing interests.

Key Words
Acute Adult Mental Health Services; Core Outcome Set; Mental Health; Discharge; Care Transitions

Background
Care transitions (when patient care is transferred from one team, department or organisation to another) are widely recognised as a vulnerable and high-risk stage in the care pathway [1][2][3]. Safety issues may be intensified in acute mental health services, where care transitions are described as chaotic [3]. For example, suicide risk increases post-discharge from acute mental health services [4,5]. A growing body of research describes these risks either directly in terms of identified 'safety' events or indirectly in terms of broader 'problems', including, for example, treatment non-adherence, inappropriate readmissions and increased risk of self-injury or suicide attempts [3][6][7][8].
Internationally, researchers have attempted to find solutions to the problems or threats to safety associated with discharge from acute mental health services by developing interventions that aim to improve different aspects of discharge planning, transitions, continuity of care and follow-up care. Various types of discharge intervention are presented in the literature [9]. Some interventions aim to improve discharge by introducing new roles, for example a discharge co-ordinator who co-ordinates care and provides a single point of contact to help navigate the complex system [10]. Others focus on increasing contact between clinical staff and service users, for example using letters, videoconferencing or telephone follow-up [11][12][13]. Many interventions that were 'successful' in reducing readmission bridged the boundaries between ward and community, either by providing types of ward-based care in the community [14,15] or by having community teams lead discharge planning on the wards [16], with a focus on the early development of therapeutic relationships. Other successful interventions aim to solve a particular problem for a smaller group of acute service users, for example those at risk of homelessness, by providing financial and social support [17,18].
There has been little attempt to compare these diverse interventions. Existing reviews have included either a narrow range of studies addressing a single outcome or focus on a specific  [8,19]. Comparisons and meta-syntheses of intervention effectiveness have reported limited success. Across the papers included in our systematic review and those by other researchers [1,20], variation in the outcomes reported is substantial. This limits between-study comparability and delays the accumulation of evidence. Furthermore, outcomes in these trials were not necessarily representative of the measures that service users would consider important at discharge. Both matters can potentially be addressed with the development of a 'core outcome set', defined as "an agreed, standardised collection of outcomes which should be measured and reported, as a minimum, in all trials for a specific clinical area" [21].
The development and use of 'core outcome sets' has been endorsed as a means to reduce outcome heterogeneity in research, and to increase the relevance of research through the involvement of key stakeholders in its development [22]. There is an emerging body of literature highlighting the difficulties of defining and assessing outcomes in a mental health population [23]. There is also evidence of a lack of agreement amongst key groups about what should be measured and in what capacity, and an evident tension between the population health perspective and the provision of individualised care [19,23]. One of the reviews mentioned above identified the need for consensus on outcome definitions in discharge planning interventions [19]. Similarly, a recent King's Fund report suggested that broader consensus on the outcomes that matter is imperative for advancement [23]. Generating agreement amongst healthcare professionals, service users, policy makers and researchers is therefore a difficult but imperative task if healthcare services are to be usefully directed [23]. The difficulties are compounded when applied to care transitions, a multi-agency, multi-stage and complex period of the care pathway [3,24]. This paper outlines the development of a core outcome set for research on interventions to improve discharge from acute mental health wards to the community.

Study overview
The scope of the core outcome set was defined according to the criteria recommended by Core Outcome Measures in Effectiveness Trials (COMET) [25]. The health condition comprised functional conditions (mental disorders other than dementia, including severe mental illness such as schizophrenia). The population was adults aged 18-65, and the intervention was any intervention that aimed to improve discharge from an acute mental health setting to the community. The study had four stages, with service users and healthcare professionals included at each stage. The core outcome set was developed in the first three: (1) a long list of outcomes was generated through systematic review [1] and qualitative survey; (2) the resulting long outcome list was used to populate an online Delphi process; and (3) the results of the Delphi survey were appraised at a consensus meeting and the final core outcome set was established. After the core outcome set had been developed, a final stage (4) engaged stakeholders to make recommendations for the measurement of the core outcomes.
The process included a series of core research team meetings at every stage. The team comprised a researcher and core outcome set developer, an associate professor in mental health and mental health nurse, a researcher and expert by lived experience of acute services, and an expert in patient safety.
For the online questionnaires, groups of participants were recruited in various ways between December 2018 and January 2019. Academic researchers were recruited if their research had been included in our systematic review or if they were known researchers in the field identified by the team. End-users of research (policy makers, NGOs, NHS management, commissioners, advocates etc.) were recruited by searching for publicly available contact details or through our team's professional networks or social media. Service users and healthcare professionals were recruited through social media.
Twitter was chosen as the primary platform for distribution because of its ability to reach the specific communities of interest we required: mental health professionals, service users and families/carers. Using social media has been reported as a cost-effective and efficient way to recruit those from potentially stigmatised groups [26]. Further, the peer network structures of social media platforms enable users to recruit other users by sharing links within their networks. In order to reduce attrition in round 3, those who dropped out after the first questionnaire were invited to re-join the panel.

Participants
Participants did not fit into distinct homogeneous groups; for example, mental health professionals were sometimes also past service users or family members of service users. Similarly, some researchers had personal experience of inpatient mental health services. Therefore, wherever possible we considered the group as a whole and tried not to compare categories.
For the consensus meeting, UK participants were asked in the final round to indicate whether they would be interested in a face-to-face meeting. We invited a random sample of interested participants to attend, ensuring representation of the stakeholder groups. If a participant declined the invitation, a similarly matched participant was invited.

Stage 1: Gathering information
In addition to the outcomes extracted from the systematic review [1], outcomes of importance to service users, healthcare professionals, families, researchers and end-users of research were identified through qualitative surveys. Additional file 1 outlines the questions asked of each group; individuals associated with more than one stakeholder group were presented with the question sets relevant to each group. Informed consent was obtained before questionnaires were answered. Outcomes were identified both indirectly, by extrapolating from service users' experiences (e.g. What would make discharge from an acute mental health ward safe in your opinion?), and directly, by asking specifically about outcomes (e.g. Can you think of any important outcomes to measure in research assessing discharge interventions?). We used open questions that were developed to elicit potential additional outcomes. The questions were loosely modelled on questions developed with service users and healthcare professionals for a large-scale outcome generation study for a depression core outcome set [27]. The question format was adapted to a mental health discharge theme, and the views of a patient and public involvement (PPI) group (n=5) were sought to confirm the appropriateness of the questions and instructions.
Qualitative data were coded to identify outcomes and thematically synthesised [28]. This involved line-by-line coding of text and development of descriptive themes; the final stage involved generating analytical themes, which were converted into potential outcomes where applicable. Outcomes from the systematic review [1] and qualitative surveys were combined to generate a long list of outcomes. This list, along with relevant quotes from the qualitative data, was discussed by the core research team in a structured meeting. For each outcome, the group decided whether it should be a stand-alone outcome, be combined with other codes of a similar thematic nature, or be removed from the process because it was of limited importance for a core outcome set. For example, we agreed to merge closely related items (e.g. family relations and quality of interpersonal relationships) and to exclude outcomes considered to be of limited importance (e.g. too specific to a particular group: Autistic life; or to a particular intervention: Antipsychotic Polytherapy). Unless there was a unanimous decision to merge or remove an outcome, it remained as a stand-alone outcome. The group decisions about each outcome are documented in additional file 2. The final outcome list was used to populate the Delphi questionnaire. The outcomes list and instructions for the questionnaires were reviewed for face validity, understanding and acceptability by a PPI group and modified according to feedback. The round was open for 6 weeks beginning 10 December 2018.

Stage 2: Delphi survey
We ran the Delphi survey manually using Qualtrics, a secure online hosting platform [29]. Only participants who responded to the questionnaire in stage 1 were invited to take part in the first round of the Delphi (stage 2). The Delphi process was conducted across two rounds. In each round, participants were asked whether the items should become part of a core outcome set on a 1-7 scale described as: Strongly Agree (1), Agree (2), Slightly Agree (3), Neither Agree nor Disagree (4), Slightly Disagree (5), Disagree (6), Strongly Disagree (7). There was a free-text comments box, and participants were encouraged to provide comments that would be fed back anonymously to the group. Participants could suggest additional outcomes at the end of round one, which were reviewed by the core research team. Any outcome not already represented was added to round two. A link to the survey was sent via email. Each round remained open for 14 days and participants received two follow-up reminder emails. Round one was open from late February 2019 to early March 2019; round two was open from late March to early April 2019.
In round two median group scores for each outcome and anonymous comments for and against from the previous round were presented and participants were asked to reflect on the information presented and score each outcome again. The percentage of participant agreement with each outcome on a scale of 1-7 was calculated from the scores obtained during round one and again in round two.
The literature suggests that consensus levels should be set a priori at a minimum of 70% [25,30]. We unanimously chose a 75% consensus level, slightly higher than the minimum to increase sensitivity, while still allowing for a varied pool of applicable outcomes given the tension in the literature around disagreement between the opinions of service users, health professionals and policy makers on mental health outcomes [23]. Consensus criteria were defined a priori: outcomes scored as Agree or Strongly Agree (1-2) by 75% or more of the group reached consensus for inclusion and were included in the provisional core outcome set. Outcomes scored as Disagree or Strongly Disagree (6-7) by 75% or more were defined as having reached consensus for exclusion and were excluded. Outcomes not fulfilling criteria for consensus inclusion or exclusion were defined as not having reached consensus and were re-presented in round two.
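The a priori consensus rule described above can be expressed compactly. The following is a minimal sketch, not the study's actual analysis code; the function name `classify_outcome` and the return labels are hypothetical, and it assumes each outcome's ratings are available as a list of integers on the 1-7 scale.

```python
# Illustrative sketch of the a priori Delphi consensus rule (names are hypothetical).
INCLUDE_SCORES = {1, 2}  # Strongly Agree, Agree
EXCLUDE_SCORES = {6, 7}  # Disagree, Strongly Disagree


def classify_outcome(scores, threshold=0.75):
    """Classify one outcome from its list of 1-7 ratings."""
    if not scores:
        raise ValueError("at least one rating is required")
    if any(s not in range(1, 8) for s in scores):
        raise ValueError("ratings must be integers from 1 to 7")
    n = len(scores)
    share_agree = sum(s in INCLUDE_SCORES for s in scores) / n
    share_disagree = sum(s in EXCLUDE_SCORES for s in scores) / n
    if share_agree >= threshold:
        return "include"       # consensus for inclusion
    if share_disagree >= threshold:
        return "exclude"       # consensus for exclusion
    return "no consensus"      # re-presented in the next round


print(classify_outcome([1, 2, 1, 2, 3, 1, 2, 1]))  # 7 of 8 ratings are 1-2
```

Because the two shares are computed over the same panel, an outcome cannot satisfy both criteria at a 75% threshold, so the order of the two checks does not affect the result.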
As no outcomes met the original criteria for consensus for exclusion after round one, the research team agreed to redefine consensus for exclusion as 50% or fewer of participants scoring the item as Strongly Agree or Agree (1-2). Relaxing exclusion criteria after round one has been used effectively in past core outcome set research [30].

Stage 3: Consensus meeting
The results of the Delphi survey were presented at a consensus meeting, the main goal of which was to decide which items would be included in the final core outcome set. The meeting comprised a short overview of the study followed by a summary of how each stakeholder group had scored each outcome, beginning with the outcomes that met consensus [31]. The meeting was chaired by an independent researcher with expertise in consensus methodology who was not a member of the core research team. Participants were sampled to achieve a balanced representation of service users, healthcare professionals, researchers and end-users of research. We aimed to have a small representative group of between 9 and 12 to enable meaningful small group discussions, similar to consensus meetings chaired by the facilitator in other fields [32,33]. International participation was restricted because of budgetary constraints. Outcomes identified in rounds one and two of the Delphi as having reached consensus for inclusion were presented, and participants were asked if there were any fundamental reasons why these should not be included in the core outcome set. Divergent views were actively sought and the chair ensured everyone had the opportunity to participate in discussions before voting commenced. Outcomes from the preliminary core outcome set were discussed in terms of feasibility and voted upon. Voting was conducted anonymously using cards in an envelope with binary response options (include/exclude).
Voting and consensus criteria followed the same format as in the Delphi (75% for inclusion). Results were presented after voting on all outcomes had finished. Outcomes that had reached consensus for exclusion, or no consensus, in the Delphi were reviewed, and participants were asked if there were any fundamental reasons why these should be included in the core outcome set; individual outcomes were discussed and voted on only if proposed as important by a meeting participant. Outcomes meeting criteria for consensus were included in the core outcome set; all other items were excluded. The meeting finished with the presentation and ratification (a final review and discussion) of the four-item core outcome set.
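To make the 75% meeting threshold concrete, the sketch below works out how many include votes are needed for a given number of voters. It is illustrative only: the function names are hypothetical, and it assumes every attendee casts a vote on each outcome, which the text does not state.

```python
import math


def votes_needed(n_voters, threshold=0.75):
    """Smallest number of 'include' votes that reaches the threshold."""
    return math.ceil(n_voters * threshold)


def meets_inclusion(include_votes, n_voters, threshold=0.75):
    """True if the share of 'include' votes is at or above the threshold."""
    return include_votes / n_voters >= threshold


# With 11 voters (the reported meeting attendance), 9 include votes are required:
# 8 of 11 is about 72.7%, which falls just short of 75%.
print(votes_needed(11))        # 9
print(meets_inclusion(8, 11))  # False
print(meets_inclusion(9, 11))  # True
```

In a group of this size a single vote moves the percentage by roughly nine points, so outcomes close to the threshold can change status on one participant's decision.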

Stage 4: Preliminary Measurement Recommendations
After the core outcome set was agreed in the consensus meeting, we invited all participants who had been involved in any of the three previous online rounds to recommend measures and time markers in a final online questionnaire. The invitation made it clear that the questionnaire was most relevant to researchers, because of the specific knowledge of instruments required to complete this round, but that other groups with an opinion or interest were welcome to contribute.
In this questionnaire participants were presented with the four core outcomes. For each core outcome they were presented with any measures used to assess that outcome in our systematic review studies [1] and any additional measures that had been recommended to the team during the process. Participants were asked to choose the single most appropriate measure ('don't know', 'other', 'new instrument' and 'no instrument' were also options). A second question asked which time markers would be recommended, with the option to select all that applied. These options were also developed from the time markers used in the systematic review [1]. Our findings are reported in line with the Core Outcome Set-Standards for Reporting (COS-STAR) guidance [31]. The study was prospectively registered with the COMET initiative. The study was approved by the University of Nottingham Business School ethics committee.

Role of the funding source
The funding source was not involved in study design; in the collection, analysis and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.

Stage 1: Information gathering
Our systematic review has been described in detail elsewhere [1]. In summary, 69 outcome categories were identified from 45 studies. Ninety-three participants from 12 countries completed the information gathering questionnaire, including 27 who identified as service users, 17 as family/carers, 39 as healthcare professionals, 15 as end-users of research and 37 as researchers. As noted above, many chose multiple categories. Additional file 3 presents participant demographics. Qualitative questionnaires revealed an additional 45 outcomes that were not identified in the literature (for example, outcomes concerning involvement in discharge planning; see additional file 2). After discussion within the research team, 82 standardised outcome terms were taken forward into the Delphi process; 19 outcomes were combined/collapsed and 13 were removed (see additional file 2).

Stage 2: Delphi process
Sixty-nine participants completed round one of the Delphi (22 service users, families and carers; 26 researchers; and 21 healthcare professionals and decision makers) and 68 participants completed round two (30 researchers; 18 service users and families; and 20 healthcare professionals and decision makers). Whilst 5 participants dropped out after round one, 4 new participants joined the panel in round two. The attrition rate from the first questionnaire to round one of the Delphi was 25.8%, and there was 1.4% attrition between rounds one and two of the Delphi. Seven additional outcomes were proposed during round one, of which two were added to round two after a core team discussion. The full list of Delphi items is available in additional file 4.
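The attrition figures above follow directly from the reported panel sizes. The short sketch below reproduces them; the helper name `attrition_rate` is hypothetical, and the calculation treats attrition as the net change in panel size between consecutive rounds, which is the reading that matches the reported percentages.

```python
def attrition_rate(n_before, n_after):
    """Net attrition between two rounds, as a percentage of the earlier panel."""
    return (n_before - n_after) / n_before * 100


# First questionnaire (93) to Delphi round one (69)
print(round(attrition_rate(93, 69), 1))  # 25.8

# Delphi round one (69) to round two (68)
print(round(attrition_rate(69, 68), 1))  # 1.4

# Round two panel size: 5 participants left and 4 joined
assert 69 - 5 + 4 == 68
```

Note that the 1.4% figure is a net value: the 5 participants who dropped out after round one correspond to 5/69, or about 7.2% of the round one panel, offset by the 4 who joined.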
After round one, 14 outcomes met the criteria for consensus inclusion: Service user involvement in discharge planning; Functioning; Mental health and illness; Personal Recovery; Service user understanding of discharge plan; Quality of life; Suicide Completed; Readmission; Service user involvement in decision making; Service user satisfaction with information provision at discharge; Service user knowledge of how to access community support; Recurrence; Suicide Attempted; and Discharge to appropriate accommodation (see table 1). Twenty outcomes met the revised criteria for consensus for exclusion. Forty-eight outcomes did not meet consensus criteria and were re-presented to the group in round two. Therefore, 50 outcomes were presented in round two, including the two newly added outcomes. Only one outcome met the criteria for consensus inclusion after this round: meaningful activity. No outcomes met criteria for exclusion and 49 did not reach consensus. Additional file 4 shows consensus levels for each outcome in each round.

Stage 3: Consensus meeting
Eleven participants attended the consensus meeting. As in previous rounds, the categories were not exclusive: six participants were researchers, three identified as service users, three as healthcare professionals and three as end-users of research (see table 2). Table 3 shows the quantitative results of the meeting.
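The outcome counts reported for stage 1 and the two Delphi rounds can be reconciled with simple arithmetic. The sketch below is a consistency check only; the function name `outcome_flow` is hypothetical, and it assumes that each combined/collapsed or removed outcome shortened the long list by exactly one item, which is the reading under which the reported totals agree.

```python
def outcome_flow():
    """Recompute the outcome counts reported for stage 1 and the Delphi rounds."""
    # Stage 1: outcomes from the systematic review plus the qualitative survey.
    long_list = 69 + 45
    # Assumption: each combined/collapsed (19) or removed (13) outcome
    # shortens the list by one item.
    delphi_round_one = long_list - 19 - 13

    # Round one results as reported.
    included_r1, excluded_r1, no_consensus_r1 = 14, 20, 48
    assert included_r1 + excluded_r1 + no_consensus_r1 == delphi_round_one

    # Round two: undecided outcomes plus the two participant-suggested additions.
    delphi_round_two = no_consensus_r1 + 2
    included_r2, no_consensus_r2 = 1, 49
    assert included_r2 + no_consensus_r2 == delphi_round_two

    return {
        "long_list": long_list,
        "delphi_round_one": delphi_round_one,
        "delphi_round_two": delphi_round_two,
        "preliminary_core_set": included_r1 + included_r2,
    }


print(outcome_flow())
```

Under that assumption the figures are internally consistent: 82 items entered round one, 50 entered round two, and 15 formed the preliminary core outcome set taken to the consensus meeting.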
Each outcome in the preliminary 15-item core outcome set was considered individually. Discussions indicated that many of the outcomes were elements of an ideal discharge, or process outcomes/variables, but probably not measurable outcomes that should be included in a core outcome set. After these discussions and independent, anonymous voting, five items no longer met consensus criteria for inclusion: 'Service user involvement in discharge planning' and the associated items 'Service user understanding of discharge plan', 'Service user involvement in decision making', 'Service user satisfaction with information provision at discharge' and 'Service user knowledge of how to access community support'. There was strong agreement amongst the group that these are very important elements of a successful discharge, but not core outcomes, because of issues surrounding validity and meaning.
In the discussion we present these five outcomes as a potential additional important self-reported measure of service user satisfaction and involvement.
'Mental Health and illness' was initially close to consensus, with 73% voting to include. However, those who chose to exclude it found it too vague and explained that they were most interested in measuring acute psychological distress, rather than mental health and illness. The service user representatives in the group interpreted "recovery" to mean a complete amelioration of symptoms, and described how individuals continue to experience distress and difficulties with their mental health even when in "recovery". We therefore chose to separate the broader mental health and illness outcome into self-reported psychological distress and clinician-reported mental health. The more granular outcome of self-reported psychological distress resulted in 100% consensus to include. In contrast, clinician-reported mental health did not meet consensus criteria (45%). Similar discussions took place around the recurrence (relapse) outcome: including it in a core outcome set would ultimately necessitate buy-in to a criteria model which suggests that mental health problems could and should be completely resolved.
Discussions around the 'suicide attempted' outcome indicated that participants felt that suicide attempts and self-harm had diverse motivations and definitions; they discussed the difficulty of delineating the boundaries between self-harm and suicide attempts and how this is documented. After the consensus meeting this outcome no longer met consensus criteria for inclusion. Discussions surrounding personal recovery, functioning and meaningful activity indicated that participants considered these outcomes too vague and subjective to be a component of a core outcome set. There was consensus to exclude meaningful activity and recovery, and no consensus to include personal recovery. On completion of the meeting, only four outcomes met consensus criteria for inclusion (see figure 1). A core outcome set of four was ratified and participants agreed that the following should be included: readmission, quality of life, suicide completed and service user-reported psychological distress. Readmission was the most frequently used outcome in past research and, despite its limitations, participants felt it was one of the only proxy measures of appropriate discharge. Quality of life and psychological distress were considered important ways of quantitatively assessing the psycho-social elements of discharge, which are of primary importance. Suicide completed was considered rare but imperative to capture, given the growing body of literature highlighting the relationship between discharge from acute mental health services and suicide [5,34]. Figure 2 shows the process undertaken to reach the core outcome set.

Stage 4: Preliminary Measurement Recommendations
Forty-three of the 93 invited participants responded in the final round (15 service users, 8 family members/carers, 23 researchers, 10 healthcare professionals and 3 end-users of research), although as in previous rounds these were not distinct categories. Fifty-three percent of the respondents were researchers. This was expected, as the invitation email suggested that this stage might be more meaningful or of more interest to this group, but as a team we chose not to exclude other groups with opinions on measurement instruments. Twenty-three percent of participants were international researchers (from the USA, Switzerland, Canada and Australia). Table 4 shows the preliminary minimum measure recommendations and time markers; additional file 5 shows the results upon which the recommendations were based.
Table 4. Preliminary minimum measure recommendations and time markers (surviving row: Quality of Life; ReQoL-10; one month post-discharge).

Readmission
As a minimum, participants recommended retrospective review of administrative data for readmissions within a defined time period, the most agreed being 28 days. Participants indicated that routine data collection might cover slightly different time periods. Twenty-six of the 43 participants recommended a measure of around a month (1 month, 30 days or 28 days), with 28 days being the most popular. However, they also advised that, where possible, this should be supplemented by cross-checking with service users, case managers or carers to improve the quality of the data. Those looking for more comprehensive data may also like to record readmission at 7 days, 3 months and 6 months, as these were also popular recommendations.
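As an illustration of how the 28-day readmission marker might be derived from administrative records, the sketch below flags a readmission from a discharge date and a readmission date. It is not a prescribed method: the function name `readmitted_within` is hypothetical, and it assumes the window is counted in calendar days from the discharge date and includes day 28 itself. Studies should state their own counting convention.

```python
from datetime import date


def readmitted_within(discharge, readmission, window_days=28):
    """Return True if a readmission falls within the window after discharge.

    `readmission` is None when no readmission is recorded.
    Assumption: day 0 (same-day readmission) and the final day are both counted.
    """
    if readmission is None:
        return False
    days = (readmission - discharge).days
    return 0 <= days <= window_days


discharge_date = date(2019, 1, 1)
print(readmitted_within(discharge_date, date(2019, 1, 29)))  # 28 days later: True
print(readmitted_within(discharge_date, date(2019, 1, 30)))  # 29 days later: False
print(readmitted_within(discharge_date, None))               # no readmission: False
```

The same function can be reused for the supplementary time markers by changing `window_days` (for example 7 for the 7-day marker), provided the conversion of months into days is reported explicitly.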

Quality of Life
The participants recommended that researchers use the Recovering Quality of Life measure (ReQoL-10) at one month post-discharge [35]. This was the instrument most recommended by the group, although a large proportion of participants also voted for the ReQoL-20. Participants felt the ReQoL was most appropriate because it is a quality of life measure specific to mental health recovery. The one-month time marker is in keeping with the other COS time frames, making for a more coherent and accessible core outcome set. Those using within-participant measures of quality of life may also like to measure a pre-discharge baseline. Researchers looking for a more thorough assessment of quality of life may also like to measure at 7 days and 3 months post-discharge, as these were also highly recommended time markers, or to use the ReQoL-20 and report both scores for comparability.

Suicide Completed
The participants recommended retrospective review of administrative data for suicide completed within 28 days of discharge. Retrospective review is in line with the other outcomes and was, by a small margin, the most frequently suggested method. We chose 28 days for consistency with the readmission data. Researchers looking for more comprehensive data may want to use 7 days and 3 months, as these were also highly recommended. They may also want to cross-check this information against other sources (carers/case managers) to ensure it is correct and reported, particularly as participants mentioned the impact of incorrect coroner's reports on such data.

Psychological Distress
The participants recommended the Kessler Psychological Distress Scale (K10) at one month post-discharge [36]. For consistency with the other outcomes we recommend measurement at one month. Seven days and 3 months were also highly recommended, so we would recommend adding these for more comprehensive research. However, there were very few votes for instruments for psychological distress, and qualitative comments revealed that participants felt this outcome is not measurable. The same number of people who voted for the K10 also voted for interviews or other measures, a similar number recommended the development of a new measure, and the CORE-10 was similarly close [37]. Whilst we make this recommendation, we also suggest that future researchers may look to develop a measure specific to psychological distress for this core outcome set. Interviews would not effectively facilitate between-study comparison, which is the key purpose of a COS.

Discussion
This study provides the first international consensus on outcomes for intervention studies concerning discharge from an acute adult mental health inpatient setting. We could not identify any other published core outcome sets for interventions concerning discharge from acute mental health services. Moreover, there are very few core outcome sets for mental health, despite recommendations for consensus in the literature [20,23]. All the included outcomes were identified as critically important by more than 75% of a group of relatively equally-represented service-users and family/carers, health-care professionals, researchers and end-user of research using consensus methods. We recommend that all future research studies evaluating interventions for discharge from acute adult mental health settings use this core outcome set as a framework for outcome selection, to compliment, rather than replace any other outcomes that are relevant to their research question. We suggest that researchers can and should chose other outcomes related to their own research question in addition to these four items. As discharge from acute services is a particularly challenging period for those experiencing mental health problems [3,34], it's important to understand what interventions work and more specifically which elements of an intervention improve which particular outcomes. This core outcome set provides a framework for between-study comparison, ultimately enabling researchers to articulate the theory of change that underpins interventions.
In our systematic review [1], we identified 22 studies that reported readmission rates as an outcome, yet almost all of them captured this in different ways: some used self-report data, some used clinical case notes or retrospective administrative data, and others used case managers' reports. In addition, the time markers were variable: some used country-specific time markers in line with policy, such as 28 days in the UK [38], whilst others chose a series of time markers such as within 1 month, 3 months and 6 months, but the time markers were rarely directly comparable. Similarly, six studies measured quality of life, but only two used the same measurement instrument (Lehman's Quality of Life) [16,39]. In the current study, we have developed consensus that Quality of Life and Readmission are important and feasible to measure, but we also make recommendations about how to measure them to reduce heterogeneity of outcome reporting.
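To illustrate how fixed time markers make readmission data directly comparable across studies, the following is a minimal sketch that flags readmission within several windows from administrative dates. The function name, the choice of 91 and 182 days to approximate 3 and 6 months, and the decision to treat each window as inclusive are assumptions made for illustration only; they are not specified in this study.

```python
from datetime import date

# Hypothetical time markers in days. The 28-day marker follows the UK policy
# marker named in the text; 91 and 182 days are assumed approximations of
# 3 and 6 months.
TIME_MARKERS_DAYS = {"7 days": 7, "28 days": 28, "3 months": 91, "6 months": 182}


def readmission_flags(discharge, readmission):
    """Return, for each time marker, whether readmission fell within the window.

    `readmission` is None when no readmission was recorded.
    """
    if readmission is None:
        return {label: False for label in TIME_MARKERS_DAYS}
    days = (readmission - discharge).days
    if days < 0:
        raise ValueError("readmission date precedes discharge date")
    # Windows are treated as inclusive (an assumption).
    return {label: days <= limit for label, limit in TIME_MARKERS_DAYS.items()}


# Example: readmitted 24 days after discharge.
flags = readmission_flags(date(2019, 1, 1), date(2019, 1, 25))
print(flags)
```

Reporting the same set of flags in every study would allow pooled comparison even where the underlying data sources differ.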
There were some unexpected exclusions from the core outcome set; for example, mental health symptoms and treatment adherence were quite frequently used in past research [1], but were not included in the core outcome set. At the beginning of the paper we described the recent King's Fund report that suggested generating agreement amongst healthcare professionals, service users, policy makers and researchers is a difficult but imperative task [23]. We found this reflected here and feel that the small four-item core outcome set represents the only outcomes that are unanimously agreed upon as essential, despite so many outcomes being of utmost importance to service-users and families. This research has further highlighted the importance of shared decision making and service-user and family involvement to all stakeholder groups [40]. This consensus study indicates a desire from all groups to monitor levels of service-user satisfaction and involvement in the process. Whilst such outcomes were excluded in later stages, primarily on the basis of being process variables/outcomes, this does not reduce their prospective importance in discharge interventions or provision of care at discharge. The five most agreed upon elements of service-user involvement and satisfaction in discharge were: Service user involvement in discharge planning; Service user understanding of discharge plan; Service user involvement in decision making; Service user satisfaction with information provision at discharge; Service user knowledge of how to access community support. We recommend that future policy makers and healthcare management consider incorporating all of these elements into local level initiatives as overriding principles of care rather than interventions/measures, if they are ever missing from current provision. Furthermore, work highlighting the importance of involving service users in mental health care planning is beginning to emerge, along with measures of such activity.
Therefore, we suggest that future research should, wherever possible, include a service user reported outcome measure of involvement alongside the 4-item core outcome set and any other chosen measures. This could be measured using an existing instrument of service-user involvement in care planning in mental health, such as the EQUIP PROM (Patient Reported Outcome Measure) [40]. Alternatively, researchers could use the 6 outcomes presented above to develop a self-reported Likert measure of service user involvement specifically in discharge planning. These 6 items were developed from synthesis of academic literature and qualitative questionnaires and met criteria for consensus amongst experts, so from a psychometric perspective they would arguably meet initial face and content validity criteria [41], see additional file 6.
The difficulties of developing a mental health core outcome set were further epitomised when applied to care transitions: a service-level (rather than specific clinical population), multi-agency, multi-stage, complex period of the care pathway [3,24]. Generating a set of meaningful, applicable outcomes that span primary and secondary care, across multiple physical locations, and that are relevant for every service user was imperative. For example, a great deal of past literature focuses on housing interventions [42][43][44], and whilst housing is a significant safety issue at discharge, it is not necessarily relevant to all service users. This multi-agency, multi-morbidity complexity was arguably one factor that resulted in the small set of generic outcomes, which differs from the narrowly defined clinical core outcome sets, with many more outcomes, reported in past literature [45,46].
This study had several strengths. Our method is based on recommendations from an international panel of experts [25]. Inclusion of service-users and health-care professionals at every stage ensured that outcomes in the final core set embody shared priorities. The comprehensive and laborious long-list process ensured all potential outcomes were considered in the course of the consensus process. However, there were some limitations to our study. The research was only conducted in English, due to budgetary constraints, although our online rounds included participants from 12 countries. The attrition rate between the first questionnaire and the first Delphi round could be considered relatively high; however, there was little attrition between consensus rounds. The group with the highest level of attrition after round one was service users and families, and their replacements in round two were primarily researchers, which could have affected the results in round two. Furthermore, in many consensus meetings additional outcomes are added; the method infrequently serves as a means of reducing the number of outcomes included in the preliminary core outcome set from the Delphi [30]. However, in our case we found that the group did not agree with many of the outcomes and the set was reduced to a very small COS of 4 items. This is beneficial in some ways, as we hope it is easier for researchers to operationalise a four-item core outcome set. However, to ensure that the spectrum of perceptions of safety by each group is addressed, we recommend using the COS alongside an additional service user reported outcome measure (PROM) that encompasses the other outcomes that had high consensus in the Delphi.
We have made preliminary recommendations of measures and time markers for the four core outcomes; however, we recommend that future research tests the feasibility and effectiveness of these measures, with an expectation that they may be revised. The recommendations may be UK-centric, given that only 23% of the panel were international researchers and the 28-day readmission rate is in line with UK policy [38]. We therefore invite international researchers to test or comment on the feasibility of this core outcome set outside of the UK. There was also no concrete consensus on any measure or time marker; we have therefore made recommendations informed by the predominant opinions of experts to ensure that the core outcome set can be operationalised immediately. We acknowledge that it will often be necessary for outcomes and measures to be adapted or augmented in future research to include additional or complementary items, but feel the need for homogeneity in outcome reporting is pressing.
The use of outcomes in mental health research and services is becoming more contested in terms of what is meaningful and effective. It could be argued that core outcome sets are less applicable to mental health populations than general health populations, given the complexity of mental health problems and the subjectivity of measuring them. However, as core outcome sets are relatively uncommon in mental health, we believe that (as in other clinical populations) a small, agreed, feasible set of core outcomes will facilitate between-study comparability and advancement in evidence collection [21,25].

Future Directions
Development of this core outcome set involved participation of stakeholders (primarily researchers) from 12 different countries; however, we recommend that further work should be undertaken to validate this core outcome set more widely, particularly in non-English speaking populations. Two of the final four outcomes, and many of the preliminary 15 outcomes to emerge from the Delphi, are not necessarily specific to mental health care transitions. Some outcomes are comparable to a similar core outcome set for care transitions of adolescents and young adults with special healthcare needs [47]. Future research may consider a 'transitions of care' core outcome set, to reduce the number of similar core outcome sets.

Conclusion
The four outcomes included in our outcome set represent the consensus opinion of a group of service-users, health-care professionals and international researchers, and address an unmet need: assisting researchers in the design, implementation and reporting of interventions that aim to improve discharge from acute mental health settings. Ultimately, application of this core outcome set will enhance the relevance of future interventions to health-care professionals, the research community and service-users. If used, the core outcome set could provide more evidence-based interventions, underpinned by a theory of change outlining the relationships between the components of the intervention and the outcomes they should improve [1,48], which should increase service-user safety at this distressing time.

Patients' consent and permission to publish
All participants gave informed consent to be involved in the study ahead of data collection.

Author Contributions
NT conceived the design of the study; conducted the literature search, meta-synthesis and Delphi; analysed the data; and drafted the majority of the manuscript. JW and NW contributed equally and provided oversight of the study design, analysed and synthesised the data and contributed significantly to the drafting of the manuscript. AG analysed and synthesised data, provided an expert by lived experience opinion on decisions regarding wording, study design and PPI involvement, and contributed adaptations to the manuscript.

Might be easier to gather administrative data, but worth cross-checking to improve quality of information
Self-reported questionnaire 1
Other-carer interview 2
Total 38

Conclusion: A minimum recommendation of using retrospective review of administrative data. This will allow for various studies with diverse time and financial limits to use the COS. However, we also advise that this should be supplemented by cross-checking with service-users, case managers or carers, where possible, to improve quality of data.

Time Markers
Conclusion: The minimum recommendation is to record readmission within 28 days. Twenty-six of the 43 participants recommended a measure of around a month (1 month, 30 days or 28 days), with 28 days being the most popular. Those looking for higher quality or more comprehensive data may also like to record 7 days, 3 months and 6 months, as these were also popular recommendations.

I think the use of tools should be complemented with interviews with service users and carers.
Total 33
ReQoL combined (10+20) 12

Outcome 2: Quality of Life
Conclusion: We recommend that researchers use ReQoL-10. This was the most voted-for instrument. If we also combine these votes with those for ReQoL-20, a large proportion of the group suggested this instrument. As this is a quality of life measure specific to mental health recovery, we feel it is the most appropriate.

Conclusion:
We recommend a minimum measure of QoL at one month post-discharge in RCTs. This is in keeping with the readmission time frames, making a more comprehensive and accessible core outcome set. Those using within-participant measures of quality of life may like to also measure a pre-discharge baseline. Those looking for more thorough assessment of QoL may like to also measure at 7 days post-discharge and 3 months, as these were also highly recommended time markers.

The minimum recommendation is one month post-discharge. For consistency with other outcomes we recommend measurement at one month. Seven days and 3 months were also highly recommended, so we would recommend these for more thorough research.

Strengths and Limitations
• This is the first initiative to reduce heterogeneity in outcome reporting for interventions that improve discharge from acute mental health services.
• We achieved a high level of consensus amongst 69 service users, families/carers, healthcare professionals, researchers and policy makers.
• Although the stakeholder group included international researchers, service users and healthcare professionals were from the UK.

Competing Interests
The authors report no competing interests.

Background
Care transitions (when patient care is transferred from one team, department or organisation to another) are widely recognised as a vulnerable and high-risk stage in the care pathway [1][2][3]. Safety issues may be intensified in acute mental health services, where care transitions are described as chaotic [3]. For example, suicide risk increases post-discharge from acute mental health services [4,5]. A growing body of research describes these risks either directly in terms of identified 'safety' events or indirectly in terms of broader 'problems', including for example treatment non-adherence, inappropriate readmissions, increased risk of self-injury or suicide attempts [3][6][7][8].
Internationally, researchers have attempted to find solutions to the problems or threats to safety associated with discharge from acute mental health services by developing interventions that aim to improve different aspects of discharge planning, transitions, continuity of care, and follow-up care. There are various types of discharge intervention presented in the literature [9]. Some interventions aim to improve discharge by introducing new roles, for example a discharge co-ordinator who co-ordinates care and provides a single point of contact to help navigate the complex system [10]. Others focus on increasing contact between clinical staff and service users, for example using letters, videoconferencing or telephone follow-up [11][12][13]. Many interventions that were 'successful' in reducing readmission bridged the boundaries between ward and community, either by providing types of ward-based care in the community [14,15] or by having community teams lead discharge planning on the wards [16], with a focus on the early development of therapeutic relationships. Other successful interventions aim to solve a particular problem for a smaller group of acute service users, for example those at risk of homelessness, by providing financial and social support [17,18]. There has been little attempt to compare these diverse interventions. Existing reviews have either included a narrow range of studies addressing a single outcome or focused on a specific time frame in an attempt to synthesise results [8,19]. Comparison and meta-synthesis of the effectiveness of interventions have had limited success. Across the papers included in our systematic review and those by other researchers [1,20], variation in the outcomes reported is substantial. This limits between-study comparability and delays advancement in evidence collection. Furthermore, outcomes in these trials were not necessarily representative of the measures that service users would consider important at discharge.
Both matters can potentially be addressed with the development of a 'core outcome set', defined as "an agreed, standardised collection of outcomes which should be measured and reported, as a minimum, in all trials for a specific clinical area" [21].
The development and use of 'core outcome sets' has been endorsed as a means to reduce outcome heterogeneity in research, and to increase the relevance of research through the involvement of key stakeholders in its development [22]. There is an emerging body of literature highlighting the difficulties of defining and assessing outcomes in a mental health population [23]. There is also evidence of a lack of agreement amongst key groups about what should be measured and in what capacity, and an evident tension between the population health perspective and provision of individualised care [19,23]. One previous review identified the need for consensus on outcome definitions in discharge planning interventions [19]. Similarly, a recent King's Fund report suggested that broader consensus upon the outcomes that matter is imperative for advancement [23]. Therefore, generating agreement amongst healthcare professionals, service users, policy makers and researchers is a difficult but imperative task, to enable the useful direction of healthcare services [23]. The difficulties are further exemplified when applied to care transitions, a multi-agency, multi-stage, complex period of the care pathway [3,24]. This paper outlines the development of a core outcome set for research into interventions to improve discharge from acute mental health wards to the community.

Study overview
The scope of the core outcome set was defined according to the criteria recommended by Core Outcome Measures in Effectiveness Trials (COMET) [25]. The health condition was functional conditions (mental disorders other than dementia, including severe mental illness such as schizophrenia). The population was adults aged 18-65, and the intervention was any intervention that aimed to improve discharge from an acute mental health setting to the community. The core outcome set was developed using four stages, including service users and healthcare professionals at each stage: (1) a long list of outcomes was generated through a systematic review [1] and qualitative survey; (2) the resulting long outcome list was used to populate an online Delphi process (2 rounds); and (3) the results of the Delphi survey were appraised at a consensus meeting and the final core outcome set was established. After the development of the core outcome set, a final stage (4) engaged stakeholders to make recommendations for the measurement of the core outcomes. The process included a series of core research team meetings at every stage. The team comprised a researcher and core outcome set developer, an associate professor in mental health and mental health nurse, a researcher and expert by lived experience of acute services, and an expert in patient safety. Participants did not fit into distinct homogeneous groups; for example, mental health professionals were sometimes also past service users or family members of service users. Similarly, some researchers had personal experience of inpatient mental health services. Therefore, wherever possible we considered the group as a whole and tried not to compare categories.

Participants
Participants were recruited in a number of ways between December 2018 and January 2019. Academic researchers were recruited if their research had been included in our systematic review or if they were known researchers in the field identified by the team. End-users of research (policy makers, NGOs, NHS management, commissioners, advocates etc.) were recruited by searching for publicly available contact details or through our team's professional networks or social media. Service users and healthcare professionals were recruited through social media. Twitter was chosen as the primary platform for recruitment due to its ability to reach the specific communities of interest we required: mental health professionals, service users and families/carers. Using social media has been reported as a cost-effective and efficient way to recruit those from potentially stigmatised groups [26]. Further, the peer network structures of social media platforms enable users to recruit other users through sharing links within their networks.
The same participant group was used throughout the iterative research process; therefore, in order to reduce attrition, those who dropped out in early rounds were invited to re-join the panel in subsequent rounds. Participants were recruited for the consensus meeting during the Delphi: UK participants were asked to indicate whether they would be interested in a face-to-face meeting. We invited a random sample of interested participants to attend, ensuring that the sample was representative of the stakeholder groups. If a participant declined the invitation, a similarly matched participant was invited, principally from the Delphi panel or otherwise from the team's wider network.

Stage 1: Gathering information
In addition to the outcomes extracted from the systematic review [1], outcomes of importance to each stakeholder group were identified through qualitative surveys. For the main body of the questionnaire, we used open questions that were developed to elicit potential additional outcomes. The questions were loosely modelled on questions developed for a large-scale outcome generation study for a depression core outcome set, which were developed with service users and healthcare professionals [27]. The question format was mirrored, but adapted for a mental health discharge theme. The views of a PPI (patient and public involvement) group were sought to confirm the appropriateness of the questions and instructions (n=5).
After reading a participant information sheet and giving informed consent (by ticking a box), participants selected their stakeholder group(s) and watched a video that describes core outcome sets to non-experts. All participants were then presented with four open-ended questions relating to safe and effective discharge (see additional file 1). Participants were later presented with 3-5 questions specifically developed for their stakeholder group; additional file 1 outlines all of the questions. If a participant was a member of more than one group, they answered the questions relevant to each of their groups. Participants also answered a number of demographic questions: years of experience, country of residence, area of UK (if applicable), gender, age and email address for follow-up. The round was open for 6 weeks beginning 10 December 2018. Qualitative data were coded to identify outcomes and thematically synthesised [28]. This involved line-by-line coding of text and development of descriptive themes; the final stage involved generating analytical themes, which were converted into potential outcomes where applicable. Outcomes were identified both indirectly, by extrapolating from service users' experiences (e.g. What would make discharge from an acute mental health ward safe in your opinion?), and directly, by asking specifically about outcomes (e.g. Can you think of any important outcomes to measure in research assessing discharge interventions?).
Outcomes from the systematic review [1] and qualitative surveys were combined to generate a long list of outcomes. This list, along with relevant quotes from the qualitative data, was discussed by the core research team in a structured meeting. Each outcome was considered in turn and each member had the opportunity to present arguments for or against inclusion. For each outcome, the group decided whether it should be a stand-alone outcome, combined with other codes of a similar thematic nature, or removed from the process due to being of limited importance for a core outcome set. For example, we agreed to merge closely related items (e.g. family relations and quality of interpersonal relationships) and to exclude outcomes considered to be of limited importance (e.g. specific to a specialised area of care: Autistic life; or to an intervention: Antipsychotic Polytherapy). Unless there was a unanimous decision to merge or remove an outcome, it remained as a stand-alone outcome. The group decisions about each outcome are documented in additional file 2.

Stage 2: Delphi survey
The Delphi technique is a research method aimed at generating consensus. It solicits opinions from stakeholder groups in an iterative process of answering questions. After each round the responses are summarised and redistributed for discussion in the next round. We chose to have two rounds of Delphi in this study. The final outcome list that was decided upon after the group discussion in stage 1 was used to develop the first Delphi questionnaire. Any outcomes without consensus after the first round were re-presented in round 2. The outcome list and instructions for the questionnaires were reviewed for face validity, understanding and acceptability by a PPI group (n=5) and modified according to feedback.  [29]. In each round, participants were asked whether the items should become part of a core outcome set. A 7-point Likert scale was used, described as: Strongly Agree (1), Agree (2), Slightly Agree (3), Neither Agree nor Disagree (4), Slightly Disagree (5), Disagree (6), Strongly Disagree (7). There is no definitive research indicating the optimal number of points to have on a Likert scale, but scales of between 5 and 9 points have been suggested as having the best reliability, so we chose a 7-point scale [30]. There was a free-text comments box and participants were encouraged to provide comments that would be fed back anonymously to the group. Participants could suggest additional outcomes at the end of round 1, which were reviewed by the core research team. Any outcome not already represented was added to round 2. In round 2, median group scores for each outcome and anonymous comments for and against from the previous round were presented, and participants were asked to reflect on the information presented and score each outcome again. The percentage of participant agreement with each outcome on a scale of 1-7 was calculated from the scores obtained during round 1 and again in round 2.
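The round 2 feedback step can be sketched as follows. This is a minimal illustration only: the two outcome names are taken from the final core outcome set, but the scores are invented and the function name is hypothetical. It simply computes the median group score per outcome that would be shown back to participants.

```python
from statistics import median

# Invented round 1 scores on the 7-point scale, keyed by outcome.
round1_scores = {
    "Readmission": [7, 6, 7, 5, 6],
    "Quality of life": [6, 3, 4, 7, 5],
}


def group_medians(scores):
    """Return the median group score for each outcome."""
    return {outcome: median(values) for outcome, values in scores.items()}


feedback = group_medians(round1_scores)
print(feedback)
```

The median is used here because it is the summary statistic named in the text; it is also less sensitive than the mean to a small number of extreme scores.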
Literature suggests that consensus levels should be set a priori at a minimum of 70 percent [25,31]. We unanimously chose a 75% consensus level, slightly higher than the minimum to increase sensitivity, but still allowing for a varied pool of applicable outcomes given the tension in the literature around disagreement between service-users', health professionals' and policy-makers' opinions of mental health outcomes [23]. Consensus criteria were defined a priori: outcomes scored as Agree or Strongly Agree (6-7) by 75% or more of the group reached consensus for inclusion and were included in the provisional core outcome set. Outcomes scored as Disagree or Strongly Disagree (1-2) by 75% or more were defined as having reached consensus for exclusion and were excluded. Outcomes not fulfilling criteria for consensus inclusion or exclusion were defined as not having reached consensus and were re-presented in round 2.
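The a priori criteria can be expressed as a short classification rule. The sketch below is illustrative rather than the analysis code used in the study: the function name and example scores are hypothetical, and it assumes scores are coded so that 6-7 denote Agree/Strongly Agree and 1-2 denote Disagree/Strongly Disagree, as in the criteria stated above.

```python
def classify_a_priori(scores, threshold=0.75):
    """Classify one outcome from its list of 1-7 scores.

    Assumes 6-7 = Agree/Strongly Agree and 1-2 = Disagree/Strongly Disagree.
    """
    n = len(scores)
    agree = sum(1 for s in scores if s >= 6) / n
    disagree = sum(1 for s in scores if s <= 2) / n
    if agree >= threshold:
        return "include"
    if disagree >= threshold:
        return "exclude"
    return "no consensus"  # re-presented in round 2


# Invented examples with eight raters each.
print(classify_a_priori([7, 7, 6, 6, 3, 6, 7, 6]))  # 7 of 8 agree
print(classify_a_priori([1, 2, 1, 2, 1, 2, 2, 5]))  # 7 of 8 disagree
print(classify_a_priori([7, 6, 4, 3, 5, 6, 2, 1]))  # neither threshold met
```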
As no outcomes met the original criteria for consensus for exclusion after round 1, the research team agreed to redefine consensus for exclusion as 50% or fewer of participants scoring the item as Strongly Agree or Agree (6-7). Revising exclusion criteria after round 1 has been used effectively in past core outcome set research [30].
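Under the revised rule, both decisions depend only on the share of participants scoring an item 6-7. A minimal sketch, with a hypothetical function name and invented scores, and again assuming that 6-7 denote Agree/Strongly Agree:

```python
def classify_revised(scores, include_at=0.75, exclude_at=0.50):
    """Apply the revised criteria: exclusion when 50% or fewer score 6-7."""
    agree = sum(1 for s in scores if s >= 6) / len(scores)
    if agree >= include_at:
        return "include"
    if agree <= exclude_at:
        return "exclude"
    return "no consensus"


# Invented examples with eight raters each.
print(classify_revised([7, 6, 6, 7, 6, 6, 3, 4]))  # 6 of 8 agree (75%)
print(classify_revised([7, 6, 6, 7, 6, 2, 3, 4]))  # 5 of 8 agree (62.5%)
print(classify_revised([7, 6, 6, 7, 1, 2, 3, 4]))  # 4 of 8 agree (50%)
```

One consequence of this design is that only items with agreement strictly between 50% and 75% remain without consensus.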

Stage 3: Consensus meeting
The results of the Delphi survey were presented at a consensus meeting. The main goal of the consensus meeting was to decide which items would be included in the final core outcome set. The meeting was chaired by an independent researcher with expertise in consensus methodology who was not a member of the core research team. Participants were sampled to achieve a balanced representation of service users, health-care professionals, researchers and end-users of research. We aimed to have a small representative group of between 9 and 12 to enable meaningful small group discussions, similar to consensus meetings chaired by the facilitator in other fields [32,33]. International participation was restricted because of budgetary constraints.
The consensus meeting comprised (a) a short overview of the study and (b) a summary of the Delphi results sorted by stakeholder group, beginning with the outcomes that met consensus [34]. Outcomes identified in rounds 1 and 2 of the Delphi as having reached consensus for inclusion were presented first. Participants were asked if there were any fundamental reasons why these should not be included in the core outcome set. Divergent views were actively sought and the chair ensured everyone had the opportunity to participate in discussions before voting commenced. Outcomes from the preliminary core outcome set were discussed in terms of feasibility and voted upon. Voting was conducted anonymously using cards in an envelope with binary response options (include/exclude). Voting and consensus criteria followed the same format as in the Delphi (75% for inclusion). Results were presented after the voting on all outcomes had finished. Outcomes deemed to have reached consensus for exclusion, or with no consensus in the Delphi, were reviewed and participants were asked if there were any fundamental reasons why these should be included.
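For a small panel, the 75% rule translates into a whole number of 'include' votes. The sketch below works this through for a panel of eleven, the number who attended the meeting; it assumes every attendee voted on every item, which is not stated, and the function names are hypothetical.

```python
import math
from fractions import Fraction

THRESHOLD = Fraction(3, 4)  # 75% for inclusion, as in the Delphi


def votes_needed(n_voters, threshold=THRESHOLD):
    """Smallest number of 'include' votes that reaches the threshold."""
    return math.ceil(n_voters * threshold)


def reaches_inclusion(include_votes, n_voters, threshold=THRESHOLD):
    """True when the share of 'include' votes meets the threshold."""
    return Fraction(include_votes, n_voters) >= threshold


print(votes_needed(11))          # 11 * 0.75 = 8.25, so 9 votes are needed
print(reaches_inclusion(9, 11))  # 9/11 is about 81.8%
print(reaches_inclusion(8, 11))  # 8/11 is about 72.7%
```

Exact fractions are used instead of floating point so that borderline tallies are not misclassified by rounding.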

Patient and Public Involvement
Five patient representatives worked with researchers to develop the online questionnaires. Patients were represented alongside professionals and researchers in the consensus panel.
One member of the research team (and co-author) is an expert by lived experience and was involved in all design and analysis decisions.

Ethics and registration
Our findings are reported in line with the Core Outcome Set-Standards for Reporting (COS-STAR) guidance [34]. The study was prospectively registered with the COMET initiative. The study was approved by the University of Nottingham Business School ethics committee.

Role of the funding source
The funding source was not involved in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.

Stage 1: Information gathering
Our systematic review has been described in detail elsewhere [1]. In summary, 69 outcome categories were identified from 45 studies. Ninety-three participants in total, from 12 countries, completed the information gathering questionnaire. However, as noted above, many identified with more than one stakeholder group, so the stakeholder group numbers are not mutually exclusive: 27 identified as service users, 17 as family/carers, 39 as health-care professionals, 15 as end-users of research and 37 as researchers. Additional file 3 presents participants' demographics. Qualitative questionnaires revealed an additional 45 outcomes that were not identified in the literature (for example, outcomes concerning involvement in discharge planning, see additional file 2). After discussion within the research team, 82 standardised outcome terms were taken forward into the Delphi process; 19 outcomes were combined/collapsed and 13 were removed, see additional file 2.
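The outcome counts reported for this stage can be reconciled with simple arithmetic. The check below is a sketch under one interpretive assumption: that the 19 combined/collapsed outcomes are those absorbed into other terms and therefore no longer counted separately. The variable names are illustrative.

```python
# Counts as reported in the text.
from_review = 69      # outcome categories from the systematic review
from_survey = 45      # additional outcomes from the qualitative questionnaires
taken_forward = 82    # standardised outcome terms entering the Delphi
combined = 19         # outcomes combined/collapsed (assumed absorbed into others)
removed = 13          # outcomes removed

long_list = from_review + from_survey
accounted_for = taken_forward + combined + removed

print(long_list, accounted_for)
```

Under that reading, every outcome on the long list is accounted for by one of the three decisions.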

Stage 2: Delphi process
Sixty-nine participants completed round 1 of the Delphi (22 service users, families and carers, 26 researchers and 21 healthcare professionals and decision makers) and 68 participants completed round 2 (30 researchers, 18 service users and families and 20 healthcare professionals and decision makers). Whilst 5 participants dropped out after round 1, 4 participants joined the panel in round 2 (these individuals had participated in the qualitative questionnaire but not in round 1). Net attrition between rounds 1 and 2 of the Delphi was therefore 1.4%. Seven additional outcomes were proposed by participants during round 1, of which  1). Twenty outcomes met the revised criteria for having reached consensus for exclusion (50% or less of participants agreed/strongly agreed with that outcome). Forty-eight outcomes did not meet consensus criteria for inclusion or exclusion and were re-presented to the group in round 2. Therefore, 50 outcomes were presented in round 2. Only one outcome met the criteria for consensus for inclusion after this round: meaningful activity. No outcomes met criteria for exclusion and 49 did not meet consensus. Additional file 4 shows consensus levels for each outcome in each round.
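The consensus rules and the attrition figure reported above can be expressed as a short calculation. The sketch below is illustrative only and is not the authors' analysis code: the function names are hypothetical, and treating agreement of exactly 75% as meeting the inclusion criterion is an assumption, since the text states only "75% for inclusion" and "50% or less" for exclusion.

```python
def classify_outcome(pct_agree, include_at=75.0, exclude_at=50.0):
    """Label one outcome from the percentage of panellists who agreed/strongly agreed.

    Assumption: agreement of exactly 75% counts as consensus for inclusion.
    """
    if pct_agree >= include_at:
        return "include"
    if pct_agree <= exclude_at:
        return "exclude"
    return "no consensus"


def net_attrition_pct(completed_first, completed_second):
    """Net percentage fall in panel size between two rounds, to one decimal place."""
    return round((completed_first - completed_second) / completed_first * 100, 1)


# Hypothetical agreement levels, for illustration only (not study data).
for pct in (80.0, 62.5, 40.0):
    print(pct, classify_outcome(pct))

# Panel sizes reported in the text: 69 completed round 1, 68 completed round 2.
print(net_attrition_pct(69, 68))
```

With the reported panel sizes the net figure is 1.4%. Because five round 1 participants dropped out and four others joined, the proportion of round 1 completers who were lost is higher (5 of 69, about 7.2%).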

Stage 3: Consensus Meeting
Eleven participants attended the consensus meeting. As in previous rounds, the categories were not mutually exclusive: six participants were researchers, three identified as service users, three as healthcare professionals and three as end-users of research (see table 3). Table 4 shows the quantitative results of the meeting.
The preliminary 15-item core outcome set was considered item by item. Discussions indicated that many of the outcomes were elements of an ideal discharge, or process outcomes/variables, but probably not measurable outcomes that should be included in a core outcome set. After these discussions and independent, anonymous voting, five items no longer met consensus criteria for inclusion: 'Service user involvement in discharge planning' and the associated items 'Service user understanding of discharge plan', 'Service user involvement in decision making', 'Service user satisfaction with information provision at discharge' and 'Service user knowledge of how to access community support'. Participants agreed that these are very important elements of a successful discharge, but not core outcomes, owing to issues surrounding validity and meaning. In the discussion of this paper, we present these five outcomes as a potential self-reported measure of service user satisfaction and involvement, which could be used in addition to the core outcome set.
'Mental health and illness' was initially close to consensus, with 73% voting to include. However, those who voted to exclude found it too vague and articulated that they were most interested in measuring acute psychological distress rather than mental health and illness. The service user representatives in the group interpreted "recovery" to mean a complete amelioration of symptoms and, even when in "recovery", individuals described continuing to experience distress and difficulties with their mental health. We therefore chose to separate the broader mental health and illness outcome into self-reported psychological distress and clinician-reported mental health. The more granular outcome of self-reported psychological distress reached 100% consensus to include. In contrast, clinician-reported mental health did not meet consensus criteria (45%). Similar discussions took place around the recurrence (relapse) outcome, whereby its inclusion in a core outcome set would ultimately necessitate buy-in to a criteria model, which suggested that mental health problems could and should be completely resolved.
Discussions around the 'suicide attempted' outcome indicated that participants felt that suicide attempts or self-harm had diverse motivations and definitions. They discussed the difficulty of delineating the boundaries between self-harm and suicide attempts and how this is documented. After the consensus meeting, this outcome no longer met consensus criteria for inclusion. Discussions surrounding personal recovery, functioning and meaningful activity indicated that participants considered these outcomes too vague and subjective to be a component of a core outcome set. There was consensus to exclude meaningful activity and recovery, and no consensus to include personal recovery. There was also consensus to exclude discharge to appropriate accommodation; discussion indicated that this was primarily because it spans the health and social care boundary and may not be applicable to every intervention.
On completion of the meeting, only four outcomes met consensus criteria for inclusion (see Table 2). Participants agreed on a four-item core outcome set: readmission, quality of life, suicide completed and service user reported psychological distress. Readmission was the most frequently used outcome in past research and, despite its limitations, participants felt it was one of the only proxy measures of appropriate discharge. Quality of life and psychological distress were considered important ways of quantitatively assessing the psycho-social elements of discharge, which are of primary importance. Suicide completed was considered rare but imperative to capture, given the growing body of literature highlighting the relationship between acute mental health discharge and suicide [5,35]. Figure 1 shows the process undertaken to reach the core outcome set.

Discussion
This study provides the first international consensus on outcomes for intervention studies concerning discharge from an acute adult mental health inpatient setting. We could not identify any other published core outcome sets for interventions concerning discharge from acute mental health services. Moreover, there are very few core outcome sets for mental health, despite recommendations for consensus in the literature [20,23]. All the included outcomes were agreed upon, using consensus methods, by more than 75% of a group of relatively equally represented service users and family/carers, healthcare professionals, researchers and end-users of research. We recommend that all future research studies evaluating interventions for discharge from acute adult mental health settings use this core outcome set as a framework for outcome selection, to complement rather than replace any other outcomes that are relevant to their research question. As discharge from acute services is a particularly challenging period for those experiencing mental health problems [3,35], it is important to understand which interventions work and, more specifically, which elements of an intervention improve which outcomes. This core outcome set provides a framework for between-study comparison, ultimately enabling researchers to articulate the theory of change that underpins interventions.
In our systematic review [1], we identified 22 studies that reported readmission rates as an outcome, yet almost all of them captured this in different ways: some used self-report data, some used clinical case notes or retrospective administrative data, and others used case managers' reports. In addition, the time markers were variable: some studies used country-specific time markers in line with policy, such as 28 days in the UK [36], whilst others chose a series of time markers, such as within 1 month, 3 months and 6 months, but the time markers were rarely directly comparable. Similarly, six studies measured quality of life, but only two used the same measurement instrument (Lehman's Quality of Life) [16,37]. In the current study, we have developed consensus that quality of life and readmission are important and feasible to measure, and we also make recommendations about how to measure them to reduce heterogeneity of outcome reporting.
There were some unexpected exclusions from the core outcome set. For example, mental health symptoms and treatment adherence were quite frequently used in past research [1] but were not included in the core outcome set. At the beginning of the paper we described the recent Kings Fund report that suggested generating agreement amongst healthcare professionals, service users, policy makers and researchers is a difficult but imperative task [23]. We also found this, and feel that the small four-item core outcome set represents the only outcomes on which consensus was reached, despite so many outcomes being of utmost importance to service users and families. This research has further highlighted the importance of shared decision making and service user and family involvement to all stakeholder groups [38]. This consensus study indicates a desire from all groups to monitor levels of service user satisfaction and involvement in the process. Whilst such outcomes were excluded in later stages, primarily on the basis of being process variables/outcomes, this does not reduce their prospective importance in discharge interventions or provision of care at discharge. The five most agreed upon elements of service user involvement and satisfaction in discharge were: Service user involvement in discharge planning; Service user understanding of discharge plan; Service user involvement in decision making; Service user satisfaction with information provision at discharge; and Service user knowledge of how to access community support. We recommend that future policy makers and healthcare management consider incorporating all of these elements into local level initiatives as overriding principles of care rather than interventions/measures, wherever they are missing from current provision. Furthermore, work highlighting the importance of involving service users in mental health care planning is beginning to emerge, along with measures of such activity.
Therefore, we suggest that future research should, wherever possible, include a service user reported outcome measure of involvement alongside the four-item core outcome set and any other chosen measures. This could be measured using an existing instrument of service user involvement in care planning in mental health, such as the EQUIP PROM (Patient Reported Outcome Measure) [38]. Alternatively, researchers could use the five outcomes presented above to develop a self-reported Likert measure of service user involvement specifically in discharge planning. These items were developed from a synthesis of academic literature and qualitative questionnaires and met criteria for consensus amongst experts, so from a psychometric perspective they would arguably meet initial face and content validity criteria [39] (see additional file 5).
The difficulties of developing a mental health core outcome set were further epitomised when applied to care transitions: a service-level (rather than specific clinical population), multi-agency, multi-stage, complex period of the care pathway [3,24]. It was imperative to generate a set of meaningful, applicable outcomes that span primary and secondary care, across multiple physical locations, and that are relevant for every service user. For example, a great deal of past literature focuses on housing interventions [40][41][42], and whilst housing is a significant safety issue at discharge, it is not necessarily relevant to all service users. This multi-agency, multi-morbidity complexity was arguably one factor that resulted in the small set of generic outcomes, which differs from the narrowly defined clinical core outcome sets with many more outcomes reported in past literature [43,44].
This study had several strengths. Our method is based on recommendations from an international panel of experts [25]. Inclusion of service users and healthcare professionals at every stage ensured that outcomes in the final core set embody shared priorities. The comprehensive and laborious long-list process ensured that all potential outcomes were considered in the course of the consensus process. However, there were some limitations to our study. The research was conducted only in English, owing to budgetary constraints, although our online rounds included participants from 12 countries. Furthermore, in many consensus meetings additional outcomes are added; the method infrequently serves as a means of reducing the number of outcomes included in the preliminary core outcome set from the Delphi [30]. In our case, however, we found that the group did not agree with many of the outcomes, and the set was reduced to a very small COS of four items. This is beneficial in some ways, as we hope it is easier for researchers to operationalise a four-item core outcome set. However, to ensure that the spectrum of perceptions of safety held by each group is addressed, we recommend using the COS alongside an additional service user reported outcome measure (PROM) that encompasses the other outcomes that had high consensus in the Delphi.
The use of outcomes in mental health research and services is becoming more contested in terms of what is meaningful and effective. It could be argued that core outcome sets are less applicable to mental health populations than to general health populations, given the complexity of mental health problems and the subjectivity of measuring them. However, as core outcome sets are relatively uncommon in mental health, we believe that (as in other clinical populations) a small, agreed, feasible set of core outcomes will facilitate between-study comparability and advancement in evidence collection [21,25].

Future Directions
Development of this core outcome set involved participation of stakeholders from 12 different countries (primarily researchers); however, we recommend that further work should be undertaken to validate this core outcome set more widely, particularly in non-English speaking populations. Two of the final four outcomes, and many of the preliminary 15 outcomes to emerge from the Delphi, are not necessarily specific to mental health care transitions. Some outcomes are comparable to those in a similar core outcome set for care transitions of adolescents and young adults with special healthcare needs [45]. Future research may consider a 'transitions of care' core outcome set, to reduce the number of similar core outcome sets.
Another key priority for operationalising this core outcome set is to agree upon measurement criteria using the COSMIN guidelines [46]. We conducted some preliminary questionnaires with the Delphi panel to produce preliminary measurement recommendations; however, there was very little agreement amongst panellists (see additional file 6). The measures recommended by the panel were the Kessler Psychological Distress Scale (K10) and Recovering Quality of Life (ReQoL), both within one month of discharge [47,48]. Readmission and suicide completed rates were recommended to be captured within 28 days of discharge using retrospective review of administrative data. However, these are only preliminary recommendations and we strongly recommend a future study following COSMIN guidelines.

Conclusion
The four outcomes included in our outcome set represent the consensus opinion of a group of service users, healthcare professionals and international researchers, and address an unmet need: assisting researchers in the design, implementation and reporting of interventions that aim to improve discharge from acute mental health settings. Ultimately, application of this core outcome set will enhance the relevance of future interventions to healthcare professionals, the research community and service users. If used, the core outcome set could support more evidence-based interventions, underpinned by a theory of change outlining the relationships between the components of the intervention and the outcomes they should improve [1,49], which should increase service user safety at this distressing time.

Patients' consent and permission to publish
All participants gave informed consent to be involved in the study ahead of data collection. NT conceived the design of the study; conducted the literature search, meta-synthesis and Delphi; analysed the data; and drafted the majority of the manuscript. JW and NW contributed equally and provided oversight of the study design, analysed and synthesised the data and contributed significantly to the drafting of the manuscript. AG analysed and synthesised data, provided an expert by lived experience opinion on decisions made in regards to wording, study design, PPI involvement, and also contributed adaptations to the manuscript.

Conflicts of Interest
The authors report no conflicts of interest.

Ethics Committee Approval
This research received a favourable ethics opinion from Nottingham University Business School Ethics Committee.

Role of Funding Source
This research was funded by the NIHR Greater Manchester Patient Safety Translational Research Centre, which had no role in the writing of the manuscript or the decision to submit.

Data Statement
Data are available upon reasonable request.

Acknowledgements
We would like to thank the patient and public involvement group for their help in designing the questionnaires.

Methods
After the core outcome set was agreed in the consensus meeting, we invited all participants from the earlier stages of the project to recommend measures and time markers in a final online questionnaire. Participants were eligible if they had been involved in any of the previous online rounds. The invitation made it clear that the questionnaire was most relevant to researchers, because of the specific knowledge of instruments required to complete this round, but that other groups with an opinion or interest were welcome to contribute.
In this questionnaire, participants were presented with the four core outcomes. For each core outcome, they were presented with any measures used to assess that outcome in our systematic review studies [1] and any additional measures that had been recommended to the team during the process. Participants were asked to choose the single most appropriate measure ('don't know', 'other', 'new instrument' and 'no instrument' were also options). A second question asked which time markers they would recommend, with the option to select all that applied. These options were also based on time markers used in the systematic review [1].

Results
Forty-three of the 93 invited participants responded (15 service users, 8 family members/carers, 23 researchers, 10 healthcare professionals and 3 end-users of research), although, as in previous rounds, these were not distinct categories. Fifty-three percent of the respondents were researchers. This was expected, as the email invitation suggested that this stage might be more meaningful or of interest to this group, but as a team we chose not to exclude other groups with opinions on measurement instruments. Twenty-three percent of participants were international. Table 4 shows the preliminary minimum measure recommendations and time markers (for example, ReQoL-10 at one month post-discharge for Quality of Life); additional file 5 shows the results upon which the recommendations were based.

Readmission
The minimum recommendation is to use retrospective review of administrative data for readmissions within a defined time period; the most agreed period was 28 days. Participants indicated that routine data collection might cover slightly different time periods. Twenty-six of the 43 participants recommended a time marker of around a month (1 month, 30 days or 28 days), with 28 days being the most popular. However, participants also advised that this should be supplemented, where possible, by cross-checking with service users, case managers or carers to improve the quality of the data. Those looking for more comprehensive data may also like to record readmission at 7 days, 3 months and 6 months, as these were also popular recommendations.
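As an illustration of how a 28-day readmission outcome might be derived from administrative records, the following minimal sketch flags index discharges that are followed by a readmission within a configurable window. It is not the authors' method: the function names and the record layout are hypothetical, the example dates are invented, and counting day 28 while excluding same-day admissions is an assumption that a real study would need to define explicitly.

```python
from datetime import date


def readmitted_within(discharge, later_admissions, window_days=28):
    """True if any later admission falls within window_days after discharge.

    Assumption: day 0 (same-day admission) is excluded and the final day is included.
    """
    return any(0 < (adm - discharge).days <= window_days for adm in later_admissions)


def readmission_rate(records, window_days=28):
    """Proportion of index discharges followed by a readmission in the window."""
    flagged = sum(readmitted_within(d, adms, window_days) for d, adms in records)
    return flagged / len(records)


# Hypothetical records: (index discharge date, dates of subsequent admissions).
records = [
    (date(2019, 1, 1), [date(2019, 1, 29)]),  # day 28: counted
    (date(2019, 1, 1), [date(2019, 1, 30)]),  # day 29: not counted
    (date(2019, 2, 1), []),                   # no readmission
    (date(2019, 3, 1), [date(2019, 3, 8)]),   # day 7: counted
]
print(readmission_rate(records))
```

Passing a different `window_days` (for example 7 or 90) would give the additional time markers that participants also recommended, using the same records.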

Quality of Life
The participants recommended that researchers use the Recovering Quality of Life (ReQoL-10) at one month post-discharge [35]. This was the instrument most recommended by the group, although a large proportion of participants also voted for the ReQoL-20. As this is a quality of life measure specific to mental health recovery, participants felt it was the most appropriate. The one-month time marker is in keeping with the other COS time frames, making for a more comprehensive and accessible core outcome set. Those using within-participant measures of quality of life may also like to measure a pre-discharge baseline. Researchers looking for a more thorough assessment of quality of life may also like to measure at 7 days and 3 months post-discharge, as these were also highly recommended time markers, or to use the ReQoL-20 and report both scores for comparability.

Suicide Completed
The participants recommended retrospective review of administrative data for suicide completed within 28 days of discharge. Retrospective review is in line with the other outcomes and was, marginally, the most frequently suggested method. We chose 28 days for consistency with readmission data. Researchers looking for more comprehensive data may want to use 7 days and 3 months, as these were also highly recommended. They may also want to cross-check this information against other sources (carers/case managers) to ensure that it is correct and reported, particularly as participants mentioned the impact of incorrect coroner's reports on such data.

Psychological Distress
The participants recommended the Kessler Psychological Distress Scale (K10) at one month post-discharge [36]. For consistency with the other outcomes, we recommend measurement at one month. Seven days and 3 months were also highly recommended, so we would recommend these additional time markers for more robust research. However, there were very few votes for instruments for psychological distress, and qualitative comments revealed that participants felt that this outcome is not measurable. The same number of people who voted for the K10 also voted for interviews or other measures, a similar number recommended the development of a new measure, and the CORE-10 was similarly close [37]. Whilst we make this recommendation, we also suggest that future researchers may look to develop a measure specific to psychological distress in this core outcome set. Interviews would not effectively facilitate between-study comparison, the key purpose of a COS. Conclusion: A minimum recommendation of using retrospective review of administrative data. This will allow various studies with diverse time and financial limits to use the COS. However, we also advise that this should be supplemented, where possible, by cross-checking with service users, case managers or carers to improve the quality of the data.

Objective
To develop a core set of outcomes to be used in all future studies into discharge from acute mental health services to increase homogeneity of outcome reporting.

Design
We used a cross-sectional online survey with qualitative responses to derive a comprehensive list of outcomes, followed by two online Delphi rounds and a face-to-face consensus meeting.

Setting
The setting the core outcome set applies to is acute adult mental health.

Participants
Participants were recruited from five stakeholder groups: service-users, families and carers, researchers, healthcare professionals and policy makers.

Interventions
The core outcome set is intended for all interventions that aim to improve discharge from acute mental health services to the community.

Results
Ninety-three participants in total completed the questionnaire, 69 in Delphi round 1, and 68 in round 2, with relatively even representation of groups. Eleven participants attended the consensus meeting. Service-users, healthcare professionals, researchers, carers/families and end-users of research agreed on a four-item core outcome set: Readmission, Suicide completed, Service-user reported psychological distress and Quality of life.

Conclusion
Implementation of the core outcome set in future trials research will provide a framework to achieve standardisation, facilitate selection of outcome measures, allow between-study comparisons, and ultimately enhance the relevance of trial or research findings to healthcare professionals, researchers, policy makers and service users.

Strengths and Limitations
- This is the first initiative to reduce heterogeneity in outcome reporting for interventions that improve discharge from acute mental health services.
- A high level of consensus amongst 69 service users, families/carers, healthcare professionals, researchers and policy makers was achieved.
- COS-STAR reporting guidelines were followed.
- Although the stakeholder group included international researchers, service users and healthcare professionals were recruited only from the UK.
- Not all of the participants who contributed online attended the face-to-face meeting, whereby the core outcome set was reduced considerably.

Funding
This work was funded by the National Institute for Health Research (NIHR) Greater Manchester Patient Safety Translational Research Centre (NIHR Greater Manchester PSTRC). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

Competing Interests
The authors report no competing interests.

Key Words
Acute Adult Mental Health Services; Core Outcome Set; Mental Health; Discharge; Care Transitions

Background
Care transitions (when patient care is transferred from one team, department or organisation to another) are widely recognised as a vulnerable and high-risk stage in the care pathway [1][2][3]. Safety issues may be intensified in acute mental health services, where care transitions are described as chaotic [3]. For example, suicide risk increases post-discharge from acute mental health services [4,5]. A growing body of research describes these risks either directly, in terms of identified 'safety' events, or indirectly, in terms of broader 'problems', including, for example, treatment non-adherence, inappropriate readmissions and increased risk of self-injury or suicide attempts [3][6][7][8].
Internationally, researchers have attempted to find solutions to the problems or threats to safety associated with discharge from acute mental health services by developing interventions that aim to improve different aspects of discharge planning, transitions, continuity of care and follow-up care [9]. Some interventions aim to improve discharge by introducing new roles, for example a discharge co-ordinator [10]. Others focus on increasing contact between clinical staff and service users, for example using letters or telephone follow-up [11][12][13]. Many interventions that were 'successful' in reducing readmission bridged the boundaries between ward and community, either by providing types of ward-based care in the community [14,15] or by having community teams lead discharge planning on the wards [16]. There has been little attempt to compare these diverse interventions. Existing reviews have either included a narrow range of studies addressing a single outcome or focused on a specific time frame in an attempt to synthesise results [8,17]. Comparison and meta-synthesis of the effectiveness of interventions have had limited success. Across the papers included in our systematic review and those by other researchers [1,18], variation in the outcomes reported is substantial. This limits between-study comparability and delays advancement in evidence collection. Furthermore, outcomes in these trials were not necessarily representative of the measures that service users would consider important at discharge. Both matters can potentially be addressed with the development of a 'core outcome set', defined as "an agreed, standardised collection of outcomes which should be measured and reported, as a minimum, in all trials for a specific clinical area" [19].
The development and use of 'core outcome sets' has been endorsed as a means to reduce outcome heterogeneity in research and to increase the relevance of research through the involvement of key stakeholders in its development [20]. There is an emerging body of literature highlighting the difficulties of defining and assessing outcomes in a mental health population [21]. There is also evidence of a lack of agreement amongst key groups about what should be measured and in what capacity, and an evident tension between the population health perspective and the provision of individualised care [17,21]. One previous review, cited above, identified the need for consensus on outcome definitions in discharge planning interventions [17]. Similarly, a recent Kings Fund report suggested that broader consensus on the outcomes that matter is imperative for advancement [21]. Generating agreement amongst healthcare professionals, service users, policy makers and researchers is therefore a difficult but imperative task, needed to usefully direct healthcare services [21]. The difficulties are further exemplified when applied to care transitions, a multi-agency, multi-stage, complex period of the care pathway [3,22]. This paper outlines the development of a core outcome set for research into interventions to improve discharge from acute mental health wards to the community.
The objective of this study was to obtain international consensus on a set of core outcome measures to be reported in all interventions intended to improve discharge from mental health inpatient services.

Study overview
The scope of the core outcome set was defined according to the criteria recommended by Core Outcome Measures in Effectiveness Trials (COMET) [23]. The study was prospectively registered with the COMET initiative (1276). The health condition was functional conditions (mental disorders other than dementia, including severe mental illness such as schizophrenia). The population was adults aged 18-65, and the intervention was any intervention that aimed to improve discharge from an acute mental health setting to the community. The core outcome set was developed in three stages, with service users and healthcare professionals included at each stage: (1) a long list of outcomes was generated through a systematic review [1] and qualitative survey; (2) the resulting long outcome list was used to populate an online Delphi process (2 rounds); and (3) the results of the Delphi survey were appraised at a consensus meeting and the final core outcome set was established. The process included a series of core research team meetings at every stage. The team comprised a researcher and core outcome set developer, an associate professor in mental health and mental health nurse, a researcher and expert by lived experience of acute services, and an expert in patient safety. Participants did not fit into distinct homogeneous groups; for example, mental health professionals were sometimes also past service users or family members of service users. Similarly, researchers had personal experience of inpatient mental health services. Therefore, wherever possible we considered the group as a whole and tried not to compare categories.

Participants
Participants were recruited in a number of ways in December 2018 to January 2019. Academic researchers were recruited if their research had been included in our systematic review or if they were known researchers in the field identified by the team. End-users of research (policy makers, NGOs, NHS management, commissioners, advocates etc.) were recruited via searching for publicly available contact details or using our team's professional networks or social media. Service users and healthcare professionals were recruited through social media. Twitter was nominated as the primary platform for recruitment due to its ability to reach into the specific communities of interest we required: mental health professionals, service users and families/carers. Using social media has been reported as a cost-effective and efficient way to recruit those from potentially stigmatised groups [24]. Further, the peer network structures of social media platforms enable users to recruit other users through sharing links within their networks.
The same participant group was used throughout the iterative research process. To reduce attrition, those who dropped out in early rounds were invited to re-join the panel in subsequent rounds. Participants were recruited for the consensus meeting during the Delphi: UK participants were asked to indicate whether they would be interested in a face-to-face meeting. We invited a random sample of interested participants to attend, ensuring representation of the stakeholder groups. If a participant declined the invitation, a similarly matched participant was invited, principally from the Delphi panel or otherwise from the team's wider network.

Stage 1: Gathering information
In addition to the outcomes extracted from the systematic review [1], outcomes of importance to each stakeholder group were identified through qualitative surveys. For the main body of the questionnaire, we used open questions that were developed to elicit potential additional outcomes. The questions were loosely modelled on questions developed for a large-scale outcome generation study for a depression core outcome set, which were developed with service users and healthcare professionals [25]. The question format was mirrored, but adapted for a mental health discharge theme. The views of a patient and public involvement (PPI) group (n=5) were sought to confirm the appropriateness of the questions and instructions.
After reading a participant information sheet and giving informed consent (by ticking a box), participants selected their stakeholder group(s) and watched a video that describes core outcome sets to non-experts. All participants were then presented with four open-ended questions relating to safe and effective discharge (see additional file 1). Participants were later presented with 3-5 questions specifically developed for their stakeholder group; additional file 1 outlines all of the questions. If a participant were a member of more than one [...] Qualitative data were coded to identify outcomes and thematically synthesised [26]. This involved line-by-line coding of text and development of descriptive themes; the final stage involved generating analytical themes, which were converted into potential outcomes where applicable. Outcomes were identified both indirectly, by extrapolating from service users' experiences (e.g. What would make discharge from an acute mental health ward safe in your opinion?), and directly, by asking specifically about outcomes (e.g. Can you think of any important outcomes to measure in research assessing discharge interventions?).
Outcomes from the systematic review [1] and qualitative surveys were combined to generate a long list of outcomes. This list, along with relevant quotes from the qualitative data, was discussed by the core research team in a structured meeting. Each outcome was considered in turn and each member had the opportunity to present arguments for or against inclusion. For each outcome, the group decided whether it should be a stand-alone outcome, combined with other codes of a similar thematic nature, or removed from the process due to being of limited importance for a core outcome set. For example, we agreed to merge closely related items (e.g. family relations and quality of interpersonal relationships) and to exclude outcomes considered to be of limited importance (e.g. those specific to a specialised area of care, such as Autistic life, or to a particular intervention, such as Antipsychotic Polytherapy). Unless there was a unanimous decision to merge or remove an outcome, it remained as a stand-alone outcome. The group decisions about each outcome are documented in additional file 2.

Stage 2: Delphi survey
The Delphi technique is a research method aimed at generating consensus. It solicits opinions from stakeholder groups in an iterative process of answering questions. After each round the responses are summarised and redistributed for discussion in the next round. We chose to have two rounds of Delphi in this study. The final outcome list that was decided upon after the group discussion in stage 1 was used to develop the first Delphi questionnaire. Any outcomes without consensus after the first round were re-presented in round 2. The outcome list and instructions for the questionnaires were reviewed for face validity, understanding, and acceptability by a PPI group (n=5) and modified according to feedback. We ran the Delphi survey manually using Qualtrics, a secure online hosting platform [27]. In each round, participants were asked whether the items should become part of a core outcome set. A 7-point Likert scale was used, described as: Strongly Agree (7), Agree (6), Slightly Agree (5), Neither Agree nor Disagree (4), Slightly Disagree (3), Disagree (2), Strongly Disagree (1). There is no definitive research indicating the optimal number of points to have on a Likert scale, but scales of between 5 and 9 points have been suggested as having the best reliability, so we chose a 7-point scale [28]. There was a free-text comments box and participants were encouraged to provide comments that would be fed back anonymously to the group. Participants could suggest additional outcomes at the end of round 1, which were reviewed by the core research team. Any outcome not already represented was added to round 2.
In round 2 median group scores for each outcome and anonymous comments for and against from the previous round were presented and participants were asked to reflect on the information presented and score each outcome again. The percentage of participant agreement with each outcome on a scale of 1-7 was calculated from the scores obtained during round 1 and again in round 2.
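The round 2 feedback described above can be illustrated with a short sketch. It is not the authors' analysis code: the function name `summarise_outcome` and the example ratings are hypothetical, and the only details taken from the text are the 7-point scale, the median group score and the percentage of participants at each scale point.

```python
from statistics import median

LIKERT_POINTS = range(1, 8)  # 1 = Strongly Disagree ... 7 = Strongly Agree


def summarise_outcome(scores):
    """Return the group median and the percentage of panellists at each scale point."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s not in LIKERT_POINTS for s in scores):
        raise ValueError("scores must be integers from 1 to 7")
    n = len(scores)
    percent_by_score = {p: 100 * scores.count(p) / n for p in LIKERT_POINTS}
    return {"n": n, "median": median(scores), "percent_by_score": percent_by_score}


# Hypothetical ratings for one outcome from eight panellists (not study data).
example = summarise_outcome([7, 6, 6, 5, 7, 4, 6, 2])
print(example["median"])               # 6.0
print(example["percent_by_score"][6])  # 37.5
```

A summary of this kind, together with the anonymised comments, is what each panellist would see before re-scoring an outcome.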
Literature suggests that consensus levels should be set a priori at a minimum of 70 percent [23,29]. We unanimously chose a 75% consensus level, slightly higher than the minimum to increase sensitivity, but still allowing for a varied pool of applicable outcomes given the tension in the literature around disagreement between service-user, health-professional and policy-maker opinions of mental health outcomes [21]. Consensus criteria were defined a priori: outcomes scored as Agree or Strongly Agree (6-7) by 75% or more of the group reached consensus for inclusion and were included in the provisional core outcome set. Outcomes scored as Disagree or Strongly Disagree (1-2) by 75% or more were defined as having reached consensus for exclusion and were excluded. Outcomes not fulfilling criteria for consensus inclusion or exclusion were defined as not having reached consensus and were re-presented in round 2.
As no outcomes met the original criteria for having reached consensus for exclusion after round 1, the research team agreed to redefine consensus for exclusion as 50% or less of participants scoring the item as Strongly Agree or Agree (6-7). Relaxing exclusion criteria after round 1 has been used effectively in past core outcome set research [30].
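The consensus rules above can be written as a small decision function. This is a minimal sketch under stated assumptions rather than the study's own procedure: `classify_outcome` and its `revised_exclusion` flag are hypothetical names, the example ratings are invented, and the thresholds (75% scoring 6-7 for inclusion, 75% scoring 1-2 for the original exclusion rule, 50% or less scoring 6-7 for the revised exclusion rule) are those stated in the text.

```python
def classify_outcome(scores, revised_exclusion=False):
    """Classify one outcome as 'include', 'exclude' or 'no consensus'.

    scores: list of 7-point Likert ratings (1 = Strongly Disagree, 7 = Strongly Agree).
    revised_exclusion: if True, apply the rule adopted after round 1
    (50% or less of participants scoring 6-7 means exclusion).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    pct_agree = 100 * sum(1 for s in scores if s >= 6) / n
    pct_disagree = 100 * sum(1 for s in scores if s <= 2) / n

    if pct_agree >= 75:
        return "include"
    if revised_exclusion:
        if pct_agree <= 50:
            return "exclude"
    elif pct_disagree >= 75:
        return "exclude"
    return "no consensus"


# Hypothetical ratings from eight panellists (not study data).
strong_support = [7, 7, 6, 6, 6, 7, 5, 4]  # 75% scored 6-7
mixed_support = [7, 6, 6, 6, 7, 5, 4, 3]   # 62.5% scored 6-7
weak_support = [6, 7, 5, 4, 4, 3, 5, 5]    # 25% scored 6-7, nobody scored 1-2

print(classify_outcome(strong_support))                        # include
print(classify_outcome(mixed_support, revised_exclusion=True)) # no consensus
print(classify_outcome(weak_support))                          # no consensus
print(classify_outcome(weak_support, revised_exclusion=True))  # exclude
```

The last two calls show why the revision mattered: an outcome with little support but no active disagreement is left unresolved under the original rule and excluded under the revised one.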

Stage 3: Consensus meeting
The results of the Delphi survey were presented at a consensus meeting. The main goal of the consensus meeting was to decide which items would be included in the final core outcome set. The meeting was chaired by an independent researcher with expertise in consensus methodology who was not a member of the core research team. Participants were sampled to achieve a balanced representation of service users, health-care professionals, researchers and end-users of research. We aimed to have a small representative group of between 9 and 12 to enable meaningful small group discussions, similar to consensus meetings chaired by the facilitator in other fields [30,31]. International participation was restricted because of budgetary constraints.
The consensus meeting comprised (a) a short overview of the study and (b) a summary of the Delphi results sorted by stakeholder group, beginning with the outcomes that met consensus [32]. Outcomes identified in rounds 1 and 2 of the Delphi as having reached consensus for inclusion were presented first. Participants were asked if there were any fundamental reasons why these should not be included in the core outcome set. Divergent views were actively sought and the chair ensured everyone had the opportunity to participate in discussions before voting commenced. Outcomes from the preliminary core outcome set were discussed in terms of feasibility and voted upon. Voting was conducted anonymously using cards in an envelope with binary response options (include/exclude). Voting and consensus criteria followed the same format as in the Delphi (75% for inclusion). Results were presented after voting on all outcomes had finished. Outcomes deemed to have reached consensus for exclusion, or with no consensus in the Delphi, were reviewed and participants were asked if there were any fundamental reasons why these should be included in the core outcome set. Individual outcomes were discussed only if proposed as being important by a meeting participant. Outcomes meeting criteria for consensus were included in the core outcome set; all other items were excluded. The meeting finished with the presentation and a final review and discussion of the core outcome set.

Patient and Public Involvement
Five patient representatives worked with researchers to develop the online questionnaires. Patients were represented alongside professionals and researchers in the consensus panel.
One member of the research team (and co-author) is an expert by lived experience and was involved in all design and analysis decisions.

Ethics and registration
Our findings are reported in line with the Core Outcome Set-Standards for Reporting (COS-STAR) guidance [32]. The study was prospectively registered with the COMET initiative (1276). The study was approved by the University of Nottingham Business School ethics committee and all participants gave informed consent.

Stage 1: Information gathering
Our systematic review has been described in detail elsewhere [1]. In summary, 69 outcome categories were identified from 45 studies. Ninety-three participants in total, from 12 countries, completed the information gathering questionnaire. However, as noted above, many identified with more than one stakeholder group, so we do not have mutually exclusive stakeholder group numbers: 27 identified as service users, 17 as family/carers, 39 as health-care professionals, 15 as end-users of research and 37 as researchers. Additional file 3 presents participants' demographics. Qualitative questionnaires revealed an additional 45 outcomes that were not identified in the literature (for example, outcomes concerning involvement in discharge planning, see additional file 2). After discussion within the research team, 82 standardised outcome terms were taken forward into the Delphi process; 19 outcomes were combined/collapsed and 13 were removed (see additional file 2).

Stage 2: Delphi process
Sixty-nine participants completed round 1 of the Delphi (22 service users, families and carers, 26 researchers and 21 healthcare professionals and decision makers) and 68 participants completed round 2 (30 researchers, 18 service users and families and 20 healthcare professionals and decision makers). Whilst 5 participants dropped out after round 1, 4 participants joined the panel in round 2 (these individuals participated in the qualitative questionnaire but not round 1). There was 1.4% attrition between rounds 1 and 2 of the Delphi. Seven additional outcomes were proposed by participants during round 1, of which two were added into round 2 after a core team discussion. The full list of Delphi items is available in additional file 4.
After round 1, 14 outcomes met the criteria for consensus inclusion (75% or more agreed/strongly agreed with that outcome, see table 1). Twenty outcomes met the revised criteria for having reached consensus for exclusion (50% or less of participants agreed/strongly agreed with that outcome). Forty-eight outcomes did not meet consensus criteria for inclusion or exclusion and were re-presented to the group in round 2. Therefore, 50 outcomes were presented in round 2 (the 48 without consensus plus the two newly added outcomes). Only one outcome met the criteria for consensus inclusion after this round: meaningful activity. No outcomes met criteria for exclusion and 49 did not meet consensus. Additional file 4 shows consensus levels for each outcome in each round.
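The counts reported for stages 1 and 2 can be reconciled with simple arithmetic. The sketch below is only a consistency check on the published figures; the variable names are illustrative, and it assumes that each combined or removed outcome reduced the long list by exactly one item, which the text does not state explicitly.

```python
# Counts reported in the text.
review_outcomes = 69   # outcome categories from the systematic review
survey_outcomes = 45   # additional outcomes from the qualitative survey
combined = 19          # outcomes combined/collapsed by the research team
removed = 13           # outcomes removed by the research team

# Assumption: every combined or removed outcome reduces the list by one.
delphi_items = review_outcomes + survey_outcomes - combined - removed

round1 = {"include": 14, "exclude": 20, "no consensus": 48}
added_by_panel = 2     # new outcomes accepted after round 1
round2_items = round1["no consensus"] + added_by_panel
round2 = {"include": 1, "exclude": 0, "no consensus": 49}
preliminary_set = round1["include"] + round2["include"]

# Panel size and attrition between Delphi rounds.
round1_panel = 69
round2_panel = round1_panel - 5 + 4  # 5 dropped out, 4 joined
attrition_pct = round(100 * (round1_panel - round2_panel) / round1_panel, 1)

print(delphi_items, sum(round1.values()))           # 82 82
print(round2_items, sum(round2.values()))           # 50 50
print(preliminary_set, round2_panel, attrition_pct) # 15 68 1.4
```

Under that assumption the figures agree at every step: 82 items entered round 1, 50 were presented in round 2, and 15 items formed the preliminary set taken to the consensus meeting.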

Stage 3: Consensus Meeting
Eleven participants attended the consensus meeting. As in previous rounds, stakeholder categories were not mutually exclusive: six participants were researchers, three identified as service users, three as healthcare professionals and three as end-users of research (see table 2). Table 3 shows the quantitative results of the meeting.
Each item in the preliminary 15-item core outcome set was considered individually. Discussions indicated that many of the outcomes were elements of an ideal discharge, or process outcomes/variables, but probably not measurable outcomes that should be included in a core outcome set. After these discussions and independent, anonymous voting, five items no longer met consensus criteria for inclusion: 'Service user involvement in discharge planning' and the associated items 'Service user understanding of discharge plan', 'Service user involvement in decision making', 'Service user satisfaction with information provision at discharge' and 'Service user knowledge of how to access community support'. Participants felt that these are very important elements of a successful discharge, but not core outcomes, due to issues surrounding validity and meaning.
'Mental health and illness' was initially close to consensus, with 73% voting to include; however, those who chose to exclude found it too vague and articulated that they were most interested in measuring acute psychological distress rather than mental health and illness. The service user representatives in the group interpreted "recovery" to mean a complete amelioration of symptoms, and individuals described continuing to experience distress and difficulties with their mental health even when in "recovery". We therefore chose to separate the broader mental health and illness outcome into self-reported psychological distress and clinician-reported mental health. The more granular outcome of self-reported psychological distress resulted in 100% consensus to include. By contrast, clinician-reported mental health did not meet consensus criteria (45%). Similar discussions took place around the recurrence (relapse) outcome, whereby its inclusion in a core outcome set would ultimately necessitate buy-in to a criteria model which suggested that mental health problems could and should be completely resolved. Discussions around the 'suicide attempted' outcome indicated that participants felt that suicide attempts or self-harm had diverse motivations and definitions, and they discussed the difficulty of delineating the boundaries between self-harm and suicide attempts and how this is documented. After the consensus meeting this outcome no longer met consensus criteria for inclusion. Discussions surrounding personal recovery, functioning and meaningful activity indicated that participants considered these outcomes too vague and subjective to be a component of a core outcome set. There was consensus to exclude meaningful activity and recovery, and no consensus to include personal recovery.
There was consensus to exclude discharge to appropriate accommodation. Discussion indicated that this was primarily because the outcome spans the health and social care boundary and may not be applicable to every intervention.
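To make the meeting threshold concrete, the sketch below works out how many "include" votes a 75% criterion requires from a panel of eleven. It assumes that all eleven attendees voted on every item, which the text does not state explicitly; the function names are hypothetical. Integer arithmetic is used so that the threshold comparison is not affected by floating-point rounding.

```python
def votes_needed(n_voters, threshold_pct=75):
    """Smallest number of 'include' votes that reaches the threshold percentage."""
    return (n_voters * threshold_pct + 99) // 100  # integer ceiling division


def meets_inclusion(votes_include, n_voters, threshold_pct=75):
    """True if the share of 'include' votes is at or above the threshold."""
    return votes_include * 100 >= threshold_pct * n_voters


# Assumption: all 11 attendees voted on each item.
n = 11
print(votes_needed(n))        # 9
print(round(100 * 8 / n))     # 73
print(meets_inclusion(8, n))  # False
print(round(100 * 5 / n))     # 45
print(meets_inclusion(11, n)) # True
```

With eleven voters, nine "include" votes are needed. Eight votes correspond to the 73% reported for 'mental health and illness' and five votes to the 45% reported for clinician-reported mental health, so the reported percentages are consistent with an eleven-person vote.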
On completion of the meeting, only four outcomes met consensus criteria for inclusion (see Table 4). Participants agreed on a core outcome set of four: readmission, quality of life, suicide completed and service user reported psychological distress. Readmission was the most frequently used outcome in past research and, despite its limitations, participants felt it was one of the only proxy measures of appropriate discharge. Quality of life and psychological distress were considered important ways of quantitatively assessing the psycho-social elements of discharge, which are of primary importance. Suicide completed was considered rare but imperative to capture, given the relationship between acute mental health discharge and suicide highlighted by a growing body of literature [5,33]. Figure 1 shows the process undertaken to reach the core outcome set.

This study provides the first international consensus on outcomes for intervention studies concerning discharge from an acute adult mental health inpatient setting. We could not identify any other published core outcome sets for interventions concerning discharge from acute mental health services. Moreover, there are very few core outcome sets for mental health, despite recommendations for consensus in the literature [18,21]. All the included outcomes were agreed upon by more than 75% of a group of relatively equally represented service-users and family/carers, health-care professionals, researchers and end-users of research using consensus methods. We recommend that all future research studies evaluating interventions for discharge from acute adult mental health settings use this core outcome set as a framework for outcome selection, to complement, rather than replace, any other outcomes that are relevant to their research question.
As discharge from acute services is a particularly challenging period for those experiencing mental health problems [3,33], it is important to understand which interventions work and, more specifically, which elements of an intervention improve which particular outcomes. This core outcome set provides a framework for between-study comparison, ultimately enabling researchers to articulate the theory of change that underpins interventions.
In our systematic review [1], we identified 22 studies that reported readmission rates as an outcome, yet almost all of them captured this in different ways: some used self-report data, some used clinical case notes or retrospective administrative data, and others used case managers' reports. In addition, the time markers were variable: some used country-specific time markers in line with policy, such as 28 days in the UK [34], whilst others chose a series of time markers such as within 1 month, 3 months and 6 months, but the time markers were rarely directly comparable. Similarly, six studies measured quality of life, but only two used the same measurement instrument (Lehman's Quality of Life) [16,35]. In the current study, we have developed consensus that Quality of Life and Readmission are important and feasible to measure; robust recommendations on how best to measure these are now needed.
There were some unexpected exclusions from the core outcome set; for example, mental health symptoms and treatment adherence were frequently used in past research [1], but were not included in the core outcome set. In the background of this paper we described the recent King's Fund report that suggested generating agreement amongst healthcare professionals, service users, policy makers and researchers is a difficult but imperative task [21]. Our work reiterates this finding, and the small four-item core outcome set represents the only outcomes on which consensus was reached, despite so many outcomes being of utmost importance to service-users and families. This research has further highlighted the importance of shared decision making and service-user and family involvement to all stakeholder groups [36].
This study indicates a pressing desire to assess service-user satisfaction and involvement in the process. Whilst such outcomes were excluded in later stages of this research, this does not reduce their prospective importance in discharge interventions or provision of care at discharge. The five most agreed upon elements of service-user involvement and satisfaction in discharge were: Service user involvement in discharge planning; Service user understanding of discharge plan; Service user involvement in decision making; Service user satisfaction with information provision at discharge; Service user knowledge of how to access community support. Policy makers and healthcare management might consider measuring these five things in local level initiatives as overriding principles of care to ensure they are not missing from care provision. Research highlighting the importance of involving service users in mental health care planning is emerging, along with measures of such activity. Therefore, we suggest that future research could include a service user reported outcome measure of involvement alongside the 4-item core outcome set and any other chosen measures. This could be measured using an existing instrument of service-user involvement in care planning in mental health, such as the EQUIP PROM (Patient Reported Outcome Measure) [36]. The 6 outcomes described above can also be presented as a self-reported Likert measure of service user involvement in discharge planning (see additional file 5). These 6 items were developed from synthesis of academic literature and qualitative questionnaires and met criteria for consensus amongst experts in round 2, so from a psychometric perspective would arguably meet initial face and content validity criteria [37].
The difficulties of developing a mental health core outcome set were further epitomised when applied to care transitions: a service-level (rather than specific clinical population), multi-agency, multi-stage, complex period of the care pathway [3,22]. Generating a set of meaningful, applicable outcomes that span primary and secondary care, across multiple physical locations, and that are relevant for every service user was imperative. For example, a great deal of past literature focuses on housing interventions [38-40], and whilst housing is a significant safety issue at discharge, it is not necessarily relevant to all service users. This multi-agency, multi-morbidity complexity was arguably one factor that resulted in the small set of generic outcomes, which differs from the narrowly defined clinical core outcome sets, with many more outcomes, reported in past literature [41,42].
This study had several strengths. Our method is based on recommendations from an international panel of experts [23]. Inclusion of service-users and health-care professionals at every stage ensured that outcomes in the final core set embody shared priorities. The comprehensive and laborious long-list process ensured all potential outcomes were considered in the course of the consensus process. However, there were some limitations to our study. The research was only conducted in English, due to budgetary constraints, although our online rounds included participants from 12 countries. Furthermore, in many consensus meetings additional outcomes are added; the method infrequently serves as a means of reducing the number of outcomes included in the preliminary core outcome set from the Delphi [30]. However, in our case the group did not agree with many of the outcomes and the set was reduced to a very small COS of 4 items. This is beneficial in some ways, as we hope it is easier for researchers to operationalise a four-item core outcome set.
The use of outcomes in mental health research and services is becoming more contested in terms of what is meaningful and effective. It could be argued that core outcome sets are less applicable to mental health populations than general health populations, given the complexity of mental health problems and the subjectivity of measuring them. However, as core outcome sets are relatively uncommon in mental health, we believe (similar to other clinical populations) that a small, agreed, feasible set of core outcomes will facilitate between-study comparability and advancement in evidence collection [19,23].

Future Directions
Development of this core outcome set involved participation of stakeholders from 12 different countries (primarily researchers); however, we recommend that further work should be undertaken to validate this core outcome set more widely, particularly in non-English speaking populations. Two of the final four outcomes, and many of the preliminary 15 outcomes to emerge from the Delphi, are not necessarily specific to mental health care transitions. Some outcomes are comparable to a similar core outcome set for care transitions of adolescents and young adults with special healthcare needs [43]. Future research may consider a 'transitions of care' core outcome set, to reduce the number of similar core outcome sets.
Another key priority in operationalising this core outcome set is to agree upon measurement criteria using the COSMIN guidelines [44]. We conducted some preliminary questionnaires with the Delphi panel to produce preliminary measurement recommendations; however, there was very little agreement amongst panellists (see additional file 6). The measures recommended by the panel were the Kessler Psychological Distress Scale (K10) and Recovering Quality of Life (ReQoL) within one month of discharge [45,46]. Readmission and suicide completed rates were recommended to be captured within 28 days of discharge using retrospective review of administrative data. However, these are only preliminary recommendations and we highly recommend a future study following COSMIN guidelines.

Conclusion
The four outcomes included in our outcome set represent the consensus opinion of a group of service-users, health-care professionals and international researchers, and address an unmet need: assisting researchers in the design, implementation and reporting of interventions that aim to improve discharge from acute mental health settings. Ultimately, application of this core outcome set will enhance the relevance of future interventions to health-care professionals, the research community and service-users. If used, the core outcome set could provide more evidence-based interventions, underpinned by a theory of change outlining the relationships between the components of the intervention and the outcomes they should improve [1,47], which should increase service-user safety during this distressing period.

Patients' consent and permission to publish
All participants gave informed consent to be involved in the study ahead of data collection.

Methods
After the core outcome set was agreed in the consensus meeting, we invited all participants from the earlier stages of the project to recommend measures and time markers in a final online questionnaire. Participants were invited to participate if they had been involved in any of the previous online rounds. The invitation made it clear that the questionnaire was most relevant to researchers, but that other groups with an opinion or interest were welcome to contribute. This was due to the specific knowledge of instruments required to complete this round.
In this questionnaire participants were presented with the four core outcomes. For each core outcome they were presented with any measures used to assess that outcome in our systematic review studies [1] and any additional measures that had been recommended to the team during the process. Participants were asked to choose the single most appropriate measure ("don't know", "other", "new instrument" and "no instrument" were also options). A second question asked which time markers would be recommended, with the option to select all applicable. These options were also developed based on time markers used in the systematic review [1].

Results
Forty-three of the 93 invited participants responded (15 service users, 8 family members/carers, 23 researchers, 10 healthcare professionals, 3 end users of research), although as in previous rounds these were not distinct categories. Fifty-three percent of the respondents were researchers; this was expected, as the invitation email suggested that this stage may be more meaningful or of interest to this group, but as a team we chose not to exclude other groups with opinions on measurement instruments. Twenty-three percent of participants were international researchers (from USA, Switzerland, Canada, and Australia). Table 4 shows the preliminary minimum measure recommendations and time markers; additional file 5 shows the results upon which the recommendations were based.
[Table 4. Preliminary minimum measure recommendations and time markers. Recoverable row: Quality of Life, ReQoL-10, one month post-discharge.]

Readmission
The minimum recommendation is to use retrospective review of administrative data for readmissions within a defined time period; the most agreed period was 28 days. Participants indicated that routine data collection might cover slightly different time periods. Twenty-six of the 43 participants recommended a measure of around a month (1 month, 30 days or 28 days), with 28 days being the most popular. However, participants also advised that this should be supplemented by cross-checking with service-users, case managers or carers, where possible, to improve the quality of data. Those looking for more comprehensive data may also like to record 7 days, 3 months and 6 months, as these were also popular recommendations.
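As an illustration of how a 28-day readmission outcome might be derived from administrative records, a minimal sketch follows. It is not a measurement specification from the panel: the function names, the record layout and the example dates are hypothetical, and counting day 28 itself as inside the window is an assumption that a real study would need to define explicitly.

```python
from datetime import date


def readmitted_within(discharge, readmission, window_days=28):
    """True if a readmission occurred within window_days of discharge.

    readmission is None when no readmission is recorded.
    Assumption: day 0 (same day) and day window_days both count as inside.
    """
    if readmission is None:
        return False
    days = (readmission - discharge).days
    if days < 0:
        raise ValueError("readmission date precedes discharge date")
    return days <= window_days


def readmission_rate(records, window_days=28):
    """Proportion of (discharge, readmission) records readmitted within the window."""
    if not records:
        raise ValueError("at least one record is required")
    hits = sum(readmitted_within(d, r, window_days) for d, r in records)
    return hits / len(records)


# Hypothetical administrative records: (discharge date, first readmission date or None).
records = [
    (date(2019, 1, 1), date(2019, 1, 29)),  # 28 days after discharge
    (date(2019, 1, 1), date(2019, 1, 30)),  # 29 days after discharge
    (date(2019, 2, 10), None),              # no readmission recorded
    (date(2019, 3, 5), date(2019, 3, 9)),   # 4 days after discharge
]

print(readmission_rate(records))                 # 0.5
print(readmission_rate(records, window_days=7))  # 0.25
```

The `window_days` parameter allows the same records to be summarised at the additional time markers participants suggested, provided those markers are first expressed as a number of days.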

Quality of Life
The participants recommended that researchers use the Recovering Quality of Life (ReQoL-10) at one month post-discharge [35]. This was the instrument most recommended by the group. When votes for ReQoL-20 are also counted, a large proportion of the group suggested this instrument. As this is a quality of life measure specific to mental health recovery, participants felt it is the most appropriate. The one month time marker is in keeping with the other COS time frames, making for a more comprehensive and accessible core outcome set. Those using within-participant measures of quality of life may like to also measure a pre-discharge baseline. Researchers looking for a more thorough assessment of quality of life may like to also measure at 7 days and 3 months post-discharge, as these were also highly recommended time markers, or to use the ReQoL-20 and report both scores for comparability.

Suicide Completed
The participants recommended retrospective review of administrative data for suicide completed within 28 days of discharge. Retrospective review is in line with other outcomes and was marginally the most frequently suggested method. We chose within 28 days for consistency with readmission data. Researchers looking for more comprehensive data may want to use 7 days and 3 months, as these were also highly recommended. They may also want to cross-check this information against other sources (carers/case managers) to ensure it is correct and reported, particularly as participants mentioned the impact of incorrect coroner's reports on such data.

Psychological Distress
The participants recommended the Kessler Psychological Distress Scale (K10) at one month post-discharge [36]. For consistency with other outcomes we recommend measurement at one month. Seven days and 3 months were also highly recommended, so we would recommend these additional time markers for more robust research. However, there were very few votes for instruments for psychological distress, and qualitative comments revealed that participants felt this is not measurable. The same number of people who voted for K10 also voted for interviews or other measures, a similar number recommended the development of a new measure, and CORE-10 was similarly close [37]. Whilst we make this recommendation, we also suggest that future researchers may look to develop an instrument specific to psychological distress for this core outcome set. Interviews would not effectively facilitate between-study comparison, the key purpose of a COS.

Method (number of votes): important comments
Interviews with SUs (12): In some countries… there is no easily accessible data on readmission rates…in our experience self-reported in the most reliable way
Retrospective review of administrative data (13): Might not show people who need admission but don't because there's no bed
Extracted from case-managers notes and cross-checked with hospital records (10): Might be easier to gather administrative data, but worth cross-checking to improve quality of information
Self-reported questionnaire (1)
Other-carer interview (2)
Total: 38

Conclusion: The minimum recommendation is to use retrospective review of administrative data. This will allow studies with diverse time and financial limits to use the COS. However, we also advise that this should be supplemented, where possible, by cross-checking with service users, case managers or carers to improve the quality of the data.

Time Markers
Conclusion: The minimum recommendation is to record readmission within 28 days. Of the 43 participants, 26 recommended a time marker of around one month (1 month, 30 days or 28 days), with 28 days being the most popular. Those looking for higher quality or more comprehensive data may also like to record 7 days, 3 months and 6 months, as these were also popular recommendations.

Important comment: I think the use of tools should be complemented with interviews with service users and carers.
Total: 33
ReQoL combined (10+20): 12

Conclusion: We recommend that researchers use ReQoL-10. This was the most voted for instrument. If we also combine these votes with those for ReQoL-20, a large proportion of the group suggested this outcome. As this is a quality of life measure specific to mental health recovery, we feel this is most appropriate.

[…] also measure a pre-discharge baseline. Those looking for a more thorough assessment of QoL may also like to measure at 7 days and 3 months post-discharge, as these were also highly recommended time markers.

[…] CORE-10 was similarly close. Whilst we make this recommendation, we also suggest that future researchers may look to develop something specific for this core outcome set. Interviews would not allow for easy comparison of scores, so would not be relevant for a core outcome set.
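
As an illustration only, the readmission time markers recommended above (28 days as the minimum, with 7 days, 3 months and 6 months as optional additions) could be derived from administrative discharge and readmission dates along the following lines. This is a minimal sketch rather than part of the study's method: the function name `readmission_flags`, the window labels, the treatment of 3 and 6 months as 90 and 180 days, and the inclusive reading of "within 28 days" are all assumptions made for the example.

```python
from datetime import date

# Hypothetical follow-up windows in days. The 28-day window is the minimum
# recommendation; the others are optional. Treating 3 and 6 months as 90 and
# 180 days is an assumption made for this sketch.
WINDOWS_DAYS = {"7 days": 7, "28 days": 28, "3 months": 90, "6 months": 180}


def readmission_flags(discharge, readmission):
    """Return, for each window, whether a readmission fell within it.

    `discharge` and `readmission` are `datetime.date` objects;
    `readmission` is None when no readmission is recorded.
    """
    if readmission is None:
        return {label: False for label in WINDOWS_DAYS}
    days = (readmission - discharge).days
    if days < 0:
        raise ValueError("readmission date precedes discharge date")
    # "Within N days" is read here as inclusive of day N (an assumption).
    return {label: days <= limit for label, limit in WINDOWS_DAYS.items()}


# Example: a readmission 19 days after discharge.
flags = readmission_flags(date(2020, 1, 1), date(2020, 1, 20))
print(flags)
```

Reporting all four flags from the same dates keeps the minimum 28-day outcome comparable across studies while still allowing the optional windows to be reported alongside it.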