Translating staff experience into organisational improvement: the HEADS-UP stepped wedge, cluster controlled, non-randomised trial

Objectives Frontline insights into care delivery correlate with patients’ clinical outcomes. These outcomes might be improved through near-real time identification and mitigation of staff concerns. We evaluated the effects of a prospective frontline surveillance system on patient and team outcomes. Design Prospective, stepped wedge, non-randomised, cluster controlled trial; prespecified per protocol analysis for high-fidelity intervention delivery. Participants Seven interdisciplinary medical ward teams from two hospitals in the UK. Intervention Prospective clinical team surveillance (PCTS): structured daily interdisciplinary briefings to capture staff concerns, with organisational facilitation and feedback. Main measures The primary outcome was excess length of stay (eLOS): an admission more than 24 hours above the local average for comparable patients. Secondary outcomes included safety and teamwork climates, and incident reporting. Mixed-effects models adjusted for time effects, age, comorbidity, palliation status and ward admissions. Safety and teamwork climates were measured with the Safety Attitudes Questionnaire. High-fidelity PCTS delivery comprised high engagement and high briefing frequency. Results Implementation fidelity was variable, both in briefing frequency (median 80% working days/month, IQR 65%–90%) and engagement (median 70 issues/ward/month, IQR 34–113). 1714/6518 (26.3%) intervention admissions had eLOS versus 1279/4927 (26.0%) control admissions, an absolute risk increase of 0.3%. PCTS increased eLOS in the adjusted intention-to-treat model (OR 1.32, 95% CI 1.10 to 1.58, p=0.003). Conversely, high-fidelity PCTS reduced eLOS (OR 0.79, 95% CI 0.67 to 0.94, p=0.006). High-fidelity PCTS also increased total, high-yield and non-nurse incident reports (incidence rate ratios 1.28–1.79, all p<0.002). Sustained PCTS significantly improved safety and teamwork climates over time. 
Conclusions This study highlighted the potential benefits and pitfalls of ward-level interdisciplinary interventions. While these interventions can improve care delivery in complex, fluid environments, the manner of their implementation is paramount. Suboptimal implementation may have an unexpectedly negative impact on performance. Trial registration number ISRCTN 34806867 (http://www.isrctn.com/ISRCTN34806867).
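The unadjusted figures quoted in the Results can be reproduced from the reported counts. The sketch below is illustrative arithmetic only, not the study's analysis: the function name `crude_summary` is hypothetical, and the crude odds ratio it returns ignores the clustering, time effects and covariates handled by the adjusted mixed-effects models.

```python
def crude_summary(int_events, int_total, ctl_events, ctl_total):
    """Unadjusted risks, absolute risk difference and odds ratio."""
    risk_int = int_events / int_total
    risk_ctl = ctl_events / ctl_total
    odds_int = int_events / (int_total - int_events)
    odds_ctl = ctl_events / (ctl_total - ctl_events)
    return {
        "risk_intervention": risk_int,
        "risk_control": risk_ctl,
        "absolute_risk_increase": risk_int - risk_ctl,
        "crude_odds_ratio": odds_int / odds_ctl,
    }


# Counts taken from the Results: admissions with eLOS / eligible admissions.
summary = crude_summary(1714, 6518, 1279, 4927)
print(f"{summary['risk_intervention']:.1%}")       # 26.3%
print(f"{summary['risk_control']:.1%}")            # 26.0%
print(f"{summary['absolute_risk_increase']:.1%}")  # 0.3%
print(f"{summary['crude_odds_ratio']:.2f}")        # 1.02
```

The crude odds ratio is close to 1, whereas the adjusted intention-to-treat estimate is 1.32. The gap presumably reflects the adjustments made in the model, so the crude figure should not be read as a substitute for it.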


Conclusions
High fidelity PCTS implementation reduced eLOS for general medical patients, and improved interdisciplinary team outcomes. Future studies should focus on improving the implementation of prospective frontline surveillance strategies, to realise their full benefits.

STRENGTHS AND LIMITATIONS OF THIS STUDY
This is the first formal evaluation of a prospective frontline surveillance strategy on general medical wards.
The study was well powered to detect a significant change in its primary outcome, complemented by a mixed methods assessment of secondary patient and clinical team outcomes.
The pragmatic nature of the trial increases the generalisability of its findings.
It can be difficult to disentangle the effects of interdisciplinary interventions from the characteristics of the teams that implement them best, although our statistical model allowed for varying intervention fidelity by each team over the course of the study.
Contamination between the groups, whereby additional organisational support was directed to those wards in the control period, may have reduced the intervention's measurable effect.

INTRODUCTION
Frontline staff have a particular insight into the safety and quality of inpatient care.
Favorable staff perceptions of care correlate with improved clinical outcomes, including the likelihood of patient survival and harm-free care. 1,2 Yet frontline concerns about the practical delivery of care do not feature in national safety initiatives, nor are they a priority at a local level. 3,4 General medical wards, in particular, struggle to mobilise organisational resources, even though preventable hospital deaths are disproportionately caused by ward failings. 5 Hospital leaders acknowledge the need to access frontline insights, especially those that might challenge their assumptions about organisational performance. 6 Yet leaders are wary of acting too quickly, for fear of wasting resources on unvalidated concerns. 6 At the same time, actionable quality and safety concerns (identified by medical teams in the course of their normal work 7-9) remain hidden from organisational view. These concerns are typically addressed by the teams themselves, using superficial workarounds that allow clinical work to continue but do not prevent the problem's recurrence. 10 Temporary fixes are encouraged by a professional pride in troubleshooting, 11 and organisational traditions that emphasise individual vigilance. 10 Better methods are needed to capitalise on frontline experiences and translate them into organisational action.
An ideal system for effective resolution of frontline concerns would therefore: (i) promote the systematic identification of safety concerns by frontline staff; (ii) motivate and enable them to resolve unit-level issues within their control; (iii) record frontline successes and challenges to build a richer understanding of safety and resilience across the organisation; and (iv) engage leadership in attending to frontline concerns. We aimed to achieve these goals through prospective clinical team surveillance (PCTS): a program combining structured, frontline interdisciplinary briefings with facilitated organisational escalation of the issues they identified, and feedback. The effects of interdisciplinary team interventions in general medical settings have been mixed, 12,13 and recent editorials have called for adequately powered studies with designs that better account for confounding factors. 14 Here, we evaluate PCTS in a pragmatic, stepped wedge, cluster controlled trial, reporting its effects on patient outcomes, escalation of care, and staff safety attitudes and behaviors.

Intervention program i) Structured team briefing
The PCTS program was based on published observational studies showing that medical teams can engage in the rapid identification and review of potential adverse events. 7,[15][16][17][18] We designed a daily interdisciplinary briefing for structured team self-

ii) Facilitation
The second component of the intervention was facilitation, advancing the issues raised in the HEADS-UP briefings to bring about tangible unit-level and organisational changes. Facilitation helps to make sense of quality improvement interventions, aligning them with their participants and the surrounding context. 24 It is necessarily opportunistic and malleable, taking advantage of existing organisational levers. Here, it involved (for example) working with frontline teams to identify and document their areas of concern more effectively; championing those concerns in meetings with service leaders and safety committees; and following up on subsequent agreed actions when other priorities threatened their resolution.
Other studies that have used a facilitator in a similar way have described the role as an 'animateur', bringing on board people over whom he has no direct managerial authority. 25

iii) Feedback
The third component of the intervention was feedback to participating teams, managers, governance committees, and senior executives. Feedback summarised and disseminated the information collected in the daily briefings, highlighting HEADS-UP performance in each area, common concerns and challenges, and recurrent or unresolved problems that might require additional support. HEADS-UP data were provided on request to service leads, to support their business planning.
System changes arising from HEADS-UP were publicised, e.g., in existing departmental meetings, and via email and posters. As much as possible, feedback delivery was timely, focused on solution-finding, signposted to relevant resources, and adopted a non-judgmental approach. 27 Facilitation and feedback were provided by the program lead.
Although described separately here, we hypothesised that the intervention components would be interlinked, in that changes arising from the program's facilitation and feedback would motivate increasing engagement with HEADS-UP briefings, in turn increasing their ability to bring about change. We anticipated that improvements in interdisciplinary team care effectiveness would be brought about by parallel improvements in ward teams' function, and incremental support service improvement in response to escalation of their concerns.

Study design and patients
As described in the published protocol, 28 this stepped wedge trial was conducted on seven medical wards at two UK hospitals (Table 1). Interdisciplinary ward teams were assigned to implement the intervention at intervals, such that by the end of the trial all teams had adopted the intervention and had contributed both control- and intervention-group data. Stepped wedge designs are increasingly used to evaluate service-level interventions in acute care. Adult patients admitted to participating wards during the study period were eligible.
Exclusion criteria were: (1) less than 50% of the hospital admission spent on the specified ward; (2) discharge to a new skilled care facility (not the patient's previous address) or other hospital; (3) multiple ward transfers; (4) admission to the high dependency unit or ICU; (5) elective admission, or direct admission from another hospital; and (6) surgeon-directed care for more than 24 hours of the admission.
The study was approved as a quality improvement program by research and development authorities at participating institutions.

Study outcomes
The primary outcome was excess length of stay (eLOS). eLOS was a binary variable: an admission more than 24 hours above the local average for similar patients, as classified by their Healthcare Resource Groups. 31 Local institutions contributing comparator data were four nearby hospitals, broadly subject to the same community service restrictions and healthcare economy demands.
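The eLOS definition can be sketched as a simple flag. This is a minimal illustration, not the study's extraction code: the function name `has_excess_los`, the use of hours as the unit, and passing the benchmark in directly (rather than looking it up by Healthcare Resource Group) are assumptions made for the example.

```python
def has_excess_los(los_hours: float, benchmark_hours: float,
                   threshold_hours: float = 24.0) -> bool:
    """Return True if an admission counts as excess length of stay (eLOS).

    los_hours       -- observed length of stay for the admission
    benchmark_hours -- local average stay for comparable patients
                       (assumed here to be supplied by the caller)
    threshold_hours -- margin above the benchmark; 24 hours in the text
    """
    # eLOS is binary: strictly more than 24 hours above the benchmark.
    return los_hours - benchmark_hours > threshold_hours


# Hypothetical admissions, for illustration only.
print(has_excess_los(150.0, 120.0))  # 30 hours above the benchmark: True
print(has_excess_los(140.0, 120.0))  # 20 hours above the benchmark: False
```

Treating a stay of exactly 24 hours above the benchmark as not excess follows the wording "more than 24 hours"; the source does not say how ties were handled.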

Sample size
Power calculations were conducted for the primary outcome (eLOS), using the recommended methodology for stepped wedge trials. 34 We adopted a conservative intraclass correlation coefficient of 0.06, two-sided p<0.05, and iterated calculations based on the wards with the highest and lowest baseline outcome rates. 28 With seven wards admitting 7840 eligible patients (560 per month), the trial's power to detect a 2-14% absolute risk reduction lay between 75% and 100%. 28 Other stepped wedge trials have similarly produced a range within which the study power might lie. 35

Allocation and blinding
Individual patients were not recruited separately to the study, so we did not anticipate significant bias due to lack of allocation concealment. 36 Staff could not be blinded to their ward's assignment, due to the nature of the intervention. Clinical outcome data and escalation of care data were extracted by local administrative staff blinded to the study, as part of their ordinary duties. Hospital peer group data were generated by CHKS (Alcester, UK) and Dr Foster (London, UK), also blinded to intervention group.

Statistical analysis
Multi-level mixed-effects models (Stata, v14.1) were used to evaluate the intervention's effect on each patient- and ward-level outcome. This statistical approach accounts for clustering of outcomes within wards, and repeated measurements over time, representing ward-level variance in patient outcomes as a random effect. For the binary patient-level outcomes, binary logistic models were used; for count data (complications of care and processes of care), we used Poisson loglinear models. Analyses involving eLOS were restricted to those patients who survived to discharge; no patients were excluded on the basis of 'outlier' length of stay. General linear models were used to analyse survey data (SPSS, v22).
In addition to the intention-to-treat analysis, we conducted a per protocol analysis, evaluating the effect of briefing implementation fidelity. Based on observations during the pilot period, we described implementation fidelity each month as the product of briefing frequency and engagement. High fidelity required both high briefing frequency and correspondingly intensive documentation by the ward.
Frequency was coded into three categories: high (≥75% working days that month), medium (50-75%), and low (<50%). Monthly engagement was defined as high (documentation of more than the median number of issues) or low (below the median). An interaction term 'frequency*engagement' was incorporated into each model, with codes for high, moderate, low and poor monthly implementation fidelity. Note that this per protocol analysis, which calculated different implementation fidelity codes for each ward-month, allowed for varying fidelity by a single ward team over the course of the study.
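The ward-month coding described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, a month with exactly the median number of issues is treated as low engagement (the text does not specify), and only the high-fidelity flag is derived, because the mapping of the remaining combinations onto moderate, low and poor fidelity is not given here.

```python
def frequency_category(pct_working_days: float) -> str:
    """Code briefing frequency for one ward-month."""
    if pct_working_days >= 75:
        return "high"
    if pct_working_days >= 50:
        return "medium"
    return "low"


def engagement_category(issues_documented: int, median_issues: float) -> str:
    """Code engagement: high if more issues than the median were documented."""
    # Assumption: a count exactly equal to the median is coded as low.
    return "high" if issues_documented > median_issues else "low"


def is_high_fidelity(pct_working_days: float, issues_documented: int,
                     median_issues: float) -> bool:
    """High fidelity requires high frequency AND high engagement."""
    return (frequency_category(pct_working_days) == "high"
            and engagement_category(issues_documented, median_issues) == "high")


# Hypothetical ward-months; the median of 70 issues is borrowed from the
# reported overall median purely as an example threshold.
print(is_high_fidelity(80, 95, 70))  # True
print(is_high_fidelity(80, 40, 70))  # False: frequent but sparsely documented
print(is_high_fidelity(60, 95, 70))  # False: well documented but infrequent
```

Because the code is evaluated per ward-month, the same ward can fall into different categories in different months, which is the property the per protocol analysis relies on.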
Patient-level outcomes were adjusted for age, Charlson comorbidity index, 37 and palliation status, as well as ward admissions and temporal trends. Ward-aggregate outcomes were adjusted for seasonal and temporal trends, the median Charlson score of patients on the ward that month, and the rotation of interns between departments. Survey analysis defined HEADS-UP participation as self-reported engagement in 5 or more briefings, and adjusted for self-reported workload on the NASA Task Load Index scale, 38 as well as hospital site and time period.  Figure 1].

Secondary outcomes
Changes in processes of care and incident reporting were also dependent on  43 The effect of the intervention (when applied at low fidelity) would therefore be confounded by these changes.
Second, it is possible that underuse of the HEADS-UP briefings in some areas was a marker of wider team dysfunction and poor performance, rather than itself directly leading to poorer outcomes. Teams that participate more wholeheartedly in trials may be more innovative, with strong leadership, a readiness for change, and better managerial relations. 44 It is difficult to entirely separate the intervention's effects from the characteristics of the teams that implemented it best. Nonetheless, our statistical model accounted for ward-level variation in outcomes, and allowed the same team to contribute data to different categories of implementation fidelity, depending on their actions each month. Direct observation also showed multiple teams using briefings meaningfully, even during periods of great pressure. It is unlikely, therefore, that our results are a simple reflection of teams' pre-existing practice or fluctuations in their workload. However, interdisciplinary norms almost certainly played a role in how the intervention was perceived and adopted, and ultimately whether or not it was beneficial.

Objectives:
Frontline insights into care delivery correlate with patients' clinical outcomes. These outcomes might be improved through near-real time identification and mitigation of staff concerns. We evaluated the effects of a prospective frontline surveillance system on patient and team outcomes.

Design:
Prospective, stepped wedge, non-randomised, cluster controlled trial; pre-specified per protocol analysis for high fidelity intervention delivery.

Participants:
Seven interdisciplinary medical ward teams, from two hospitals in the United Kingdom.

Intervention:
Prospective clinical team surveillance (PCTS): structured daily interdisciplinary briefings to capture staff concerns, with organisational facilitation and feedback.

Main measures:
The primary outcome was excess length of stay (eLOS): an admission more than 24 hours longer than the local average for comparable patients.

Results:
Sustained PCTS significantly improved safety and teamwork climates over time.

Conclusions:
This study highlighted the potential benefits and pitfalls of ward-level interdisciplinary interventions. Whilst these interventions can improve care delivery in complex, fluid environments, benefiting team and patient outcomes, the manner of their implementation is paramount. Suboptimal implementation may have an unexpectedly negative impact on performance.

STRENGTHS AND LIMITATIONS OF THIS STUDY
This is the first controlled evaluation of a prospective frontline surveillance strategy on general medical wards.
The pragmatic nature of the trial increases the generalisability of its findings.
It can be difficult to disentangle the effects of interdisciplinary interventions from the characteristics of the teams that implement them best.
This was a non-randomised study, adopting a necessarily pragmatic approach to the order in which participating teams introduced the intervention.
Contamination between the groups, whereby PCTS-generated organisational support was directed to those wards in the control period, may have reduced the intervention's measurable effect.

Study design and patients
We conducted a prospective, interventional, non-randomised stepped wedge trial on seven medical wards at two NHS hospitals [Table 1]. The trial was described in a published protocol. 9 Interdisciplinary ward teams were assigned to a multifaceted quality improvement intervention. Wards introduced the intervention at staged intervals over the study period, such that (by the end of the trial) all teams had adopted the intervention and contributed both control- and intervention-group data.
Stepped wedge designs are increasingly used to evaluate service-level interventions in acute care. 10-12 The stepwise implementation protocol is helpful when simultaneous rollout of the intervention would be impractical for logistical reasons. 13 Baseline data collection began in August 2013, with implementation of the intervention between December 2013 and February 2015.
Medical ward teams with an existing structure for daily interdisciplinary team meetings, and their managers, were invited to take part. All the approached teams agreed to participate. The order in which wards adopted the intervention was pragmatically guided by local constraints. It would have been counter-productive to insist on a fully randomised implementation schedule. Instead, we sought input from senior ward staff and nursing leadership, trying to identify when they felt they could support the intervention's introduction. In practice, personnel and organisational changes meant that the anticipated leadership support could not be guaranteed.
Similar pragmatic approaches have been used in other stepped wedge evaluations. 14 A schematic of the ward-level implementation schedule is provided in the online supplement [supplement Figure 1].
Adult patients admitted to participating wards during the study period were eligible.  Other studies have used similar criteria to identify the patients whose outcomes might be most affected by service-level interventions. 15 The included patient group made up around 90% of all inpatients on these wards during the study period.
Participating teams were teaching teams with attendings (consultants), residents (specialty trainees), and interns (foundation doctors). Interdisciplinary staff also included ward clerks and nurses, who were typically based on a single ward. Doctors were based largely on one ward, but with some patient commitments elsewhere in the hospital. Physiotherapists and occupational therapists worked on multiple wards. Neither managers nor clinical staff had protected time for quality improvement initiatives.

Intervention programme i) Structured team briefing
The PCTS programme was based on observational studies showing that medical teams can engage in the rapid identification and review of potential adverse events. 7,[16][17][18][19] We designed a daily interdisciplinary briefing for structured team self- interventions advise a degree of local flexibility: 23 other teams were able to make minor changes to their unit's pro forma, whilst retaining the overall format. This flexibility maintains the 'hard core' of the intervention, and permits a 'softer periphery' to maximise its uptake. Briefings ended with a team agreement on how to resolve, or escalate, the concerns that had been discussed. After a short period of

ii) Facilitation
The second component of the intervention was facilitation, advancing the issues raised in the HEADS-UP briefings to bring about tangible unit-level and organisational changes. Facilitation helps to make sense of quality improvement interventions, aligning them with their participants and the surrounding context. 25 It is necessarily opportunistic and malleable, taking advantage of existing organisational levers. Here, it involved (for example) working with frontline teams to identify and document their areas of concern more effectively; championing those concerns in regular meetings with service leaders and safety committees; and following up on subsequent agreed actions when other priorities threatened their resolution. Other studies that have used a facilitator in a similar way have described the role as an 'animateur', bringing on board people over whom he has no direct managerial authority. 26,27

iii) Feedback
The third component of the intervention was feedback to participating teams, managers, governance committees, and senior executives. Feedback summarised and disseminated the information collected in the daily briefings, highlighting HEADS-UP performance in each area, common concerns and challenges, and recurrent or unresolved problems that might require additional support. HEADS-UP data were provided on request to service leads, to support their business planning.
System changes arising from HEADS-UP were publicised, e.g., in existing departmental meetings, and via email and posters. As much as possible, feedback delivery was timely, focused on solution-finding, signposted to relevant resources, and adopted a non-judgmental approach. 28 Facilitation and feedback were provided by the programme lead.
Although described separately here, we hypothesised that the intervention components would be interlinked, in that changes arising from the programme's facilitation and feedback would motivate increasing engagement with HEADS-UP briefings, in turn increasing their ability to bring about change. We anticipated that improvements in interdisciplinary team care effectiveness would be brought about by parallel improvements in ward teams' function, and incremental support service improvement in response to their concerns.

Setting
Characteristics of the two participating institutions are provided in Table 1. Most study wards were in a community general hospital in London. Both institutions faced significant challenges during the study period, with significant turnover in senior staff, mounting service pressures and financial restrictions. One hospital was in the process of a merger, and underwent a major inspection by the healthcare regulator (the Care Quality Commission). The other hospital was introducing an electronic health record.

Study outcomes
The primary outcome was excess length of stay (eLOS). eLOS was a binary variable, representing an admission more than 24 hours above the patient's expected length of stay. Benchmarks for expected length of stay were generated using patient-level Healthcare Resource Groups 29 data from a network of four local hospitals. These hospitals were subject to the same community service restrictions and healthcare economy demands as the study sites. The study design therefore evaluated the extent to which intervention and control wards met this local standard. Safety and teamwork climates were measured at baseline and six months into the intervention period, using the relevant subscales from the Safety Attitudes Questionnaire. 30,31 Secondary outcomes (and the per protocol analysis described below) were pre-specified.

Data collection
Anonymised patient-level and ward-level outcomes were extracted from routinely collected data sets. Potential participants in the HEADS-UP briefings were invited to complete anonymous surveys at baseline and six months later, either by submitting responses electronically (via Survey Monkey) or via a paper questionnaire. Overall response rates were calculated using contemporaneous staffing rosters.

Sample size
Power calculations were conducted for the primary outcome (eLOS), using a recommended methodology for stepped wedge trials. 32 As described in detail in the published protocol, 9 we adopted a conservative intraclass correlation coefficient (ICC) of 0.06, based on the ICCs for length of stay and appropriateness of stay in trials of acute care pathways. 33 We iterated calculations based on the wards with the highest and lowest baseline eLOS rates. 9 With a two-sided p<0.05, a sample size of 7840 patients was needed to detect a 2-14% absolute risk reduction with a power between 75% and 100%. 9 This approach to sample size calculations and study power has been used for other stepped wedge trials. 34

Allocation and blinding
Individual patients were not recruited separately to the study, so we did not anticipate significant bias due to lack of allocation concealment. 35 Staff could not be blinded to their ward's assignment, due to the nature of the intervention. Clinical outcome data and escalation of care data were extracted by local administrative staff blinded to the study, as part of their ordinary duties. Hospital peer group data were generated by the data extraction services CHKS (Alcester, UK) and Dr Foster (London, UK), also blinded to intervention group.

Statistical analysis
Multi-level mixed-effects models (Stata, v14.2) were used to evaluate the intervention's effect on each patient- and ward-level outcome. This statistical approach accounts for clustering of outcomes within wards, and repeated measurements over time, representing ward-level variance in patient outcomes as a random effect. For the binary patient-level outcomes, binary logistic models were used; for count data (complications of care and processes of care), we used Poisson loglinear models. Analyses involving eLOS were restricted to those patients who survived to discharge; no patients were excluded because of 'outlier' length of stay.
General linear models were used to analyse survey data, using a difference-in-difference approach with an interaction code for time*intervention at each site (SPSS, v22).
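The difference-in-difference idea behind the time*intervention interaction can be illustrated with a small calculation. This is a sketch only: the function name and the survey scores are hypothetical, and it omits the workload, site and time-period adjustments used in the study's general linear models. In an unadjusted two-group, two-period linear model, the interaction coefficient equals the quantity computed here.

```python
from statistics import mean


def difference_in_difference(comparison_pre, comparison_post,
                             intervention_pre, intervention_post):
    """Change in the intervention group minus change in the comparison group."""
    change_intervention = mean(intervention_post) - mean(intervention_pre)
    change_comparison = mean(comparison_post) - mean(comparison_pre)
    return change_intervention - change_comparison


# Hypothetical climate scores (0-100 scale), invented for illustration.
comparison_pre = [60.0, 62.0, 58.0]
comparison_post = [61.0, 63.0, 59.0]
intervention_pre = [59.0, 61.0, 60.0]
intervention_post = [66.0, 68.0, 64.0]

did = difference_in_difference(comparison_pre, comparison_post,
                               intervention_pre, intervention_post)
print(did)  # 5.0: a 6-point rise with the intervention minus a 1-point rise without
```

Subtracting the comparison group's change removes any shift that affected all respondents over the same period, which is why the approach suits a before-and-after survey without randomisation.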
In addition to the intention-to-treat analysis, we conducted a per protocol analysis, evaluating the effect of briefing implementation fidelity. Based on observations during the pilot period, we described implementation fidelity each month as the product of briefing frequency and engagement. High fidelity required both high briefing frequency and correspondingly intensive documentation by the ward. High frequency was coded if briefings took place on ≥75% working days that month.

This decision was reviewed at the authors' request, and the study was registered, prior to completion of data collection, in the ISRCTN registry (https://dx.doi.org/10.1186/ISRCTN34806867). 9 Staff were aware that the service development was being formally evaluated. As is routine for this type of intervention, we did not seek participant-level consent.

briefing frequency and the probability of eLOS highlighted different outcomes with high fidelity implementation and lower fidelity implementation [Figure 2]. The plot models engagement and briefing frequency as continuous variables within a multilevel model to help visualise their interaction.

Secondary outcomes
No changes were seen in readmissions or the composite of deaths/readmissions, irrespective of implementation fidelity (all confidence intervals include unity; all p values >0.2). Escalation events and complications of care were also unchanged [Table 2].

DISCUSSION
In this stepped wedge cluster controlled trial, intention-to-treat and per protocol analyses produced conflicting results. eLOS increased overall with PCTS, but was reduced by high fidelity PCTS implementation. Amongst the secondary outcomes, safety and teamwork climates improved with sustained PCTS implementation. High fidelity PCTS increased both incident reporting by non-nursing staff, and the number of non-falls reports.
There are several possible explanations for the tension between the two analyses.
First, faithful attention to quality improvement efforts may indeed result in worse outcomes, perhaps by distracting attention from existing good practice. 38 This is unlikely to have been the mechanism here: the wards that dedicated most time and effort to the intervention saw improved outcomes. Second, there may have been unmeasured confounders distinguishing between high fidelity and low fidelity PCTS

CONFLICTS OF INTEREST
NS is the director of London Training & Safety Solutions Ltd, which delivers team assessment and training to hospitals on a consultancy basis.

Objectives:
Frontline insights into care delivery correlate with patients' clinical outcomes. These outcomes might be improved through near-real time identification and mitigation of staff concerns. We evaluated the effects of a prospective frontline surveillance system on patient and team outcomes.

Design:
Prospective, stepped wedge, non-randomised, cluster controlled trial; pre-specified per protocol analysis for high fidelity intervention delivery.

Participants:
Seven interdisciplinary medical ward teams, from two hospitals in the United Kingdom.

Intervention:
Prospective clinical team surveillance (PCTS): structured daily interdisciplinary briefings to capture staff concerns, with organisational facilitation and feedback.
Sustained PCTS significantly improved safety and teamwork climates over time.

STRENGTHS AND LIMITATIONS OF THIS STUDY
This is the first controlled evaluation of a prospective frontline surveillance strategy on general medical wards.
The pragmatic nature of the trial increases the generalisability of its findings.
It can be difficult to disentangle the effects of interdisciplinary interventions from the characteristics of the teams that implement them best.
This was a non-randomised study, adopting a necessarily pragmatic approach to the order in which participating teams introduced the intervention.

Study design and patients
We conducted a prospective, interventional, non-randomised stepped wedge trial on seven medical wards at two NHS hospitals [ Table 1]. The trial was described in a published protocol. 9 Interdisciplinary ward teams were assigned to a multifaceted quality improvement intervention. Wards introduced the intervention at staged intervals over the study period, such that (by the end of the trial) all teams had adopted the intervention and contributed both control-and intervention-group data.
Stepped wedge designs are increasingly used to evaluate service-level interventions in acute care. [10][11][12] The stepwise implementation protocol is helpful when simultaneous rollout of the intervention would be impractical for logistical reasons. 13 Baseline data collection began in August 2013, with implementation of the intervention between December 2013 and February 2015.
Medical ward teams with an existing structure for daily interdisciplinary team meetings, and their managers, were invited to take part. All the approached teams agreed to participate. The order in which wards adopted the intervention was pragmatically guided by local constraints. It would have been counter-productive to insist on a fully randomised implementation schedule. Instead, we sought input from senior ward staff and nursing leadership, trying to identify when they felt they could support the intervention's introduction. In practice, personnel and organisational changes meant that the anticipated leadership support could not be guaranteed.
Similar pragmatic approaches have been used in other stepped wedge evaluations. 14 A schematic of the ward-level implementation schedule is provided in the online supplement [supplement Figure 1].
Participating teams were teaching teams with attendings (consultants), residents (specialty trainees), and interns (foundation doctors). Interdisciplinary staff also included ward clerks and nurses, who were typically based on a single ward. Doctors were based largely on one ward, but with some patient commitments elsewhere in the hospital. Physiotherapists and occupational therapists worked on multiple wards. Neither managers nor clinical staff had protected time for quality improvement initiatives.

Intervention programme

i) Structured team briefing
The PCTS programme was based on observational studies showing that medical teams can engage in the rapid identification and review of potential adverse events. 7,[16][17][18][19] We designed a daily interdisciplinary briefing for structured team self- interventions advise a degree of local flexibility: 23 other teams were able to make minor changes to their unit's pro forma, whilst retaining the overall format. This flexibility maintains the 'hard core' of the intervention, and permits a 'softer periphery' to maximise its uptake. Briefings ended with a team agreement on how to resolve, or escalate, the concerns that had been discussed. After a short period of

ii) Facilitation
The second component of the intervention was facilitation, advancing the issues raised in the HEADS-UP briefings to bring about tangible unit-level and organisational changes. Facilitation helps to make sense of quality improvement interventions, aligning them with their participants and the surrounding context. 25 It is necessarily opportunistic and malleable, taking advantage of existing organisational levers. Here, it involved (for example) working with frontline teams to identify and document their areas of concern more effectively; championing those concerns in regular meetings with service leaders and safety committees; and following up on subsequent agreed actions when other priorities threatened their resolution. Other studies that have used a facilitator in a similar way have described the role as an 'animateur', bringing on board people over whom the facilitator has no direct managerial authority. 26,27

iii) Feedback
The third component of the intervention was feedback to participating teams, managers, governance committees, and senior executives. Feedback summarised and disseminated the information collected in the daily briefings, highlighting HEADS-UP performance in each area, common concerns and challenges, and recurrent or unresolved problems that might require additional support. HEADS-UP data were provided on request to service leads, to support their business planning.
Although described separately here, we hypothesised that the intervention components would be interlinked: changes arising from the programme's facilitation and feedback would motivate increasing engagement with HEADS-UP briefings, in turn increasing their ability to bring about change. We anticipated that improvements in the effectiveness of interdisciplinary team care would be brought about by parallel improvements in ward teams' function, and by incremental improvements in support services in response to their concerns.

Setting
Characteristics of the two participating institutions are provided in Table 1. Most study wards were in a community general hospital in London. Both institutions faced significant challenges during the study period, with substantial turnover in senior staff, mounting service pressures and financial restrictions. One hospital was in the process of a merger, and underwent a major inspection by the healthcare regulator (the Care Quality Commission). The other hospital was introducing an electronic health record.

Study outcomes
The primary outcome was excess length of stay (eLOS). eLOS was a binary variable, representing an admission more than 24 hours above the patient's expected length of stay. Benchmarks for expected length of stay were generated using patient-level Healthcare Resource Groups 29 data from a network of four local hospitals. These hospitals were subject to the same community service restrictions and healthcare economy demands as the study sites. The study design therefore evaluated the extent to which intervention and control wards met this local standard.
Safety and teamwork climates were measured at baseline and six months into the intervention period, using the relevant subscales from the Safety Attitudes Questionnaire. 30,31 Secondary outcomes (and the per protocol analysis described below) were pre-specified.
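The binary eLOS definition can be illustrated with a minimal sketch. The `Admission` record and its field names are hypothetical, and the benchmark is assumed to be available as a single expected length of stay (in hours) per admission; the study's actual benchmarking pipeline is not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable

# From the text: eLOS is a stay more than 24 hours above the expected length of stay.
EXCESS_THRESHOLD_HOURS = 24.0


@dataclass(frozen=True)
class Admission:
    """Hypothetical record; field names are illustrative, not from the study data."""
    los_hours: float           # observed length of stay
    expected_los_hours: float  # benchmark for comparable patients (assumed given)


def has_excess_los(admission: Admission) -> bool:
    """Return True when the stay exceeds its benchmark by more than 24 hours."""
    return admission.los_hours - admission.expected_los_hours > EXCESS_THRESHOLD_HOURS


def elos_proportion(admissions: Iterable[Admission]) -> float:
    """Proportion of admissions flagged as eLOS."""
    flags = [has_excess_los(a) for a in admissions]
    if not flags:
        raise ValueError("no admissions supplied")
    return sum(flags) / len(flags)
```

Under this reading, an admission exactly 24 hours above its benchmark is not counted as eLOS, because the definition requires more than 24 hours.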

Data collection
Anonymised patient-level and ward-level outcomes were extracted from routinely collected data sets. Potential participants in the HEADS-UP briefings were invited to complete anonymous surveys at baseline and six months later, either by submitting responses electronically (via Survey Monkey) or via a paper questionnaire. Overall response rates were calculated using contemporaneous staffing rosters.

Sample size
Power calculations were conducted for the primary outcome (eLOS), using a recommended methodology for stepped wedge trials. 32 As described in detail in the published protocol, 9 we adopted an intraclass correlation coefficient (ICC) of 0.06, based on the ICCs for length of stay and appropriateness of stay in trials of acute care pathways. 33 We iterated calculations based on the wards with the highest and lowest baseline eLOS rates. 9 With a two-sided p<0.05, a sample size of 7840 patients was needed to detect a 2-14% absolute risk reduction with a power between 75% and 100%. 9 This approach to sample size calculations and study power has been used for other stepped wedge trials. 34

Allocation and blinding
Individual patients were not recruited separately to the study, so we did not anticipate significant bias due to lack of allocation concealment. 35 Staff could not be blinded to their ward's assignment, due to the nature of the intervention. Clinical outcome data and escalation of care data were extracted by local administrative staff blinded to the study, as part of their ordinary duties. Hospital peer group data were generated by the data extraction services CHKS (Alcester, UK) and Dr Foster (London, UK), also blinded to intervention group.

Statistical analysis
Multi-level mixed-effects models (Stata, v14.2) were used to evaluate the intervention's effect on each patient- and ward-level outcome. This statistical approach accounts for clustering of outcomes within wards, and repeated measurements over time, representing ward-level variance in patient outcomes as a random effect. For the binary patient-level outcomes, binary logistic models were used; for count data (complications of care and processes of care), we used Poisson log-linear models. Analyses involving eLOS were restricted to those patients who survived to discharge; no patients were excluded because of 'outlier' length of stay.
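As an illustration only, a random-intercept logistic model of the kind described could be written as follows. The notation is ours, and two points are assumptions rather than the fitted specification: that ward-level variance enters as a single random intercept, and that time effects enter as a generic term.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1)
  = \beta_0 + \beta_1 X_{ij} + \gamma_{t(i)} + \boldsymbol{\delta}^{\top}\mathbf{z}_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Here $Y_{ij}$ indicates eLOS for admission $i$ on ward $j$; $X_{ij}$ is 1 if ward $j$ had introduced the intervention at the time of that admission; $\gamma_{t(i)}$ is the time effect for the period of admission; $\mathbf{z}_{ij}$ collects the adjustment covariates (age, Charlson comorbidity index, palliation status and ward admissions); and $u_j$ is the ward-level random intercept. Under this sketch the reported odds ratio corresponds to $\exp(\beta_1)$.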
General linear regression models were used to analyse survey data, using a difference-in-differences approach 36 with an interaction term for time*intervention at each site (SPSS, v22). This method evaluated whether changes in survey responses over time differed between PCTS participants and non-participants.
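The difference-in-differences logic can be sketched in a few lines. This is a simplified illustration, not the SPSS analysis itself: it assumes no covariates, in which case the time*intervention interaction coefficient of a saturated two-group, two-period linear model equals the difference in mean changes computed below. The function name and the example scores are hypothetical.

```python
from statistics import mean
from typing import Sequence


def difference_in_differences(
    pre_control: Sequence[float],
    post_control: Sequence[float],
    pre_intervention: Sequence[float],
    post_intervention: Sequence[float],
) -> float:
    """Change over time in the intervention group minus change in the control group."""
    change_control = mean(post_control) - mean(pre_control)
    change_intervention = mean(post_intervention) - mean(pre_intervention)
    return change_intervention - change_control


# Hypothetical climate scores (0-100 scale), for illustration only.
estimate = difference_in_differences(
    pre_control=[60.0, 62.0],
    post_control=[61.0, 63.0],
    pre_intervention=[58.0, 60.0],
    post_intervention=[66.0, 68.0],
)
```

A positive estimate indicates that scores rose more among PCTS participants than among non-participants over the same interval.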
In addition to the intention-to-treat analysis, we conducted a per protocol analysis, evaluating the effect of briefing implementation fidelity. Based on observations during the pilot period, we described implementation fidelity each month as the product of briefing frequency and engagement. High fidelity required both high briefing frequency and correspondingly intensive documentation by the ward. High frequency was coded if briefings took place on ≥75% of working days that month.
Monthly engagement was defined as high if teams documented more than the median number of issues. An interaction term 'frequency*engagement' was incorporated into each model, with a code for high fidelity where both engagement and frequency were high. Note that this per protocol analysis, which calculated different implementation fidelity codes for each ward-month, allowed for varying fidelity by a single ward team over the course of the study.
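The ward-month fidelity coding described above can be sketched as follows. The dictionary keys are hypothetical, and the sketch assumes that the engagement median is taken across all ward-months supplied; the text does not state the reference set for the median.

```python
from statistics import median
from typing import Dict, List

# From the text: high frequency means briefings on >=75% of working days in the month.
HIGH_FREQUENCY_THRESHOLD = 0.75


def code_fidelity(ward_months: List[Dict]) -> List[Dict]:
    """Add frequency, engagement and fidelity codes to each ward-month.

    Each ward-month is a dict with hypothetical keys
    'briefing_days', 'working_days' and 'issues'.
    """
    # Assumption: median issues is computed over all ward-months supplied.
    issue_median = median(wm["issues"] for wm in ward_months)
    coded = []
    for wm in ward_months:
        high_frequency = wm["briefing_days"] / wm["working_days"] >= HIGH_FREQUENCY_THRESHOLD
        high_engagement = wm["issues"] > issue_median
        coded.append({
            **wm,
            "high_frequency": high_frequency,
            "high_engagement": high_engagement,
            # High fidelity requires both conditions in the same ward-month.
            "high_fidelity": high_frequency and high_engagement,
        })
    return coded


# Hypothetical ward-months, for illustration only.
example = code_fidelity([
    {"briefing_days": 18, "working_days": 20, "issues": 120},
    {"briefing_days": 15, "working_days": 20, "issues": 30},
    {"briefing_days": 10, "working_days": 20, "issues": 90},
    {"briefing_days": 20, "working_days": 20, "issues": 70},
])
```

Because the codes are assigned per ward-month, the same ward can move between high and lower fidelity over the course of the study.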
Patient-level outcomes were adjusted for time effects, age, Charlson comorbidity index, 37 and palliation status, as well as ward admissions. Time effects were specified

This decision was reviewed at the authors' request, and the study was registered, prior to completion of data collection, in the ISRCTN registry (https://dx.doi.org/10.1186/ISRCTN34806867). 9 Staff were aware that the service development was being formally evaluated. As is routine for this type of intervention, we did not seek participant-level consent.

Table 2].

Implementation fidelity
Briefing implementation data were available for 71/73 (97.2%) ward months. There was variation in implementation fidelity, both in terms of briefing frequency (median 80% working days/month, interquartile range 65%-90%), and engagement (median 70 issues/ward/month, interquartile range 34-113).
confirmed the effect sizes of both intention-to-treat and per protocol analyses, and their statistical significance.
An exploratory plot of the modelled relationship between engagement, briefing frequency and the probability of eLOS highlighted different outcomes with high-fidelity implementation and lower-fidelity implementation [Figure 2]. The plot models engagement and briefing frequency as continuous variables within a multilevel model to help visualise their interaction.