
Original research
Mixed methods evaluation of the Getting it Right First Time programme in elective orthopaedic surgery in England: an analysis from the National Joint Registry and Hospital Episode Statistics
  1. Helen Barratt1,
  2. Andrew Hutchings2,
  3. Elena Pizzo1,
  4. Fiona Aspinal1,
  5. Sarah Jasim3,
  6. Rafael Gafoor1,
  7. Jean Ledger1,
  8. Raj Mehta1,
  9. James Mason4,
  10. Peter Martin1,
  11. Naomi J Fulop1,
  12. Stephen Morris5,
  13. Rosalind Raine1
  1. 1Department of Applied Health Research, University College London, London, UK
  2. 2Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
  3. 3Care Policy & Evaluation Centre, London School of Economics and Political Science, London, UK
  4. 4Warwick Medical School, University of Warwick, Coventry, UK
  5. 5Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  1. Correspondence to Dr Andrew Hutchings; andrew.hutchings{at}


Objective To evaluate the impact of the ‘Getting it Right First Time’ (GIRFT) national improvement programme in orthopaedics, which started in 2012.

Design Mixed-methods study comprising statistical analysis of linked national datasets (National Joint Registry; Hospital Episode Statistics; Patient-Reported Outcomes); economic analysis and qualitative case studies in six National Health Service (NHS) Trusts.

Setting NHS elective orthopaedic surgery in England.

Participants 736 088 patients who underwent primary hip or knee replacement at 126 NHS Trusts between 1 April 2009 and 31 March 2018, plus 50 NHS staff.

Intervention Improvement bundle including ‘deep dive’ visits by senior clinician to NHS Trusts, informed by bespoke set of routine performance data, to discuss how improvements could be made locally.

Main outcome measures Number of procedures conducted by low volume surgeons; use of uncemented hip implants in patients >65; arthroscopy in year prior to knee replacement; hospital length of stay; emergency readmissions within 30 days; revision surgery within 1 year; health-related quality of life and functional status.

Results National trends demonstrated substantial improvements beginning prior to GIRFT. Between 2012 and 2018, there were reductions in procedures by low volume surgeons (ORs (95% CI) hips 0.58 (0.53 to 0.63), knees 0.77 (0.72 to 0.83)); uncemented hip prostheses in >65s (OR 0.56 (0.51 to 0.61)); knee arthroscopies before surgery (OR 0.48 (0.41 to 0.56)) and mean length of stay (hips −0.90 days (−1.00 to −0.81), knees −0.74 days (−0.82 to −0.66)). The additional impact of visits was mixed and comprised an overall economic saving of £431 848 between 2012 and 2018, but this was offset by the costs of the visits. Staff reported that GIRFT’s influence ranged from procurement changes to improved regional collaboration.

Conclusion Nationally, we found substantial improvements in care, but the specific contribution of GIRFT cannot be reliably estimated due to other concurrent initiatives. Our approach enabled additional analysis of the discrete impact of GIRFT visits.

  • hip
  • knee
  • surgery

Data availability statement

Data are available on reasonable request. In line with the data sharing agreements between University College London and NHS Digital/ HQIP, aggregate small number suppressed outputs for the study period (1 April 2009–31 March 2018) are available on request from the corresponding author, AH (

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:

Strengths and limitations of this study

  • We report the first, independent evaluation of the high profile Getting It Right First Time (GIRFT) National Health Service improvement programme.

  • Our mixed-method approach enabled us to provide a comprehensive and robust understanding of GIRFT, exploring the impact of the programme from different perspectives.

  • Our linked dataset allowed us to examine a range of measures, as well as estimating the specific contribution of Trust visits, whilst the case study analysis provided further insights from the perspective of Trust staff.

  • We could not examine some key GIRFT target measures (eg, procurement, litigation rates) because appropriate data were not available.

  • We also could not capture costs incurred by Trusts, because activities were not consistently tracked.


‘Getting it Right First Time’ (GIRFT) is one of the largest improvement programmes in the National Health Service (NHS). It began in orthopaedics in 2012 with the publication of the first GIRFT report, recommending changes to improve outcomes and reduce costs.1 Following government investment totalling £62.8m, GIRFT now operates in 44 different specialties or clinical workstreams.2 GIRFT is an improvement ‘bundle’: a small number of interventions performed together to improve care.3 Clinical leadership is fundamental: GIRFT was established by a senior surgeon, who leads the orthopaedic workstream and now chairs the wider programme. Each workstream is chaired by a clinical lead from the relevant specialty. The programme includes components operating at local (ie, NHS provider Trusts) and national (ie, across England) levels. Local components include ‘deep dive’ visits to Trusts, while national components include national reports describing how unwarranted variations can be addressed. First visits were piloted at a small number of Trusts in 2013, then replicated across the country in orthopaedics and other workstreams from 2015. Before each visit, Trusts are sent a bespoke ‘datapack,’ collated by GIRFT, describing their performance on over 100 variables, drawn from sources including Hospital Episode Statistics (HES) and national audits, such as the National Joint Registry (NJR). Data include use of evidence-based procedures and costs. Datapacks bring data sources together and make them accessible, facilitating comparison to national and peer group averages. Discussion at the meeting is driven by the datapack and tailored to the Trust. Attendees, comprising clinicians, managers and other relevant professionals, identify where and how improvements could be made. Revisits follow a similar format, with Trusts reporting changes made since the first visit.

There are few examples of initiatives on this scale, and consequently few evaluations. The American College of Surgeons National Surgical Quality Improvement Programme (NSQIP) provides data to drive improvement. However, only around 10% of US hospitals participate.4 In the UK, the national Perioperative Quality Improvement Programme will collate and feed back data about surgical complications.5 NHS Trusts are also subject to inspections by the Care Quality Commission, while some National Clinical Audits also provide written feedback.6 However, GIRFT clinical leads visit all Trusts to discuss their datapack, as well as opportunities for improvement. GIRFT reflects evidence that measurement alone is insufficient.7 The effectiveness of feedback is dependent on ‘hard’ and ‘soft’8 incentives, including using data to support change and holding participants to account at revisits.9 However, although GIRFT visits Trusts twice, sometimes more, the largest improvements in NSQIP were seen where providers had participated for many years.10 Improvements were also more marked at sites with pre-existing internal improvement programmes.11 GIRFT makes use of system (top-down) leadership, but designated local (bottom-up) leadership within Trusts may also be important.12 Despite the limited evidence base, GIRFT has been expanded across the NHS, in the absence of independent evaluation.

We, therefore, evaluated the effectiveness and cost-effectiveness of the GIRFT orthopaedic workstream, focusing on the most common elective procedures, primary hip and knee replacement.13 The few evaluations of national programmes focus mainly on quantitative process and outcome measures, and hence may miss wider effects. Consequently, we undertook a multicomponent analysis. The first described national trends in processes and outcomes over time, starting before GIRFT. However, this cannot disentangle the relative impact of GIRFT from other concurrent national initiatives that targeted similar measures. For example, Lord Carter’s review of NHS hospital efficiency took place between June 2014 and February 2016, and made recommendations about tackling ‘unwarranted variation’ in key resource areas such as staffing, diagnostics and procurement. NHS Right Care originated as part of the Quality, Innovation, Productivity and Prevention programme in 2009 and is still ongoing. It supports local economies to improve population healthcare and address performance variations. Our second analysis focused on a specific component of GIRFT, first visits to Trusts, to elucidate their additional local impact over and above the national trends. We also undertook an economic analysis to evaluate the cost impacts of visits, and a qualitative exploration of the impact of GIRFT from the perspective of Trust staff.



We employed a mixed-methods approach, using both quantitative and qualitative data to consider the various impacts of GIRFT from different perspectives and at different levels (national and Trust). In the quantitative analysis, we examined eight key GIRFT target measures. We first described changes in these at national level. We then exploited variation in the timing of GIRFT visits to assess the additional impact of initiating local involvement after allowing for national trends. We also undertook an economic analysis to assess costs and savings attributable to the visits. Finally, we used qualitative methods to explore how staff perceived that GIRFT had impacted practice at six case study sites.

Quantitative methods

Quantitative data sources

We used linked data from NJR, HES and patient-reported outcome measures (PROMs) to identify patients aged 18 or over who underwent an elective primary hip or knee replacement between 1 April 2009 and 31 March 2018, at 126 English Trusts that received a first GIRFT visit by November 2017. We excluded eight Trusts visited later because they would contribute no postvisit data to the analysis. We included primary procedures eligible for the PROMs programme.14 Metal-on-metal implants were excluded because they were subject to specific regulatory measures.15


We identified eight measures that were GIRFT priorities and available from the dataset. We were unable to use data on some priority indicators, such as litigation and procurement, because they are not publicly available. Process measures were: (1) procedures conducted by low volume surgeons (≤35 similar procedures or ≤10 unicondylar/patella-femoral knee replacements per year); (2) use of uncemented hip implants in patients >65; (3) arthroscopy in the year prior to knee replacement. GIRFT sought to reduce these, citing evidence that higher surgeon volumes equate to better outcomes; knee arthroscopy is not effective; and uncemented hip implants in older patients have higher revision rates.1 There is no commonly accepted threshold for surgeon volume. We therefore used the thresholds ≤35 similar procedures or ≤10 unicondylar/patella-femoral knee replacements per year based on clinical input, the literature on ‘low volume’ surgical thresholds in orthopaedic surgery, discussion with the GIRFT programme team and recommendations in their 2015 programme report.2 Outcome measures comprised: (1) hospital length of stay for index procedure; (2) emergency readmissions within 30 days; (3) revision surgery within 1 year; and in the subset of patients who participated in the national PROMs programme (4) health-related quality of life (EQ-5D)16 and (5) functional status (Oxford Hip/Knee Score (OHKS)).17 GIRFT sought to minimise the first three to reduce associated costs. We included EQ-5D and OHKS to facilitate economic evaluation. We identified revisions from NJR and HES,18 measuring patient-reported outcomes at 6 months.14 We adjusted for: age in years; ethnicity (white, not white, or missing/not reported); sex; quintile of Index of Multiple Deprivation; Charlson comorbidity19 (HES) and American Society of Anesthesiologists grade (NJR).
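The low volume surgeon definition above can be illustrated with a minimal sketch. It is not the study's code: the record layout, the `partial_knee` label, and the choice to count volume per surgeon, per year and per procedure type are assumptions made for illustration; only the two thresholds come from the text.

```python
from collections import Counter

# Thresholds quoted in the text; everything else here is hypothetical.
LOW_VOLUME_MAX = 35          # <=35 similar procedures per year
LOW_VOLUME_MAX_PARTIAL = 10  # <=10 unicondylar/patella-femoral knee replacements per year

PARTIAL_KNEE = "partial_knee"  # hypothetical label for unicondylar/patella-femoral knees


def flag_low_volume(procedures):
    """Flag each procedure performed by a low volume surgeon.

    `procedures` is a list of (surgeon_id, year, procedure_type) tuples,
    one tuple per operation. Returns a parallel list of booleans.
    Assumption: "similar procedures" means the same procedure_type
    by the same surgeon in the same year.
    """
    counts = Counter(procedures)  # operations per (surgeon, year, type)
    flags = []
    for key in procedures:
        limit = LOW_VOLUME_MAX_PARTIAL if key[2] == PARTIAL_KNEE else LOW_VOLUME_MAX
        flags.append(counts[key] <= limit)
    return flags
```

The proportion of procedures flagged in a given year then corresponds to process measure (1), subject to the assumptions stated above.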

Statistical analyses

We examined proportions and means of measures before GIRFT started (2009/2010–2011/2012), and in the final 2 years of the study (2016/2017–2017/2018). In our analysis of national trends, we also estimated year-on-year changes in comparison with 2012/13, the year GIRFT began, using casemix-adjusted hierarchical logistic and linear regression.
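As an aid to reading the year-on-year odds ratios, the sketch below computes an unadjusted OR with a Wald 95% CI from two sets of counts. This is a deliberate simplification: the estimates reported in this paper come from casemix-adjusted hierarchical models, which this sketch does not reproduce, and the function name and any counts passed to it are hypothetical.

```python
import math


def odds_ratio_wald(events_ref, total_ref, events_cmp, total_cmp, z=1.96):
    """Unadjusted odds ratio (comparison year vs reference year) with a Wald CI.

    events_* are counts of procedures meeting the measure (eg, performed by a
    low volume surgeon); total_* are all procedures in that year.
    Requires every cell of the 2x2 table to be non-zero.
    """
    a = events_cmp
    b = total_cmp - events_cmp
    c = events_ref
    d = total_ref - events_ref
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An OR below 1 with an upper confidence limit below 1 indicates that the measure was less common in the comparison year than in the reference year (2012/13 in our analysis).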

We analysed the additional impact of initiating visits using a pre–post design. First visit dates defined preperiods and postperiods for Trusts. The first visit marked the start of GIRFT’s local involvement with an individual Trust. Prior to this, a Trust would be aware of GIRFT’s national work, but would not have received the intervention tailored to their individual circumstances. At the start of the programme, the GIRFT team contacted all eligible Trusts, and the timetable of the first visits was based on the order in which sites responded, rather than, for example, orthopaedic performance data. Nevertheless, the order of visits was not random, so we divided Trusts into early, middle and late groups in a 1:2:1 ratio, to examine whether changes differed by visit timing. The ratio was specified a priori to split Trusts by lower and upper quartiles, ensuring a minimum of 30 per group. We used hierarchical logistic and linear regression models adjusted for casemix variables with clustering at Trust level to estimate changes in levels for each measure. We controlled for temporal trends using fractional polynomials, and for pre-operative scores in PROMs analyses. The impact in the early, middle and late groups was estimated using interactions between group and the change in levels. The only casemix variables with missing data were ethnicity (3.8% and 2.7% of THR and TKR patients, respectively) and Index of Multiple Deprivation (table 1). We used a complete-case analysis but included a missing category for ethnicity. Cases with a missing deprivation score represented only 0.3% of the analysis sample and were excluded from the complete-case analysis.
We allowed for implementation delays by excluding procedures in the 3 months after each initial Trust visit, based on a GIRFT case study indicating that improvements occurred within 3 months.20 We used a longer 15-month exclusion when analysing arthroscopies, because these were measured in the 12 months before surgery and hence might overlap the visit date.
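The cohort split and exclusion windows described above can be sketched as follows. This is an illustrative reading rather than the study's code: the function names, the rank-based quartile split, the tie-breaking by Trust identifier, and the treatment of boundary dates are all assumptions.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def assign_cohorts(first_visits):
    """Split Trusts into early/middle/late in roughly a 1:2:1 ratio.

    `first_visits` maps a Trust identifier to its first visit date.
    Assumption: the split is by rank of visit date, with ties broken
    by Trust identifier.
    """
    ordered = sorted(first_visits, key=lambda t: (first_visits[t], t))
    quarter = len(ordered) // 4
    cohorts = {}
    for rank, trust in enumerate(ordered):
        if rank < quarter:
            cohorts[trust] = "early"
        elif rank >= len(ordered) - quarter:
            cohorts[trust] = "late"
        else:
            cohorts[trust] = "middle"
    return cohorts


def classify_period(procedure_date, first_visit, exclusion_months=3):
    """Label a procedure as 'pre', 'excluded' or 'post' relative to the first visit.

    Use exclusion_months=15 for the arthroscopy measure.
    Assumption: the visit date itself falls in the excluded window.
    """
    if procedure_date < first_visit:
        return "pre"
    if procedure_date < add_months(first_visit, exclusion_months):
        return "excluded"
    return "post"
```

Procedures labelled 'excluded' would simply be dropped before fitting the pre–post models; the cohort label supplies the grouping used for the interaction terms.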

Table 1

Characteristics of patient population*

Economic analysis methods

Conventionally, economic evaluation compares costs and outcomes of an intervention and comparator. However, it was not possible to use a comparator because GIRFT visited Trusts at different times. Instead, we compared the impact of visits at Trust level with expected costs in the absence of GIRFT. We examined: (1) cost of the visits; (2) costs incurred by Trusts to implement recommendations; and (3) costs or savings resulting from the visits in the limited measures publicly available for analysis (figure 1).

Figure 1

The economic components of GIRFT orthopaedic evaluation. GIRFT, Getting it Right First Time; HES, Hospital Episode Statistics; HRGs, Healthcare Resource Groups; NHS, National Health Service; NICE, National Institute for Health and Care Excellence; PROMs, patient-reported outcomes measures; QALYs, quality-adjusted life-years.

Information from the GIRFT programme team enabled us to calculate visit costs at Trust level. We collected data on costs incurred by Trusts via a national survey, distributed electronically on our behalf, in early 2018, by the programme team, to GIRFT ‘champions’ in each Trust. It included questions about five GIRFT recommendations that could have an economic impact: implementation of ring-fenced beds; introduction of extended physiotherapy services; changes in use of theatre loan kits; reductions in activity by low volume surgeons; and improvements to theatre efficiency.

We assessed the economic impact of changes using the results of the Trust-level analysis above, quantified using NHS cost data.21 We used the marginal effects estimated in the statistical models to estimate the economic impact of the visits where a statistically significant change was observed.

Qualitative methods

As outlined in the study protocol,13 our evaluation included several qualitative elements, including interviews with the GIRFT programme team and national health leaders (to understand the development of GIRFT and its evolution over time), as well as focus groups with patients (to understand their views about the content of the GIRFT programme). As these elements did not directly contribute to our assessment of the impact of GIRFT, their findings will be reported elsewhere. We used a case study approach to explore the implementation of GIRFT in individual NHS Trusts, including staff perceptions about whether and how GIRFT impacted practice locally.22

Case study sites

We purposively sampled six Trusts in England, representing a spread of hospital types (district general and teaching hospitals) and setting (region and rural vs urban).

Qualitative data collection

We have described our data collection methods previously.13 Here, we report data collected via semistructured interviews with staff at the six case study sites, between October 2016 and May 2019 (online supplemental appendix 1). The interview topic guide was developed with input from the evaluation team (eg, to incorporate questions about resource costs and implementation) and informed by scoping discussions conducted with the team delivering GIRFT, to understand the programme components (see online supplemental appendix 2). It was piloted prior to the interviews, and then refined iteratively as the study progressed, to take account of emerging findings. Interviewees included surgeons and other staff present at the first GIRFT visit or knowledgeable about local improvement. Interviews lasted between 20 and 60 min and were audiorecorded for full transcription.

Qualitative data analysis

We analysed data thematically, within and across cases, combining inductive and deductive (informed by the aims of the GIRFT programme) approaches.23

Patient and public involvement

From the inception of the study, we worked with the NIHR CLAHRC North Thames lay Research Advisory Panel to refine the protocol and ensure that the proposed research appropriately reflected the priorities, experience, and preferences of patients. Through this, we identified a patient representative (RM) who agreed to join the study steering board. He has subsequently played a collaborative role in refining the research design, interpreting findings and disseminating results. The research reported here did not directly involve patients, so RM did not play a role in recruitment to the study.


Quantitative findings

Study population

The study population comprised 337 279 patients who underwent a hip replacement and 398 809 who received a knee replacement in 126 NHS Trusts between 1 April 2009 and 31 March 2018 (figures 2 and 3; see table 1 for patient characteristics).

Figure 2

Study population—total hip replacement patients. Patient-reported outcomes measures (PROMs) data collected from April 2009 for consenting patients only. GIRFT, Getting it Right First Time; NHS, National Health Service.

Figure 3

Study population—total knee replacement patients. Patient-reported outcomes measures (PROMs) data collected from April 2009 for consenting patients only. GIRFT, Getting it Right First Time; HES, Hospital Episode Statistics; NHS, National Health Service; NJR, National Joint Registry.

National trends

Process measures

Nationally, there were substantial improvements in process measures, often beginning before 2012, when GIRFT started (figures 4 and 5). Comparing 2009–2012 and 2015–2018, reductions varied from 29% fewer uncemented hips to a 58% reduction in knee arthroscopy (table 2). Reductions in procedures by low volume surgeons began prior to GIRFT for hips and knees. ORs for 2017/2018 vs 2012/2013 were 0.58 (95% CI 0.53 to 0.63) for hips and 0.77 (95% CI 0.72 to 0.83) for knees. Reductions in uncemented hips and knee arthroscopy also began prior to 2012, but the largest occurred as GIRFT progressed, with ORs for 2017/2018 vs 2012/2013 of 0.56 (95% CI 0.51 to 0.61) and 0.48 (95% CI 0.41 to 0.56), respectively.

Figure 4

Trends in process and outcome measures for primary hip replacements (2009/2010–2017/2018). OHKS, Oxford Hip/Knee Score.

Figure 5

Trends in process and outcome measures for primary knee replacements (2009/2010–2017/2018). OHKS, Oxford Hip/Knee Score.

Table 2

Unadjusted process and outcome measures across the study population at the beginning and end of the study period, 1 April 2009–31 March 2018

Outcome measures

Mean length of stay reduced by just over 1 day for both hip and knee patients between 2009–2012 and 2015–2018. The largest reductions began before GIRFT (figures 4 and 5). Mean differences between 2012/2013 and 2017/2018 were −0.90 days (95% CI −1.00 to −0.81) for hips and −0.74 days (95% CI −0.82 to −0.66) for knees. There was some evidence that 1-year knee revisions declined by 2017/2018, but little evidence of a change in hips. Postoperative quality of life and functional status improved, but there was little change in emergency readmissions (table 2 and figures 4 and 5).

Additional effect of initiating ‘deep dive’ involvement at individual trusts

GIRFT’s first visits to Trusts occurred between September 2013 and November 2017 with half occurring between February and July 2014 (table 3).

Table 3

Changes in process and outcome measures after first GIRFT visit by trust cohort*

Process measures

The additional effects of the visits on process measures, after controlling for national trends, differed between the earliest Trusts visited and the middle and later groups. In the earliest group, we observed reductions in procedures by low volume surgeons. Conversely, procedures by low volume surgeons increased in the middle and late groups, but use of uncemented hips reduced (table 3).

Outcome measures

Following visits, mean length of stay increased for the middle and late groups, but decreased in hip patients in the early group. However, estimated effects were small in comparison with national trends. We found limited evidence of an impact on other outcomes (table 3).

Economic analysis

The estimated cost to deliver the Trust visits was £491 420 (component 1). This is a ‘sunk’ cost, including £10 769 transport costs (total return trip mileage); £150 000 for datapacks; £111 916 GIRFT staff time; and £106 820 opportunity cost of Trust staff time attending the meetings.

Our national survey provided mixed reports about the implementation of recommendations so it was not possible to accurately estimate costs. Therefore, although Trusts incurred costs, these could not be included (component 2).

Although length of stay increased after the visits (p<0.05), this was not included in our economic analysis. Trusts receive the same payment for each inpatient stay, up to the tariff ‘trim point’ (currently 9 days). However, average length of stay remained below this, despite the observed increase, so there was no extra monetary cost. Nevertheless, increased length of stay does come with an opportunity cost, as the bed is not available for another patient. As there was no change in EQ-5D, used to calculate quality-adjusted life-years, we were unable to undertake a cost-effectiveness analysis. Instead, we conducted a cost–consequences analysis. We observed positive and negative changes after GIRFT visits (table 3) and additional costs partly offset savings (table 4). The overall impact of the statistically significant changes, for the limited number of measurable variables, not including length of stay, was a saving of £431 848 (component 3).

Table 4

Summary of GIRFT economic impact following first GIRFT visit by trust cohort (2019/2020 GBP)
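
The relationship between components 1 and 3 can be made explicit with simple arithmetic on the two totals reported in the text. This is a sketch, not a reproduction of table 4: it assumes both figures are on the same price basis, and it omits component 2 (costs incurred by Trusts) and the opportunity cost of longer stays, neither of which could be quantified. The names are hypothetical.

```python
# Totals quoted in the text (GBP).
VISIT_COST_GBP = 491_420        # component 1: cost of delivering the visits
MEASURED_SAVING_GBP = 431_848   # component 3: saving from statistically significant changes


def net_cost(visit_cost, measured_saving):
    """Net cost of the visits: positive means costs exceeded measured savings."""
    return visit_cost - measured_saving


# Component 2 (Trust implementation costs) is unknown and excluded,
# so the true net cost would be at least this large.
NET_COST_GBP = net_cost(VISIT_COST_GBP, MEASURED_SAVING_GBP)
```

On these two figures alone, the visits cost about £59 572 more than the measured savings, which is why the net cost is described as a conservative estimate.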

Case-study analysis

In this paper, our goal is to evaluate the effectiveness and cost-effectiveness of the GIRFT programme, focusing on processes and outcomes of care. We therefore restrict our reporting here to the qualitative data sources where the impact of GIRFT on patient care was explored: interviews conducted at our six case study sites. We conducted 50 interviews across six sites (online supplemental appendix 1). Interviewees described five types of impact, operating at three levels: individual, Trust and regional (figure 6). These ranged from changes to implant selection, to improved networking within Trusts and across regions. However, GIRFT particularly impacted ways of working at Trust level, for example, catalysing planned improvements.

Figure 6

Summary of impacts attributed to GIRFT orthopaedic visits at six case-study sites. GIRFT, Getting it Right First Time.

Interaction within and between the three levels of impact was key. For example, reducing low volume surgery depended on regional partners forming networks. Similarly, increasing the number of procedures conducted per day required within-organisation negotiation to access theatre time. Visits were most successful when GIRFT aligned with Trust priorities (eg, rationalising procurement at site 3). They were less effective where GIRFT measures were not a local priority because of more immediate demands, for example, financial pressures.

A range of factors, in addition to GIRFT, impacted practice. These included the concurrent roll out of other initiatives. For example, the Carter Review impacted procurement (sites 2 and 6), while reductions in arthroscopies were driven by the Payment by Results Assurance Framework (site 2).


This is the first independent evaluation of GIRFT. We found substantial improvements in orthopaedic care during the first 6 years of the programme, notably reductions in uncemented hip prostheses, knee arthroscopies and length of stay. However, these started before GIRFT. It was not possible to estimate the distinct contribution of the programme, because of other concurrent initiatives with common goals (eg, the Carter Review). We also estimated the additional impact of GIRFT visits. We found a mix of positive and negative effects, generally small compared to overall improvements and differing between the earliest and latest Trusts visited. It is important to note, though, that the eight measures we analysed in our quantitative analyses, and targeted by GIRFT, relate to direct patient care. Staff at our case study sites reported that the programme had had an impact, but the effects that they described related much more to ways of working at Trust level (eg, improved networking) than to direct patient care.

Our mixed-method approach has enabled us to provide a comprehensive and robust understanding of GIRFT, using quantitative and qualitative data to explore the impact of the programme from different perspectives. Previous similar evaluations tend to focus on quantitative analyses. Our comprehensive linked dataset allowed us to examine a range of measures, as well as estimating the specific contribution of Trust visits. The only variables with missing data were ethnicity (3.8% and 2.7% of THR and TKR patients, respectively), for which we included a ‘missing’ category, and Index of Multiple Deprivation. We excluded cases missing the latter from the complete-case analysis, but as these represented only 0.3% of the analysis sample, this is unlikely to have had a major impact on our findings. The case study analysis provided further insights from the perspective of Trust staff. However, we measured changes from 3 months postvisit, and although some Trusts made improvements within this window, it may not have been sufficient for others. A further limitation is that we could not examine other key target measures. Procurement data were only available for 2017–2019 and could not be compared with previous years. Although litigation data are available, our previous work24 demonstrates a significant lag from incident to resolution. Consequently, it would not be possible to determine whether changes were an impact of GIRFT or other policies (eg, Sign up to Safety). Other outcomes, such as 5-year and 10-year revisions, were beyond our time frame, as were impacts after 2018. We also could not capture costs incurred by Trusts, because activities were not consistently tracked. The increased net cost associated with the programme is therefore a conservative estimate. Finally, our qualitative interviews took place several months after the first GIRFT visits, creating a risk of recall bias.25

In 2020, the GIRFT team published an internal evaluation20 describing how the orthopaedic workstream had supported Trusts to release ‘financial opportunities’ of £696 million. Our findings differ for a number of reasons. First, the internal evaluation was limited to descriptive analyses of national trends between 2013/2014 and 2018/2019. This is broadly consistent with our national trend analysis, although we adjusted for casemix and included data from 2009 to explore changes prior to GIRFT. In contrast, the GIRFT team attributed all trend changes to the programme. Second, they limited their economic analysis to the impact on processes and outcomes, whereas we also examined the cost of the visits and costs incurred by Trusts. Finally, our diverse case study sites facilitated cross-case comparison, to create a detailed contextual picture, whereas the internal evaluation includes individual case studies which exemplify success. Early narrative reviews of GIRFT were published by the King’s Fund26 and NHS Providers.27 However, these are not formal evaluations.

Our finding of improvements in processes but less clear changes in outcomes is consistent with evidence that improvement initiatives generally have greater impact on processes of care than patient outcomes.28 Improvements observed before 2012 may be explained by GIRFT identifying existing best practice to share more widely. The additional impact of visits was mixed, with no consistent pattern across the cohorts. In some cases, performance worsened immediately afterwards. There may have been underlying differences between Trusts visited earlier and later, but visits were just one part of the programme and we are aware that other components of GIRFT, such as national reports, had impacted the care provided at case study sites. One further possible explanation is that later cohorts had made changes prior to their visit because of information they gleaned from Trusts visited earlier in the process. Although some of the Trusts were familiar with the overall recommendations being made by GIRFT, this could equally have been because they were also priorities for other national programmes being rolled out at the same time. Data collection at the case study sites took place before the quantitative analyses, so we did not know about the variations in individual process and outcome measures when the interviews took place, and participants were not directly asked about them. It would also have been a challenge to draw firm conclusions about the causes of these differences from just six sites. However, our findings reflect other literature illustrating the challenges for improvement programmes in outperforming the secular trend, including the possibility that the programmes themselves may be implicated in that trend.29

GIRFT is one of the largest improvement initiatives in the NHS. Our analysis demonstrates significant improvements in orthopaedic care, which began prior to GIRFT. Changes observed over the past 10 years are likely attributable to both GIRFT and other concurrent initiatives, but we cannot determine the relative contributions of these. The additional impact of visits was mixed. Given the substantial cost and expansion of the programme, ongoing monitoring and access to additional relevant data, including details of Trust activities, as well as early engagement with rigorous evaluation design (eg, stepped wedge approaches), are recommended to enhance the ability to hold GIRFT and other national improvement programmes to account.

Data availability statement

Data are available on reasonable request. In line with the data sharing agreements between University College London and NHS Digital/HQIP, aggregate small number suppressed outputs for the study period (1 April 2009–31 March 2018) are available on request from the corresponding author, AH (

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by NHS North West - Liverpool East Research Ethics Committee (16/NW/0654). Participants gave informed consent to participate in the study before taking part.


We thank Patricia Hallam for administrative support for the study. We thank the patients and staff of all the hospitals in England, Wales and Northern Ireland who have contributed data to the National Joint Registry. We are grateful to the Healthcare Quality Improvement Partnership (HQIP), the NJR Research Committee and staff at the NJR Centre for facilitating this work. The authors have conformed to the NJR’s standard protocol for data access and publication.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • HB and AH are joint first authors.

  • Contributors HB was principal investigator. HB and RR initiated the research. HB, AH, EP, NJF, SM and RR designed the evaluation. SJ, AH and RG created the linked dataset. RG and AH conducted the quantitative analyses; PM provided statistical advice. SJ and EP designed the Trust survey and analysed the resultant data. EP and SM led the economic analysis. SJ and JL were responsible for qualitative data collection; FA, JL, SJ, HB and NJF analysed the qualitative data. HB, AH, EP, FA and RR drafted the manuscript and SJ, RG, JL, RM, JM, PM, NJF and SM contributed to reviewing and revision. HB, AH, EP, FA, SJ, RG, JL, RM, JM, PM, NJF, SM and RR approved the final version. HB, AH, EP, FA, SJ, RG, JL, RM, JM, PM, NJF, SM and RR had full access to all the data in the study and accept responsibility to submit for publication. HB will act as guarantor.

  • Funding This report is independent research funded by the National Institute for Health Research ARC North Thames (award/grant number is not applicable).

  • Disclaimer The views expressed represent those of the authors and do not necessarily reflect those of the National Joint Registry Steering Committee or the Healthcare Quality Improvement Partnership (HQIP) who do not vouch for how the information is presented.

  • Competing interests All authors have completed the ICMJE uniform disclosure form at coi_disclosure.pdf and declare: HB, AH, EP, FA, SJ, RG, JL and RR are funded in full or in part by the National Institute for Health Research ARC North Thames. RR and NJF are NIHR Senior Investigators; no financial relationships with any organisations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.