Deriving literature-based benchmarks for surgical complications in high-income countries: a protocol for a systematic review and meta-analysis
  1. Mary E Brindle1,
  2. Derek J Roberts2,
  3. Oluwatomilayo Daodu1,
  4. Alex Bernard Haynes3,4,
  5. Christy Cauley3,
  6. Elijah Dixon5,
  7. Claude La Flamme6,
  8. Paul Bain7,
  9. William Berry4
  1. 1 Department of Surgery, University of Calgary, Calgary, Canada
  2. 2 Departments of Surgery and Community Health Sciences, University of Calgary and the Foothills Medical Centre, Calgary, Canada
  3. 3 Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
  4. 4 Ariadne Labs, Boston, Massachusetts, USA
  5. 5 Department of Surgery, University of Calgary and the Foothills Medical Centre, Calgary, Canada
  6. 6 Sunnybrook Health Sciences Centre, Toronto, Canada
  7. 7 Department of Countway Library, Harvard Medical School, Boston, Massachusetts, USA
  1. Correspondence to Dr Mary E Brindle; maryebrindle{at}


Introduction To improve surgical safety, health systems must identify preventable adverse outcomes and measure changes in these outcomes in response to quality improvement initiatives. This requires understanding of the scope and limitations of available population-level data. To derive literature-based summary estimates of benchmarks of care, we will systematically review and meta-analyse rates of postoperative complications associated with several common and/or high-risk operations performed in five high-income countries (HICs).

Methods and analysis An electronic search of PubMed, Embase, Web of Science, Cochrane Central, the NHS Economic Evaluations Database and Health Technology Assessment database will be performed to identify studies reviewing national surgical complication rates between 2000 and 2016. Two reviewers will screen titles, abstracts and full texts of potentially relevant studies to determine eligibility for inclusion in the systematic review. We will include English-language publications using data from health databases in the USA, Canada, the UK, Australia and New Zealand. We will include studies of patients who underwent hip or knee arthroplasty, appendectomy, cholecystectomy, oesophagectomy, abdominal aortic aneurysm repair, aortic valve replacement or coronary artery bypass graft. Outcomes will include mortality, length of hospital stay, pulmonary embolism, pneumonia, sepsis or septic shock, reoperation, surgical site infection, wound dehiscence/disruption, blood transfusion, bile duct injury, stroke and myocardial infarction. We will calculate summary estimates of cumulative incidence, incidence rate, prevalence and occurrence rate of complications using DerSimonian and Laird random effects models. Heterogeneity in these estimates will be examined using subgroup analyses and meta-regression. We will correlate findings within contemporary clinical databases.

Ethics and dissemination This study of secondary data does not require ethics approval. It will be presented internationally and published in the peer-reviewed literature. Results will inform a future quality improvement tool and provide benchmarks of surgical complication rates within HICs.

Trial registration International Prospective Register of Systematic Reviews (PROSPERO). Registration number CRD42016037519.

Strengths and limitations of this study

  • The aims and objectives of the study have been established by a group of international stakeholders in surgical quality and safety representing relevant surgical specialties.

  • Results of this meta-analysis need to be interpreted alongside those of landmark trials.

  • Different databases use different methods of defining and measuring exposures and outcomes; these differences must be taken into account when drawing comparisons.

  • In order to best balance feasibility with comprehensiveness, some complications and operations that are important will not be captured by the study.

  • Direct comparisons of complication rates between two countries may not be feasible given the variability of the data.


Healthcare-associated complications contribute significantly to the national burden of morbidity and mortality in countries across the world.1 The absolute impact of complications related to medical care is often underappreciated because it is frequently not measured or not fully reported.2 Complications occur because of both intrinsic risks and medical error. Although the data have not been thoroughly explored, the USA, the UK, Canada, Australia and New Zealand have all identified high rates of medical errors contributing to patient deaths and morbidity.3–7

Surgery plays a central role in the national burden of medical care-related morbidity.8 This is because surgery is a resource-intensive, often high-risk therapy within which there is ample opportunity for error. The majority of surgical complications are attributable to procedures that are either extremely common with low complication rates or less common and associated with a high rate of complications.9–11

To improve surgical safety nationally and internationally, health systems must identify preventable adverse outcomes. They must subsequently measure changes in these outcomes in response to quality improvement initiatives. These two requirements hinge on the availability of population-level data and an understanding of the scope and limitations of the data available.

Surgical outcomes can be compared between nations by focusing on a standard set of surgical procedures and a standard set of complication rates. The complications that are monitored and targeted for improvement must be clinically important, potentially preventable and reliably measured. Some important outcome measures like mortality and length of stay are objectively defined and reliably captured. Others, however, such as surgical site infection, are somewhat subjectively diagnosed, and present within a spectrum of severity.

Within the surgical literature, complication rates have been examined by multiple study designs, each of which has its benefits and limitations. Single-centre studies are commonly performed to examine rates of complications associated with specific procedures at individual hospitals or healthcare systems. These studies are typically question-driven and allow for collection of detailed and focused information about individual patients. The information generated from these studies can provide an understanding of the complex interaction between patient factors, individual treatments and outcomes. In addition, single-centre studies can be tailored to the setting where the study is performed and do not require a large, expensive data collection process.

Despite these advantages, when it comes to understanding surgical outcomes across an entire country, particularly over many years, studies from national databases offer information that smaller studies cannot provide. Population-level databases allow for greater generalisability by reporting large volumes of data derived from heterogeneous national patient populations. High-income countries (HICs) have multiple repositories in which healthcare data are captured and tracked over time. These repositories include administrative and other claims-based databases as well as clinical databases. Administrative databases rely on administrative coding and billings, while clinical databases typically employ a nurse or clinically trained researcher to extract data directly from patient charts or enter data prospectively using predefined criteria. The populations within these data sets may represent a large, diverse population or may capture subsets of patients (children, military patients, oncology patients) within the country. The common purpose of all of these data sets is to provide a consistent, ongoing source of information that can be used for benchmarking healthcare quality and for assessing the impact of national events such as newly available or modified surgical procedures, devices, drugs, clinical practice guidelines, federal health policies, and national quality and safety initiatives.

The USA, Canada, the UK, Australia and New Zealand each have databases that capture national surgical outcomes using somewhat different means of collection. Despite their scope, large databases invariably represent only a sampling of the population, and data collected can be affected by multiple sources of bias. The flaws inherent in the actual databases require special consideration independent of the study performed. Two databases in the USA that illustrate different limitations are the Healthcare Cost and Utilization Project (HCUP) and the National Surgical Quality Improvement Program (NSQIP). The HCUP is an administrative discharge-based database with a broad scope but one that is subject to the limitations of a discharge-defined unit measure and of misclassification bias related to the International Classification of Diseases (ICD) coding used by data collection technicians to identify diagnoses and procedures. Coding errors within HCUP are common and may relate to providers’ misunderstanding of codes, missing diagnoses for codes unrelated to reimbursement or misleading coding used to maximise reimbursement.12 Conversely, NSQIP is a clinical database based in the USA, parts of Canada and some individual international centres that uses rigorously trained nurse abstractors to extract data.12 Although the data from NSQIP are generally reliable, participation is limited to hospitals that are prepared to invest in this expensive database. Moreover, the scope of NSQIP is limited to prespecified data points collected on patients treated within nine subspecialties (general surgery, orthopaedics, gynaecology, plastic surgery, urology, cardiac surgery, neurosurgery, thoracic surgery and vascular surgery), with many participating centres only including some of these specialties.

Despite the fact that the USA, Canada, the UK, Australia and New Zealand all capture information on complications within national databases, there are limited studies that compare complications associated with common procedures within and across these countries. This systematic review and meta-analysis, therefore, aims to provide estimates of the current incidence, prevalence and occurrence rates of important postoperative complications after several common and/or high-risk surgical operations in five HICs (the USA, Canada, the UK, Australia and New Zealand). We will also examine complication estimates stratified by country and by patient population and procedure. Finally, during the process of obtaining these estimates, we will describe the current databases that provide estimates of population-level surgical outcomes in these countries, and explore the limitations of these database publications including correlation between published studies and contemporary information from national clinical databases.

Methods and analysis

Study design

This systematic review will be a contemporary synthesis of population-level intraoperative and postoperative complications in five HICs that use national databases to study complications after common and/or high-risk procedures. We will conduct the systematic review following the standards recommended in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses and the Meta-analysis of Observational Studies in Epidemiology statements.13 14

Systematic review clinical question

Our systematic review will attempt to answer the following clinical question, structured according to the Population, Intervention, Comparison, Outcomes and Design method of creating clinical questions:

  1. Population, Intervention and Comparison: Our population of interest will include patients who underwent one of nine common inpatient operations within the USA, Canada, the UK, Australia or New Zealand, any time including or after the year 2000. We will consider all patients regardless of age, gender or comorbidity. We have restricted our study population to those patients undergoing the most common inpatient operations performed in these five countries or those undergoing one of a subset of operations that are relatively common and/or at high risk of complications. We excluded caesarean section from our population given its unique population, risks and outcomes. To identify the most common non-caesarean section inpatient operations, we examined the published summary data provided by the major health databases of all five countries.15–19 Each country had a slightly different list of the most common inpatient operations. Despite this, four operations were listed among the 10 most common operations in at least four of the five countries (hip and knee replacement, appendectomy and cholecystectomy), and these procedures were therefore selected as the most common non-caesarean section inpatient operations.15–19 The targeted relatively common high-risk surgeries were identified by the US national quality and safety watchdog Leapfrog.9 These high-risk procedures include abdominal aortic aneurysm (AAA) repair, aortic valve replacement, pancreatectomy and oesophagectomy. We also included coronary artery bypass grafting, which was followed as a high-risk procedure by Leapfrog until 2012 and is one of the top 10 inpatient procedures in the USA and Australia. In total, we identified nine low-risk and high-risk procedures to examine in the systematic review (see table 1).

  2. Outcomes: The outcomes to be evaluated will include mortality, length of hospital stay, pulmonary embolism, pneumonia, surgical site infection, sepsis or septic shock, reoperation, wound dehiscence/disruption, common bile duct injury, blood transfusion, stroke and myocardial infarction. Mortality and length of stay were chosen as they are two important, objective outcome measures. To identify the other complications, we compared the list of serious complication rates collected by NSQIP with those collected in the National Inpatient Survey and agreed on a common but not comprehensive list of nine outcomes (see table 1). The final complication list represented targets that are important to healthcare decision makers and front-line physicians and was generated with the aim of balancing study feasibility with study comprehensiveness. In order to be pragmatic, outcome definitions will be based on those used by the various databases, and these definitions will be captured and reported. For example, for NSQIP, the definition of myocardial infarction is provided as ST elevation in two contiguous leads, a new left bundle branch, a new Q-wave in two or more contiguous leads or troponin elevation at least three times higher than the upper limit of the reference range in the setting of suspected infarction. The diagnosis of a myocardial infarction within administrative databases such as HCUP and Canadian Institute for Health Information is based on ICD coding of myocardial infarction assigned to individual patient discharges.

  3. Design: Only studies that use data from national medical databases will be considered for the analysis. This systematic review will consider both randomised controlled trials and cohort studies. We define cohort studies as studies that classify exposure at baseline and then follow patients forward to assess and calculate estimates of the frequency of occurrence of an outcome.20 This relatively new definition allows for single group studies.

Table 1

Target countries, databases, operations and complications for review

Search strategy

Three members with expertise in conducting systematic reviews (MEB, DJR, OD) and a medical librarian/information scientist (PB) developed an initial search strategy for the study. The search was developed with both Medical Subject Headings (MeSH) and title and abstract keywords to identify the specified nine procedures and nine complications as well as national health databases. The search was limited to English-language publications; those with data derived from national health databases of five large HICs (the USA, Canada, the UK, Australia and New Zealand) will be identified for inclusion (table 1). The search was subsequently revised after we conducted a pilot electronic database search of PubMed, Embase, Web of Science, Cochrane Central, the NHS Economic Evaluations Database and Health Technology Assessment database between 2000 and 2016, which returned 3111 citations (search strategy in appendix 1). A sample of 50 of these citations underwent pilot screening by a single author (MEB), with 16 undergoing pilot extraction. Based on this screening, the search strategy was modified to include several expanded search terms, and a revised search performed on 13 June 2016 identified 3401 citations. The search has since been expanded to include readmission rates. These searches will be supplemented by searching the bibliographies of all included articles. The complete search strategy is attached (appendix 1).

Study selection

The results of the electronic medical database searches will be screened independently and in duplicate to identify studies for inclusion in the synthesis. Screening will first be performed of titles and abstracts and then of the full texts of potentially relevant articles. Articles will be included if (1) they are written in English; (2) they report data from 2000 onwards; (3) the population studied is from the USA, Canada, the UK, Australia or New Zealand; (4) the study reports at least one of the outcomes of interest associated with one of the nine target procedures described earlier; and (5) the data are obtained from a national health-related database. We will exclude non-original publications (reviews) and those reporting animal data. We will also exclude studies that do not report sufficient data for the calculation of the cumulative incidence, incidence rate, prevalence or occurrence rate of complications (see below for definitions of these outcomes). The interobserver agreement on inclusion and exclusion between the two reviewers will be reported using Cohen’s κ statistic. Disagreements will be resolved by consensus or third party review where necessary.
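
As an illustration of the agreement statistic named above, the following is a minimal Python sketch of Cohen's κ for two reviewers' include/exclude decisions. The function name and the example decisions are hypothetical and are not part of the protocol.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten citations.
reviewer_1 = ["include"] * 5 + ["exclude"] * 5
reviewer_2 = ["include"] * 4 + ["exclude"] * 6
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this made-up example the reviewers agree on 9 of 10 citations and chance agreement is 0.5, so κ is (0.9 − 0.5)/(1 − 0.5).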

Data extraction

Data on study design and outcomes will be extracted from all publications meeting inclusion criteria using a piloted data extraction form. This extraction will be done independently and in duplicate with results compared and re-evaluated in partnership. Data will be extracted from each study on: (1) publication details (first author, journal of publication, date of publication); (2) study design (time period of study, study location(s), database(s) used, type of study (cohort, case control)); (3) population demographics (age, gender, stratifications or risk-adjustments, procedure(s) performed from among the nine target procedures); and (4) outcomes (all reported estimates of mortality, readmission and length of stay will be collected as well as all reported estimates of cumulative incidence, incidence rate, prevalence and occurrence rates for target complications (mortality, length of stay, pulmonary embolism, pneumonia, sepsis or septic shock, reoperation, surgical site infection, wound dehiscence/disruption, blood transfusion, common bile duct injury, stroke and myocardial infarction)) and outcome definitions. If incidence, prevalence and occurrence rates are not provided, information sufficient to calculate these measures will be collected (extraction sheet in appendix 2).

Authors will be contacted both to clarify the eligibility of studies and to resolve missing or ambiguous information.

Risk of bias assessment

The risks of bias within studies and across studies will be examined using a new tool adapted from published guidelines and existing tools used to assess bias in prognostic studies.21 Rather than presenting a study score, the quality of the study will be described within a number of domains using a risk of bias component-based approach. These domains will include: (1) study participation (the study population sufficiently reflects the population of interest); (2) completeness of study data (minimal loss to follow-up or missing data); (3) exposure to procedure (the procedure is sufficiently defined and the exposure sufficiently measured); (4) outcome measure (the outcome is adequately defined and sufficiently measured); (5) confounding (confounding is controlled for when appropriate and sufficient details about stratification and adjustment are provided); and (6) analysis (the statistical analysis is appropriate to the study design and the nature of the data available). The tool developed to assess risk of bias includes specific criteria to guide the rating for each of these domains (see the risk of bias assessment tool in appendix 3).

Data synthesis

Data will be synthesised first using narrative synthesis methods. This will involve clustering studies based on procedure, outcomes, country and database(s) used. The study characteristics (including population, patient descriptions and exposures) as well as outcomes will be explored in a series of summary tables. Additional stratification based on study dates will be performed. Within subgroups (countries, procedures, outcomes), data will be presented both visually (timelines and graphs) and within tables and narrative text as appropriate to describe the outcome results obtained from databases at different time periods. Quantitative comparisons within subgroups will be restricted to obtainable time frames that are similar (within a 2-month range both at start and stop of data collection period).
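
The time-frame restriction above can be sketched as a simple check. This is a minimal Python illustration only: the function names are hypothetical, and counting the 2-month range in whole calendar months is an assumption, since the protocol does not specify how the range is measured.

```python
from datetime import date


def months_apart(d1, d2):
    """Absolute difference between two dates in whole calendar months (assumed unit)."""
    return abs((d1.year - d2.year) * 12 + (d1.month - d2.month))


def comparable_time_frames(frame_a, frame_b, max_months=2):
    """True if both the start and the stop dates lie within max_months of each other."""
    (start_a, stop_a), (start_b, stop_b) = frame_a, frame_b
    return (months_apart(start_a, start_b) <= max_months
            and months_apart(stop_a, stop_b) <= max_months)


# Hypothetical data collection periods (start, stop) for three studies.
study_a = (date(2005, 1, 1), date(2006, 12, 31))
study_b = (date(2005, 3, 1), date(2007, 1, 31))
study_c = (date(2005, 6, 1), date(2006, 12, 31))
```

Under this reading, the first two hypothetical studies would be eligible for quantitative comparison, while the third starts too late relative to the first.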

When different studies report outcomes for the same population within the same time frame using different databases, differences in population selection, data collection and data handling will be compared between these databases. How these differences impact apparent complication rates will be explored. Understanding the differences between how data are captured and reported between databases and the implications of direct comparisons between countries will require sensitivity and contextual appreciation. Health system experts familiar with differing health systems and databases will contribute to interpretation and framing.

Results that are a synthesis of multiple countries will be reported descriptively including a breakdown of participating sites. If the majority of participants (>80%) are from a single country, these results will be considered as representative of that country within the comparative subgroup analyses. If the proportion from any single country is less than 80% or country-specific breakdown is unavailable, these results will be excluded from the comparative analyses.
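
The attribution rule for multi-country results can be expressed as a short decision function. The sketch below is a hypothetical Python illustration; the function name and participant counts are invented, and treating a share of exactly 80% as not meeting the threshold is an assumption, because the text above leaves that boundary case open.

```python
def attribute_country(participants_by_country, threshold=0.80):
    """Return the country a multi-country result is attributed to, or None.

    None means the result is excluded from the comparative subgroup analyses,
    either because no country-specific breakdown is available or because no
    single country contributes more than the threshold share of participants.
    """
    if not participants_by_country:
        return None  # country-specific breakdown unavailable
    total = sum(participants_by_country.values())
    if total <= 0:
        return None
    country, count = max(participants_by_country.items(), key=lambda kv: kv[1])
    # Assumption: the share must strictly exceed the threshold (>80%).
    return country if count / total > threshold else None


# Hypothetical participant counts.
mostly_usa = {"USA": 900, "Canada": 100}
mixed = {"USA": 700, "Canada": 300}
```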

Statistical analyses

Complications will be summarised across countries and publications when feasible using Mantel-Haenszel weighted DerSimonian and Laird random effects models. Standard errors for estimates of proportions will be determined using the binomial distribution. Prespecified subgroup analyses will be conducted based on location of surgery and the time of study, the procedure and study population of interest, and the outcome of interest.
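
To make the pooling step concrete, the following is a minimal Python sketch of a DerSimonian and Laird random effects summary of proportions with binomial standard errors. It is an illustration under stated assumptions rather than the planned analysis (which will be run in Stata): it uses plain inverse-variance weights on untransformed proportions instead of Mantel-Haenszel weighting, it does not handle studies with zero events, and the function name and inputs are hypothetical.

```python
import math


def pool_proportions_dl(events, totals):
    """Pool raw proportions with a DerSimonian and Laird random effects model.

    Assumption: inverse-variance weights on untransformed proportions,
    with binomial variances p(1 - p)/n.
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need matching inputs from at least two studies")
    props = [e / n for e, n in zip(events, totals)]
    if any(p <= 0.0 or p >= 1.0 for p in props):
        # A zero variance would give an infinite weight; a continuity
        # correction or a transformation would be needed (not shown here).
        raise ValueError("proportions of 0 or 1 are not handled in this sketch")
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed effect step: inverse-variance weights and Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * p for wi, p in zip(w, props)) / sum(w)
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, props))
    df = len(props) - 1

    # Method-of-moments estimate of between-study variance, truncated at 0.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random effects step: weights now include the between-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * p for wi, p in zip(w_star, props)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q,
            "ci95": (pooled - 1.96 * se, pooled + 1.96 * se)}


# Hypothetical complication counts from two studies of the same procedure.
result = pool_proportions_dl(events=[5, 30], totals=[100, 100])
```

When the study proportions are identical, the between-study variance estimate is zero and the random effects summary coincides with the fixed effect one.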

Different follow-up durations can be anticipated depending on the database and study (eg, 30 days, total hospitalisation or another defined period). When available, mortality, complication and readmission rates will be captured within these defined follow-up periods. Outcome analyses will be stratified according to follow-up duration. Mortality and readmission will be reported as an incidence rate and length of stay as an average (using mean and/or median as appropriate). Complications will be reported as cumulative incidence, incidence rate, prevalence and occurrence rate for each specific complication within the population studied:

Cumulative incidence=Number of new presentations of the complication during the hospitalisation/Total population at risk

where the total population at risk will be defined as the number of hospitalised patients who have undergone the target procedure.

Incidence rate=Number of new presentations of the complication during the hospitalisation/Total person-time at risk

where the person-time will be the sum of the total length of hospitalisation for all patients undergoing the target procedure.

Prevalence=Number of existing cases of the complication at a specific point in time/The total population of patients who have undergone the target procedure

Occurrence rate=Number of cases of the complication during hospitalisation/Total number of patients undergoing the target procedure during the same time frame.
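
The difference between the two incidence measures defined above lies in the denominator, which a small worked example can show. The Python sketch below uses hypothetical function names and invented numbers (30 new complications among 1000 patients who together spent 5000 days in hospital).

```python
def cumulative_incidence(new_cases, patients_at_risk):
    """New presentations during hospitalisation per patient at risk."""
    return new_cases / patients_at_risk


def incidence_rate(new_cases, person_days_at_risk):
    """New presentations during hospitalisation per patient-day at risk."""
    return new_cases / person_days_at_risk


# Hypothetical figures for one procedure in one database.
new_cases = 30
patients = 1000
total_hospital_days = 5000  # sum of length of stay over all patients

ci = cumulative_incidence(new_cases, patients)       # proportion of patients
ir = incidence_rate(new_cases, total_hospital_days)  # events per patient-day
```

Here the cumulative incidence is 3% of patients, while the incidence rate corresponds to 6 events per 1000 patient-days; the two cannot be compared directly without knowing the length of stay.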

Heterogeneity in summary estimates will be evaluated through calculation of the Q and I² statistics (I² = ((Q − df)/Q) × 100%, where df is the number of studies minus one). In the setting of heterogeneity, we will conduct further exploratory subgroup analyses and meta-regressions in order to determine whether time periods, a change in the method of performing a surgical procedure (eg, endovascular versus open abdominal aortic aneurysm repair) or study-level risks of bias may influence our summary estimates of complications.
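
As a worked instance of the I² formula, the Python sketch below converts a Q statistic and its degrees of freedom into a percentage. The function name is hypothetical, and setting negative values to zero is the usual convention rather than something stated in this protocol.

```python
def i_squared(q, df):
    """I² as a percentage: the share of variability beyond that expected by chance."""
    if q <= 0:
        return 0.0
    # Assumption: negative values (Q smaller than df) are truncated at 0.
    return max(0.0, 100.0 * (q - df) / q)


# Hypothetical example: five studies (df = 4) with Q = 20.
example_i2 = i_squared(q=20.0, df=4)
```

With Q = 20 and df = 4, I² = (20 − 4)/20 × 100 = 80%.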

Small study effects potentially due to publication bias will be assessed using Egger’s test and visual inspection of funnel plots. All statistical analyses will be performed using Stata V.12.0 (StataCorp, College Station, Texas, USA). Statistical significance will be defined by a two-tailed p value <0.05.
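
For readers unfamiliar with Egger's test, the following minimal Python sketch shows its core: an ordinary least squares regression of each study's standardised estimate (estimate divided by its standard error) on its precision (one divided by the standard error), where an intercept far from zero suggests funnel plot asymmetry. The function name and data are hypothetical, the sketch needs at least three studies, and the p value (from a t distribution with k − 2 degrees of freedom) is left to a statistics package.

```python
import math


def egger_regression(estimates, std_errors):
    """Egger's regression: intercept, slope and t statistic for the intercept."""
    k = len(estimates)
    if k != len(std_errors) or k < 3:
        raise ValueError("need matching inputs from at least three studies")
    x = [1 / se for se in std_errors]                    # precision
    y = [est / se for est, se in zip(estimates, std_errors)]  # standardised estimate
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance and standard error of the intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "t": t_stat, "df": k - 2}


# Hypothetical estimates and standard errors from four studies.
egger = egger_regression(estimates=[1.4, 0.75, 0.525, 0.52],
                         std_errors=[1.0, 0.5, 0.25, 0.2])
```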

Correlation between systematic review findings and contemporary national data

Outcomes within procedures that demonstrate considerable variability between countries or administrative databases will be identified for correlation within contemporary clinical databases (NSQIP and Dr Foster database). In addition, procedures with high rates of adverse outcomes will likewise be identified for similar correlation. These databases will be specifically interrogated for the identified outcomes within specific procedures within the most contemporary and comparable time frame, broken down by country.

Ethics and dissemination

Over the last 20 years, health systems have been made increasingly aware of the international burden of healthcare-related complications.3 7 22 23 Surgical complications contribute substantially to this burden. Our ability to measure complication rates associated with surgical procedures is improving with the increasing sophistication of population-based repositories of clinical data. With an increase in the quality and scope of the data comes the ability to benchmark results within and across nations. Currently, healthcare databases are used primarily for country-specific research and to provide benchmarks against which institutions within a single nation are compared.

Beyond institutional strategies, large-scale national and international interventions offer the potential for decreasing the burden of surgical complications on a global scale. In 2007, the surgical safety team at WHO was charged with developing an intervention to improve surgical outcomes globally. The WHO surgical safety checklist was created to address this international need.24 Although the initial results of this intervention were promising, the long-term impact has been less clear.25 26 In order to monitor and compare national surgical outcomes longitudinally, it is essential that we develop an improved understanding of the information available from national databases.

This systematic review aims to provide the first synthesis of the national rates of complications associated with common surgical procedures across five HICs. Measures of important complications will be compared across countries. The sources of data from which complication rates are generated will be examined to provide a comprehensive picture of both the rates of complications and the limitations of the different databases nations use to provide these estimates.

No ethics review is necessary for this paper as it deals only with secondary, published data and no attempts will be made to identify individuals. The results of this systematic review will be presented internationally and published in one or more peer-reviewed, scientific journals. Because several international stakeholders in surgical quality improvement have been involved from the outset of the study, we hope that our results will be of immediate interest and use to the surgical community, particularly to those focused on improving the quality of surgical care and on benchmarking.

In addition, this study will point to the limitations of the data that will need to be acknowledged when interpreting results. The results from this study will be used by this surgical safety research group to develop an implementation strategy aimed at HICs to diminish the burden of preventable surgical complications. Specifically, the results of the study will identify the sources of the greatest burden of potentially preventable surgical complications within and across nations that could be targeted by an intervention and the best sources of reliable longitudinal data to study the impact of this intervention. This process will be used to improve surgical care internationally in the future.




  • Contributors MEB outlined the primary aims of the study and helped develop the objectives and general approach to this protocol. She drafted the initial manuscript. DJR refined and developed the methodology of the study, including the provision of reference publications, and helped to develop the objectives. OD assumed a primary role in developing the objectives of the study as well as the inclusion and exclusion criteria. She identified the key databases and helped with protocol creation. CC provided feedback on different databases and helped to develop the methods of the protocol dealing with database analysis. ABH, WB, MEB, ED and CLF helped develop the aims of the study, provided feedback on early drafts and developed the objectives of the study. PB developed the focus of the literature searches with OD and MEB and performed a pilot search as well as a revised search. All authors have reviewed the protocol, aided in its revision and approve of it in its final form.

  • Competing interests None declared.

  • Patient consent This is a systematic review of publications dealing with national-level data. No individual patient information will be presented.

  • Provenance and peer review Not commissioned; externally peer reviewed.
