Original research
Methodology of economic evaluations in spine surgery: a systematic review and qualitative assessment
  1. Ruud Droeghaag1,2,
  2. Valérie N E Schuermans2,3,4,
  3. Sem M M Hermans1,2,
  4. Anouk Y J M Smeets3,4,
  5. Inge J M H Caelers2,4,
  6. Mickaël Hiligsmann2,5,
  7. Silvia Evers2,5,6,
  8. Wouter L W van Hemert1,
  9. Henk van Santbrink2,3,4
  1. Orthopedic Surgery, Zuyderland Medical Centre Heerlen, Heerlen, The Netherlands
  2. Caphri School of Public Health and Primary Care, Maastricht University, Maastricht, The Netherlands
  3. Neurosurgery, Zuyderland Medical Centre Heerlen, Heerlen, The Netherlands
  4. Neurosurgery, Maastricht Universitair Medisch Centrum+, Maastricht, The Netherlands
  5. Health Services Research, Maastricht University, Maastricht, The Netherlands
  6. Centre of Economic Evaluation & Machine Learning, Trimbos Institute, Netherlands Institute of Mental Health and Addiction, Utrecht, The Netherlands
Correspondence to Dr Ruud Droeghaag; r.droeghaag{at}zuyderland.nl

Abstract

Objectives The present study is a systematic review conducted as part of a methodological approach to develop evidence-based recommendations for economic evaluations in spine surgery. The aim of this systematic review is to evaluate the methodology and quality of currently available clinical cost-effectiveness studies in spine surgery.

Study design Systematic literature review.

Data sources PubMed, Web of Science, Embase, Cochrane, Cumulative Index to Nursing and Allied Health Literature, EconLit and The National Institute for Health Research Economic Evaluation Database were searched through 8 December 2022.

Eligibility criteria for selecting studies Studies were included if they met all of the following eligibility criteria: (1) the study concerned spine surgery, (2) the study investigated and reported on costs and effectiveness and (3) the study was a clinical study. Model-based studies were excluded.

Data extraction and synthesis The following data items were extracted and evaluated: pathology, number of participants, intervention(s), year, country, study design, time horizon, comparator(s), utility measurement, effectivity measurement, costs measured, perspective, main result and study quality.

Results 130 economic evaluations were included. Seventy-four of these studies were retrospective studies. The majority of the studies had a time horizon shorter than 2 years. Utility measures varied between the EuroQol 5 dimensions and variations of the Short-Form Health Survey. Effect measures varied widely between Visual Analogue Scale for pain, Neck Disability Index, Oswestry Disability Index, reoperation rates and adverse events. All studies included direct costs from a healthcare perspective. Indirect costs were included in 47 studies. Total Consensus Health Economic Criteria scores ranged from 2 to 18, with a mean score of 12.0 over all 130 studies.

Conclusions The comparability of economic evaluations in spine surgery is extremely low due to different study designs, follow-up duration and outcome measurements such as utility, effectiveness and costs. This illustrates the need for uniformity in conducting and reporting economic evaluations in spine surgery.

  • Health economics
  • Spine
  • Economics

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

  • This is the first study to systematically review the methodology and quality of economic evaluations in the entire field of spine surgery.

  • This systematic review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement and executed in accordance with the five-step approach on preparing a systematic review of economic evaluations by Van Mastrigt et al.

  • The broad search strategy yielded well over 17 000 unique studies, limiting the probability of missing relevant literature.

  • As the scope of this work was limited to assessment of the methodology and quality, we did not include results on cost-effectiveness outcomes reported in the studies.

  • Risk of bias was not deemed relevant as cost-effectiveness outcomes of the studies were not synthesised in this systematic review. Hence, risk of bias was not included.

Introduction

Economic evaluations are increasingly important given growing healthcare expenses. The number of people aged 60 years or older is expected to double by 2050.1 As older individuals are more likely to require spine surgery, the number of spine surgeries is also expected to increase. This, in turn, will result in higher healthcare-related costs.2–4 To limit the increase of spine surgery-related healthcare costs, scarce healthcare resources should be allocated efficiently. Therefore, the most cost-effective surgical technique should be identified and implemented.5 6 The value of economic evaluations is increasingly recognised, as reflected by the observed increase in studies mentioning costs and cost-effectiveness in the last decade.7 However, previous literature suggests that the variable quality and reporting of these economic evaluations limit their comparability and practicality. Both in cervical and lumbar spine surgery, systematic literature reviews have shown an apparent lack of uniformity.8 9 This is mainly caused by heterogeneity in study design, study data, hypotheses and conclusions. An important factor, for instance, is the heterogeneity in determining, calculating and reporting cost data.10 Recent systematic reviews in cervical and lumbar spine surgery show that clinical economic evaluations vary widely in the costs reported from the healthcare and societal perspectives, owing to differences in calculation methods of costs and/or charges, and differences in inclusion and exclusion criteria and baseline characteristics.8 9 The Panel on Cost-Effectiveness in Health and Medicine in the USA recommends performing cost-effectiveness studies from both the healthcare and the societal perspective.11 Nevertheless, only a minority of economic evaluations report on societal perspective costs.12

Kepler et al reviewed the existing economic evidence in spine surgery in 2012.13 This study highlights the lack of homogeneous reporting in terms of study design, study population, pathology studied, cost calculations and utility used. Moreover, they observed that only 12% of studies adhered to the recommendations of the US Second Panel on Cost-Effectiveness in Health and Medicine. Subsequently, the lack of standardised costing methodology in spine surgery research was also extensively described by Alvin et al and Chang et al.10 14 Both suggest several key aspects of cost calculation. First, the perspective of included costs should be considered. Second, the acquisition and definition of costs should be recognised. Payments, charges, costs and expected reimbursements are separate entities that should not be used synonymously.

General guidelines and recommendations regarding the proper conduct of economic evaluations are available, including the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist, the series of Modelling Good Research Practices published by the International Society for Pharmacoeconomics and Outcomes Research and the recommendations for Conduct, Methodological Practices and Reporting of Cost-effectiveness Analyses from the Second Panel on Cost-Effectiveness in Health and Medicine.7 11 15 A limitation of these general guidelines is that by their nature they do not incorporate disease-specific and topic-specific recommendations.12 As a supplement to the generally accepted methodological standards, it would thus be beneficial to have disease-specific guidelines that provide additional recommendations. The lack of homogeneity in economic evaluations regarding spine surgery impedes proper interpretation by healthcare professionals and financial decision-makers. Recommendations to conduct economic evaluations in this field, as a complement to the existing general guidelines, should improve overall research quality, comparability and interpretability.

The present study is a systematic review conducted as part of a methodological approach to develop evidence-based recommendations for economic evaluations in spine surgery.16 As a first step, it is important to have a complete, up-to-date overview of the current methodology and quality of cost-effectiveness research in spine surgery. This will enable us to identify the disparities in current practice and to develop adequate recommendations to address these gaps.

Therefore, the aim of this systematic review is to evaluate the methodology and quality of currently available clinical cost-effectiveness studies in spine surgery. Furthermore, the methodological quality of the included clinical studies is assessed according to the Consensus Health Economic Criteria (CHEC).17 It should be noted that the assessment of the cost-effectiveness of different surgical techniques is not within the scope of this work. We focus solely on the methodology and quality of the studies, as the ultimate goal is to develop a spine-surgery specific guideline for economic evaluations, using a modified Delphi approach. The outcomes of the Delphi approach and the final disease-specific guideline will be published separately.

Materials and methods

Review protocol

This systematic review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement and executed in accordance with the five-step approach on preparing a systematic review of economic evaluations by Van Mastrigt et al.18–22 The review protocol consisted of a research question, search strategy and eligibility criteria for assessing full-text studies. The complete study protocol has been published beforehand.16

Since this systematic review is conducted as part of a methodological approach to develop evidence-based recommendations for economic evaluations in spine surgery, a segment of the search is aimed at finding general and disease-specific guidelines on economic evaluations. These studies will be used in drafting statements for a Delphi analysis, and to identify relevant authors, but will not be included in this systematic review.

Search strategy and eligibility criteria

The literature search was conducted using several terms, including, but not limited to: ‘economic evaluation’, ‘cost-effectiveness’ and ‘spine surgery’. The full search strategy can be found in online supplemental appendix file 1. The following databases were searched: PubMed, Web of Science, Embase, Cochrane, Cumulative Index to Nursing and Allied Health Literature, EconLit and The National Institute for Health Research Economic Evaluation Database.

Our last search was conducted on 8 December 2022. Studies were included if they met all of the following eligibility criteria: (1) the study concerns spine surgery, (2) the study investigates and reports on costs and effectiveness and (3) the study is a clinical study using real-world data. In order to provide an up-to-date review of recent studies, we limited the inclusion to studies published in 2011 and onwards. The inclusion of older studies might lead to skewing of data, as methodology and reporting evolve over time.

This review focuses only on trial-based economic evaluations; model-based economic evaluations were thus excluded. This choice was made as the aim of the final guideline is to guide scientists in the conduct and reporting of clinical cost-effectiveness studies.

Study selection and data collection process

Duplicates were removed. Potentially eligible studies were screened based on title and abstract by two independent authors drawn from a group of four (VNES, RD, AYJMS, IJMHC). If necessary, consensus was reached between the two authors through discussion or with the assistance of a third reviewer (SMMH). Final selection of studies based on full-text assessment using the abovementioned eligibility criteria was performed by two authors (VNES, RD). Cross-referencing was performed during full-text assessment. Data were collected using a prospectively designed data collection sheet. Data were independently extracted by two authors (VNES, RD).

The following data items were considered: pathology, number of included participants, intervention(s) studied, year, country, study design, time horizon, comparator(s), utility measurement, effectivity measurement, costs measured, perspective, main result and incremental cost-effectiveness ratio (ICER). The complete data collection sheet can be found in online supplemental appendix file 2.

Quality assessment

Two authors (VNES, RD) independently performed quality assessments on the included studies using the CHEC list.17 The CHEC list was chosen as it is the recommended quality checklist for trial-based economic evaluations.22 23 A CHEC score of 19 out of 19 points indicates the highest study quality. The comprehensive description of the CHEC list and CHEC criteria is displayed in online supplemental appendix file 3. Full CHEC list scores can be found in online supplemental appendix file 4. Consensus was reached between both authors through discussion.
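
As an illustration of how such a total can be tallied, the following is a minimal sketch. It assumes that each of the 19 CHEC items is recorded as a yes/no judgement and that the total score is simply the number of items judged as met; the function name and the example ratings are hypothetical and are not taken from any included study.

```python
CHEC_ITEMS = 19  # maximum score mentioned in the text (19 out of 19 points)


def chec_total(ratings):
    """Return the total score: the number of items rated True (criterion met)."""
    if len(ratings) != CHEC_ITEMS:
        raise ValueError(f"expected {CHEC_ITEMS} ratings, got {len(ratings)}")
    return sum(1 for met in ratings if met)


# Hypothetical study that meets 12 of the 19 criteria
example_ratings = [True] * 12 + [False] * 7
example_score = chec_total(example_ratings)
```

Rejecting incomplete rating lists, rather than scoring them, avoids silently treating a missing judgement as an unmet criterion.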

Protocol registration

This review protocol has been published as part of a protocol to develop evidence-based recommendations for economic evaluations in spine surgery.16

Patient and public involvement

No patient was involved.

Results

Study selection

The systematic database search resulted in 27 036 studies. No additional studies were identified through manual searches of relevant reference lists. Results of the study selection process are summarised in figure 1. After removing duplicates, 17 746 studies were screened on title and abstract. In total, 510 studies were eligible for full-text analysis, resulting in exclusion of 380 studies; 139 studies were studies other than clinical trials, 72 studies did not report on an effectivity outcome measure, 50 studies used similar datasets as prior included studies or were duplicates, 47 studies were model-based, 21 studies were abstract-only, 11 studies did not concern spine surgery and 5 studies did not mention costs. Thirty-five studies were identified as useful guidelines on economic evaluations, but were not included in this systematic review. They will be used in a separate study to develop evidence-based recommendations for economic evaluations in spine surgery. Finally, a total of 130 clinical cost-effectiveness studies were included.24–153
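
The counts in this paragraph can be checked with a short tally. This is only a consistency sketch: it assumes that the 35 guideline papers set aside for the separate study are counted among the 380 full-text exclusions, which is an interpretation of the text rather than something it states explicitly. The dictionary labels are shorthand introduced here.

```python
# Counts as reported in the study selection paragraph
full_text_assessed = 510
excluded_by_reason = {
    "not a clinical trial": 139,
    "no effectivity outcome": 72,
    "overlapping dataset or duplicate": 50,
    "model-based": 47,
    "abstract only": 21,
    "not spine surgery": 11,
    "no costs mentioned": 5,
}
# Assumption: guideline papers are part of the 380 excluded full texts
guidelines_set_aside = 35

excluded_total = sum(excluded_by_reason.values()) + guidelines_set_aside
included = full_text_assessed - excluded_total
```

Under that assumption the tally reproduces the reported 380 exclusions and 130 included studies.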

Figure 1

PRISMA flowchart. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Study characteristics—clinical studies

Table 1 displays the study characteristics of the included clinical cost-effectiveness studies. The majority of the studies were conducted in the USA (n=74) and Europe (n=28). Of the 130 studies, 74 were retrospective analyses, 22 were trial-based models, 22 were prospective cohorts and 12 were randomised controlled trials (RCTs). Most studies concerned lumbosacral spine surgery (n=68); fewer studies concerned cervical (n=23) and spinal deformity surgery (n=22). The majority of the studies had a time horizon of 2 years or shorter (n=95). The EuroQol 5 dimensions (EQ-5D) was the most frequently used utility measurement (n=46), followed by variations of the Short-Form Health Survey (eg, SF-36, SF-6D) (n=43). Effect measures varied widely between Visual Analogue Scale for pain, Neck Disability Index (NDI), Oswestry Disability Index, reoperation rates and adverse events. All studies included direct costs from a healthcare perspective.

Table 1

Characteristics of the included studies

Costs of hospitalisation (n=81), procedure (n=70) and pharmaceuticals (n=50) were most often included as direct costs, followed by costs of outpatient visits (n=48) and diagnostics (n=46). In 26 studies, it was not specified which costs were evaluated. A limited number of studies included costs associated with revision surgery, reoperations, readmissions, complications and use of medical devices. Indirect costs were included in 47 studies, of which 41 studies adopted a societal perspective. The indirect costs evaluated mainly consisted of loss of productivity (n=44).

The majority of cost data were collected from hospital records (n=63) and Medicare records (n=49). Other cost data sources used were Diagnosis Related Group codes (n=35), Current Procedural Terminology (n=25) and national or regional databases (n=22). Indirect cost data of lost productivity were estimated based on average wages in most studies. Several studies used patient-reported wages. The complete data extraction sheet of the included studies is summarised in online supplemental appendix file 2.

Quality of identified studies

The methodological quality of the included clinical studies was assessed according to the CHEC.17 Study protocols for clinical economic evaluation studies were similarly assessed. The complete CHEC scores of included studies are summarised in online supplemental appendix file 4. Total CHEC scores ranged from 2 to 18, with a mean score of 12.0 over all 130 studies. The scores for RCTs ranged from 5 to 18, with a mean of 13.7. The scores for prospective cohort studies ranged from 7 to 15, with a mean of 12.2. The scores for retrospective studies ranged from 2 to 17, with a mean of 10.9. A comprehensive overview of total score per study is depicted in figure 2.

Figure 2

Total CHEC score per study. CHEC, Consensus Health Economic Criteria.

Several domains of the CHEC were scored in half or fewer of the included studies: study design (n=50), perspective (n=42), reporting of ICER (n=65), discounting (n=58), sensitivity analysis (n=57) and ethical considerations (n=4). Various domains were scored in over 80% of the studies: study population (n=124), research question (n=124), study outcome (n=115), measurement of outcome (n=113), value of outcome (n=107) and conclusion (n=114). A comprehensive overview of total score per domain is depicted in figure 3.

Figure 3

Total CHEC score per domain. CHEC, Consensus Health Economic Criteria; ICER, incremental cost-effectiveness ratio.
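
The per-domain counts reported for the CHEC can be expressed as shares of the 130 included studies. The sketch below only restates the reported counts as percentages; the variable names are introduced here for illustration.

```python
N_STUDIES = 130  # number of included studies

# Number of studies in which each CHEC domain was scored, as reported in the text
domain_counts = {
    "study design": 50,
    "perspective": 42,
    "reporting of ICER": 65,
    "discounting": 58,
    "sensitivity analysis": 57,
    "ethical considerations": 4,
    "study population": 124,
    "research question": 124,
    "study outcome": 115,
    "measurement of outcome": 113,
    "value of outcome": 107,
    "conclusion": 114,
}

# Share of studies per domain, in percent, rounded to one decimal place
domain_shares = {
    domain: round(100 * count / N_STUDIES, 1)
    for domain, count in domain_counts.items()
}
```

For example, `domain_shares["reporting of ICER"]` evaluates to 50.0 and `domain_shares["ethical considerations"]` to 3.1.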

Discussion

This study provides an extensive and up-to-date overview of the methodology and quality of cost-effectiveness research in spine surgery. Although this is not the first review that describes the current literature concerning cost-effectiveness in spine surgery, it adds new information to the existing literature, as it focuses on the methodology and quality of all clinical economic evaluations in spine surgery. The search conducted for this review was very broad, which resulted in a high number of hits over the different databases. Logically, this has led to a much higher number of included studies than in previous reviews on the topic.10 13 As can be deduced from table 1, the number of publications concerning cost-effectiveness in spine surgery appears to be increasing over recent years. This was similarly described by Husereau et al.7 No additional studies were found through cross-referencing, as all studies found through cross-referencing were already found in the initial search.

We encountered 12 systematic reviews on the subject while screening articles found in our search. Of these, six reviewed cost-effectiveness of minimally invasive spine surgery,6 9 154–157 two reviewed lumbar spine surgery for spondylolisthesis,158 159 two reviewed all cost-effectiveness studies in spine surgery in the USA,14 160 one reviewed cervical degenerative disease161 and one reviewed vertebroplasty and balloon kyphoplasty for osteoporotic vertebral compression fractures.162

The number of included studies in each review ranged from 5 to 58 studies, with a mean of 22 included studies per review. One study did not mention the total number of studies included.155

For quality assessment, four reviews used the Quality of Health Economic Studies,10 154 158 160 163 two reviews used the CHEC,9 17 159 one review used the CHEERS,7 14 one review used the Grading of Recommendations Assessment, Development and Evaluation156 164 and one review used a Cochrane Working Group Tool.157 165 The remaining three studies did not assess study quality.6 155 162 All preceding systematic reviews on cost-effectiveness in spine surgery conclude that the quality of the economic evaluations in the field is low to moderate. As a consequence, none of the reviews is able to draw firm conclusions based on the existing literature. Furthermore, the majority of these reviews conclude that economic evaluation studies of higher quality are required.

Among the included economic evaluations in this systematic review, there is a high degree of variation in study designs. The majority of studies are retrospective analyses, which are regarded as suboptimal for the evaluation of cost-effectiveness. There is also a large variation in the time horizon used among the studied economic evaluations. It is noteworthy that the majority of studies have a follow-up period of less than 2 years. In spine surgery, and in cost-effectiveness research specifically, an adequate follow-up duration of at least 2–4 years is recommended.166 Moreover, an adequate choice of both the intervention and comparator is essential for the conduct of a proper cost-utility analysis or cost-effectiveness analysis.167 The use of utility and effectivity measurement tools is highly inconsistent. One of the included studies even showed that the ICER differed strongly depending on the use of NDI or SF-36, when they were both evaluated.168 Another study also showed a significant difference in cost-effectiveness depending on the utility measurement used: with EQ-5D the intervention was cost-effective, whereas with SF-6D it was not.96 This shows that utility measurements cannot be used interchangeably, and consequently the ICERs cannot be compared between studies. Moreover, there is no uniformity in the costs, charges and/or reimbursements included in the studies. Fewer than half of the included studies adopted a societal perspective, despite this being strongly recommended.11 As seen in table 1, consistency in the use and reporting of direct and indirect costs is lacking. Moreover, cost data are collected from various sources in multiple ways. Notably, many studies did not specify the included cost data and data sources. Some of the studies also treated ‘indirect’ hospital costs (eg, washing of bed linen) as indirect costs, and thus wrongfully reported the study as being conducted from a societal perspective.169
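
To illustrate why the choice of utility instrument matters, the sketch below applies the standard definition of the ICER, the difference in costs divided by the difference in effects. All numbers, the two unnamed instruments and the willingness-to-pay threshold are hypothetical and chosen only for illustration; they are not taken from any included study.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ZeroDivisionError("no difference in effect; the ICER is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical costs for an intervention and its comparator
cost_new, cost_old = 12_000.0, 10_000.0

# Hypothetical QALYs (intervention, comparator) for the same patients,
# valued with two different utility instruments
qalys_instrument_a = (1.50, 1.25)
qalys_instrument_b = (1.375, 1.25)

icer_a = icer(cost_new, cost_old, *qalys_instrument_a)
icer_b = icer(cost_new, cost_old, *qalys_instrument_b)

# Hypothetical willingness-to-pay threshold per QALY
THRESHOLD = 10_000.0
cost_effective_a = icer_a <= THRESHOLD
cost_effective_b = icer_b <= THRESHOLD
```

With identical costs, halving the measured QALY gain doubles the ICER, so the same intervention falls below the hypothetical threshold with one instrument and above it with the other.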

The differences in the included cost, utility and/or effectivity measurements cause both the numerator and the denominator of the ICER to vary so strongly that no conclusions can be drawn concerning cost-effectiveness. Considering that back-related complaints are the leading cause of disability globally, it is astonishing to see that there is no consensus concerning cost-effectiveness in spine surgery.170 Even the high(er) quality economic evaluations do not provide sufficient insight at this moment, as the outcomes are not comparable to any other studies. Taking into account all of the study characteristics and outcome measurements, barely any outcomes of the 130 included studies concerning cost-effectiveness in spine surgery can be compared. This reveals the current absence of governance, or anarchy, in the conduct of cost-effectiveness research.

The quality of the studies based on the CHEC is low to moderate. Only a limited number of studies are of high quality. It is noticeable that several domains of the CHEC were scored in less than half of the included studies. Domains that are highly specific to cost-effectiveness studies were especially lacking. For example, discounting and sensitivity analysis were absent in many studies.171–173 Judging by the heterogeneity and quality of the current literature, general guidelines are apparently insufficiently adhered to or too unspecific. We advocate that studies of not only higher quality, but especially higher comparability, are required to determine the cost-effectiveness of interventions in spine surgery.
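
For readers less familiar with these two domains, the sketch below shows discounting in its standard form, where an amount incurred in year t is divided by (1 + r) to the power t, together with a simple one-way sensitivity analysis that varies the discount rate. The yearly costs and the rates are hypothetical and serve only as an illustration.

```python
def present_value(yearly_amounts, rate):
    """Discount a stream of yearly amounts (year 0 first) to its present value."""
    return sum(
        amount / (1 + rate) ** year
        for year, amount in enumerate(yearly_amounts)
    )


# Hypothetical costs: surgery in year 0, follow-up care in years 1 and 2
yearly_costs = [10_000.0, 1_000.0, 1_000.0]

# One-way sensitivity analysis: vary only the discount rate (illustrative values)
discount_rates = (0.0, 0.03, 0.05)
discounted_costs = {
    rate: present_value(yearly_costs, rate) for rate in discount_rates
}
```

A rate of zero returns the undiscounted total, and higher rates give lower present values; reporting the result across a range of rates shows how sensitive a conclusion is to this single assumption.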

As mentioned in the introduction section, the lack of homogeneity in economic evaluations regarding spine surgery impedes proper interpretation by healthcare professionals and financial decision-makers. This lack of homogeneity has been mentioned numerous times in previous systematic reviews, and is once again confirmed by the present review. To improve research quality, comparability and interpretability, disease-specific recommendations for the conduct of economic evaluations in the field of spine surgery, as a complement to existing general guidelines, are needed. These recommendations will be developed by a group of experts and validated in a Delphi process. By gathering a diverse group of experts that will reach consensus concerning methodology in cost-effectiveness research, variability of future studies can be reduced, thus increasing overall research quality.

The outcomes of this review serve as a basis to develop these evidence-based recommendations for economic evaluations in spine surgery.16

Limitations

This systematic review is subject to several constraints. We did not perform an in-depth review of the content of the included studies since the purpose of this study was to evaluate the methodological quality of the economic evaluations specifically. In addition, the great variety in interventions and comparators among the included studies impedes proper topic-specific in-depth reviews of the content of these studies. Assessing the risk of bias is of limited value in this review, as it concerns the robustness of the outcomes and conclusions of a study. Because we focused solely on the methodology and quality of the studies, we did not perform a risk of bias assessment. We suggest that topic-specific systematic reviews on the cost-effectiveness of the various interventions in spine surgery be performed separately.

This systematic review on cost-effectiveness studies in spine surgery was limited to clinical studies and previously published systematic reviews and therefore does not include model-based studies. Several model-based studies are based on clinical trial data; these trial-based model studies were thus included. The exclusion of model-based studies logically limits the conclusions of this work to clinical economic evaluation studies. We chose to exclude model studies as the aim of our guideline is to guide scientists in the conduct and reporting of clinical cost-effectiveness studies. We believe that the current limitations in cost-effectiveness research arise from study design and data collection in clinical studies, rather than modelling. Additionally, we are of the opinion that general guidelines for modelling are sufficient.

Conclusion

This systematic review shows that the number of economic evaluations in the field of spine surgery is increasing. However, the quality of these studies remains low to moderate. More importantly, the comparability of the studies remains extremely low due to different study designs, follow-up duration and outcome measurements such as utility, effectiveness and costs. As a result of these differences in methodology and reporting, current studies are not comparable. This illustrates the current anarchy in cost-effectiveness research and the consequent need for uniformity in the conduct and reporting of economic evaluations in spine surgery.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

Footnotes

  • RD and VNES contributed equally.

  • Contributors RD, VNES, WLWvH, SE, HvS contributed to conception. RD, VNES, MH, SE, HvS contributed to design. RD, VNES, SMMH, AYJMS, IJMHC contributed to data acquisition. RD, VNES contributed to analysis. RD, VNES, MH, WLWvH, SE, HvS contributed to interpretation. RD, VNES, SMMH, AYJMS, IJMHC, MH, WLWvH, SE, HvS contributed to substantial revision. MH, WLWvH, SE, HvS contributed to supervision. HvS was responsible as guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.