Original research
Risk assessment models for venous thromboembolism in hospitalised adult patients: a systematic review
  1. Abdullah Pandor1,
  2. Michael Tonkins1,
  3. Steve Goodacre1,
  4. Katie Sworn1,
  5. Mark Clowes1,
  6. Xavier L Griffin2,
  7. Mark Holland3,
  8. Beverley J Hunt4,
  9. Kerstin de Wit5,
  10. Daniel Horner6
  1. ScHARR, The University of Sheffield, Sheffield, UK
  2. Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
  3. Department of Clinical and Biomedical Sciences, University of Bolton, Bolton, UK
  4. Department of Haematology, Guy's and St Thomas' NHS Foundation Trust, London, UK
  5. Department of Medicine, McMaster University, Hamilton, Ontario, Canada
  6. Emergency Department, Salford Royal NHS Foundation Trust, Salford, UK

Correspondence to Abdullah Pandor; a.pandor{at}sheffield.ac.uk

Abstract

Introduction Hospital-acquired thrombosis accounts for a large proportion of all venous thromboembolism (VTE), with significant morbidity and mortality. This subset of VTE can be reduced through accurate risk assessment and tailored pharmacological thromboprophylaxis. This systematic review aimed to determine the comparative accuracy of risk assessment models (RAMs) for predicting VTE in patients admitted to hospital.

Methods A systematic search was performed across five electronic databases (including MEDLINE, EMBASE and the Cochrane Library) from inception to February 2021. All primary validation studies were eligible if they examined the accuracy of a multivariable RAM (or scoring system) for predicting the risk of developing VTE in hospitalised inpatients. Two or more reviewers independently undertook study selection, data extraction and risk of bias assessments using PROBAST (Prediction model Risk Of Bias ASsessment Tool). We used narrative synthesis to summarise the findings.

Results Among 6355 records, we included 51 studies, comprising 24 unique validated RAMs. The majority of studies included hospital inpatients who required medical care (21 studies), were undergoing surgery (15 studies) or receiving care for trauma (4 studies). The most widely evaluated RAMs were the Caprini RAM (22 studies), Padua prediction score (16 studies), IMPROVE models (8 studies), the Geneva risk score (4 studies) and the Kucher score (4 studies). C-statistics varied markedly between studies and between models, with no one RAM performing obviously better than other models. Across all models, C-statistics were often weak (<0.7), sometimes good (0.7–0.8) and a few were excellent (>0.8). Similarly, estimates for sensitivity and specificity were highly variable. Sensitivity estimates ranged from 12.0% to 100% and specificity estimates ranged from 7.2% to 100%.

Conclusion Available data suggest that RAMs have generally weak predictive accuracy for VTE. There is insufficient evidence and too much heterogeneity to recommend the use of any particular RAM.

PROSPERO registration number Steve Goodacre, Abdullah Pandor, Katie Sworn, Daniel Horner, Mark Clowes. A systematic review of venous thromboembolism RAMs for hospital inpatients. PROSPERO 2020 CRD42020165778. Available from https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=165778

  • Vascular Medicine
  • Haematology
  • Anticoagulation
  • Quality in health care

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

Strengths and limitations of this study

  • This systematic review provides an up-to-date comprehensive review of risk assessment models for predicting venous thromboembolism in patients admitted to hospital.

  • The newly developed PROBAST (Prediction model Risk Of Bias ASsessment Tool) was used to evaluate the risk of bias and applicability of the available evidence.

  • Heterogeneity in the included studies (participants, inclusion criteria, clinical condition, outcome definition and measurement) and variable reporting of items precluded meta-analysis.

  • Limitations of the existing evidence and areas of future research are highlighted.

Introduction

Venous thromboembolism (VTE) is an important and life-threatening complication of hospitalisation and illness, and is associated with significant morbidity and mortality.1 2 Globally, an estimated 10 million VTE episodes are diagnosed each year; over half of these episodes are associated with hospital inpatient stays and result in significant loss of disability-adjusted life years.3 4 Consequently, there has been a substantial and sustained focus on VTE prevention over the last three decades, with good evidence indicating a reduction in morbidity with primary thromboprophylaxis in hospitalised patients.5–8 Despite this evidence, thromboprophylaxis remains either underused or inappropriately applied.9

Risk assessment models (RAMs) have been developed to help stratify the risk of VTE among hospitalised patients.10 These models use clinical information from the patient’s history and examination to identify those with an increased risk of developing VTE who are most likely to benefit from pharmacological prophylaxis. Inappropriate use of VTE prophylaxis may not reduce VTE rates and may cause unnecessary harm.11 While RAMs could improve the ratio of benefit to risk and benefit to cost, it is unclear which VTE RAM should be applied to guide decision-making for prophylaxis in clinical practice and thereby optimise patient care.

The current review extends and updates three broadly overlapping existing reviews.10 12 13 While these reviews identified the use of various (derived and validated) RAMs for VTE in hospitalised patients, they did not find any evidence to suggest which RAM was superior. The aim of this systematic review was to identify primary validation studies (as derivation studies may give an overoptimistic assessment of model performance measures) and determine the accuracy of individual RAMs for predicting the risk of developing VTE in hospital inpatients.

Methods

A systematic review was undertaken in accordance with the general principles recommended in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.14 This review was part of a larger project on VTE RAMs for hospital inpatients15 and was registered on the International Prospective Register of Systematic Reviews (PROSPERO) database (CRD42020165778).

Eligibility criteria

We sought studies evaluating RAMs which could be applied to a general inpatient population (medical, surgical or trauma) rather than disease-specific models. All primary validation studies that evaluated the accuracy (eg, sensitivity, specificity, C-statistic) of a multivariable RAM (or scoring system) for predicting the risk of developing VTE were eligible for inclusion. We selected studies that included validation of the model in a group of patients who were not involved in model derivation. This involved either splitting the study cohort (internal validation) or using a new cohort (external validation). The study could have reported derivation of the model but we only used the validation data to estimate accuracy. The study population consisted of hospital inpatients, including those who required medical care, underwent any surgery (excluding day surgery) or received care following an injury. Studies that primarily focused on children (aged under 16 years), women admitted to hospital for pregnancy-related reasons or patients admitted to a level 2 or above critical care environment (eg, patients requiring more detailed observation or intervention including support for a single failing organ system or postoperative care and those ‘stepping down’ from higher levels of care) were excluded. These patient groups have VTE risk profiles that differ markedly from the general inpatient population, making the use of a generic model inappropriate.

Data sources and searches

Potentially relevant studies were identified through searches of five electronic databases including MEDLINE (with MEDLINE In-process and Epub Ahead of Print), EMBASE and the Cochrane Library. The search strategy used free text and thesaurus terms and combined synonyms relating to the condition (eg, VTE in medical inpatients) with risk prediction modelling terms. No language restrictions were used. However, as the current review updated three previous systematic reviews,10 12 13 searches were limited by date from 2017 (last search date from earlier reviews)10 to February 2021. Searches were supplemented by hand-searching the reference lists of all relevant studies (including existing systematic reviews); forward citation searching of included studies; contacting key experts in the field; and undertaking targeted searches of the World Wide Web using the Google search engine. Further details on the search strategy can be found in online supplemental appendix S1.

Study selection

All titles were examined for inclusion by one reviewer (KS) and any citations that clearly did not meet the inclusion criteria (eg, non-human, unrelated to VTE inpatients) were excluded. All abstracts and full-text articles were then examined independently by two reviewers (KS and AP). Any disagreements in the selection process were resolved through discussion or if necessary, arbitration by a third reviewer (SG) and included by consensus.

Data extraction and quality assessment

Data relating to study design, methodological quality and outcomes were extracted by one reviewer (KS) into a standardised data extraction form and independently checked for accuracy by a second (AP or MT). Any discrepancies were resolved through discussion to achieve agreement. Where differences were unresolved, a third reviewer’s opinion was sought (SG). Where multiple publications of the same study were identified, data were extracted and reported as a single study.

The methodological quality of each included study was assessed using PROBAST (Prediction model Risk Of Bias ASsessment Tool).16 17 This instrument evaluates four key domains: patient selection, predictors, outcome and analysis. Each domain is assessed in terms of risk of bias and the concern regarding applicability to the review (first three domains only). Each domain includes a number of signalling questions that guide the domain-level judgement of whether a study is at high, low or unclear risk of bias (unclear being used when the publication provides insufficient data to answer the corresponding question) and help to judge applicability concerns. An overall risk of bias for each individual study was defined as low risk when all domains were judged as low; and high risk of bias when one or more domains were considered as high. Studies were assigned an unclear risk of bias if one or more domains were unclear and all other domains were low.
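The rule for combining domain-level judgements into an overall rating can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the label strings and the dictionary keys are assumptions made for the example and are not part of PROBAST itself.

```python
from typing import Mapping

# Judgement labels; the exact strings are an assumption of this sketch.
LOW, HIGH, UNCLEAR = "low", "high", "unclear"


def overall_risk_of_bias(domains: Mapping[str, str]) -> str:
    """Combine domain-level judgements into one study-level rating."""
    judgements = list(domains.values())
    if not judgements:
        raise ValueError("at least one domain judgement is required")
    unknown = set(judgements) - {LOW, HIGH, UNCLEAR}
    if unknown:
        raise ValueError(f"unrecognised judgement(s): {sorted(unknown)}")
    # High risk of bias when one or more domains are judged high.
    if HIGH in judgements:
        return HIGH
    # Unclear when one or more domains are unclear and all others are low.
    if UNCLEAR in judgements:
        return UNCLEAR
    # Low risk of bias only when every domain is judged low.
    return LOW


# Hypothetical study: three domains low, one unclear.
example = {
    "patient selection": LOW,
    "predictors": LOW,
    "outcome": UNCLEAR,
    "analysis": LOW,
}
print(overall_risk_of_bias(example))  # unclear
```

Checking for a high judgement first means that a study with both high and unclear domains is rated high, which follows from the definition of high risk as one or more high domains.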

Data synthesis and analysis

We were unable to perform meta-analysis due to significant levels of heterogeneity between studies (participants, inclusion criteria, clinical condition) and variable reporting of items. As a result, a prespecified narrative synthesis approach18 19 was undertaken, with data being summarised in tables with accompanying narrative summaries that included a description of the included variables, statistical methods and performance measures (eg, sensitivity, specificity and C-statistic), where applicable. For the C-statistic, a value between 0.7 and 0.8 indicated good discrimination and a value >0.8 indicated excellent discrimination; values <0.7 were considered weak.20 All analyses were conducted using Microsoft Excel V.2010 (Microsoft Corporation, Redmond, Washington, USA).
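The C-statistic categories used in the synthesis can be expressed as a small helper. The sketch below is illustrative: the function name is hypothetical, and treating values of exactly 0.7 and 0.8 as good is an assumption, since the text only describes good as between 0.7 and 0.8.

```python
def classify_c_statistic(c: float) -> str:
    """Label a C-statistic as weak, good or excellent discrimination."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("a C-statistic must lie between 0 and 1")
    if c < 0.7:
        return "weak"
    # Assumption: the boundary values 0.7 and 0.8 are both counted as good.
    if c <= 0.8:
        return "good"
    return "excellent"


for value in (0.62, 0.74, 0.85):
    print(value, classify_c_statistic(value))
```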

Patient and public involvement

Patients and the public were not involved in the design or conduct of this systematic review.

Results

Study flow

Figure 1 summarises the process of identifying and selecting relevant literature. Of the 6355 citations identified, 51 studies investigating 24 unique RAMs met the inclusion criteria. The majority of the articles were excluded primarily for not using a RAM for predicting the risk of developing VTE, having no useable or relevant outcome data or an inappropriate study design (eg, derivation study, reviews, commentaries or editorials). A full list of excluded studies with reasons for exclusion is provided in online supplemental appendix S2.

Figure 1

Study flowchart. RAM, risk assessment model; VTE, venous thromboembolism.

Study and patient characteristics

The design and participant characteristics of the 51 included studies21–71 are summarised in table 1. All studies were published between 2003 and 2020 and were undertaken in North America (n=24),23 25 33–40 43 47 48 52–59 65 68 69 Asia (n=13),29 30 42 44–46 60–63 67 70 71 Europe (n=9),22 24 26–28 31 49 51 66 the Middle East (n=2),21 64 South America (n=1),32 Australia (n=1)41 and one study was intercontinental.50 Sample sizes in the 37 observational cohort studies ranged from 70 patients40 to 1 099 093 patients;43 of these studies, 11 were prospective21 22 24 28 29 32 44 51 52 56 64 (5 of which were multicentre) and 26 were retrospective23 25–27 33 34 36–41 43 46 49 50 53–55 58 59 62 63 65 68 69 (16 of which were multicentre). Sample sizes in the 14 case–control studies30 31 35 42 45 47 48 57 60 61 66 67 70 71 (4 of which were multicentre) ranged from 148 patients61 to 19 217 patients.57

Table 1

Study and population characteristics

The vast majority of studies evaluated VTE risk assessment in hospital inpatients who required medical care (n=21),24 26–28 31 32 36 37 45 47 49–51 57 58 61 66 67 69–71 were undergoing surgery (n=15)23 25 33 35 38 40 43 46 48 52 56 59 63 65 68 or were a mixed medical and surgical cohort (n=4).22 29 30 34 The remaining studies focused on patients receiving care for trauma (n=4),39 41 55 62 cancer (n=4),21 42 54 60 stroke (n=1),44 burn injuries (n=1)53 and sepsis (n=1).64 The mean age ranged from 45 years39 to 76 years50 (not reported in 29 studies)22–25 30 31 33–35 38 40–45 47 48 52 55–58 61 62 65 66 70 71 and the proportion of female subjects ranged from 17% in one study40 to 81% in another59 (not reported in 12 studies).22 23 25 33 35 43 48 52 55 56 58 61

VTE definition and case ascertainment

The majority of studies (n=37)21 23 24 26–32 36–38 40–47 49–52 55–58 60 62–64 66 67 70 71 defined the VTE endpoint (deep vein thrombosis (DVT) and/or pulmonary embolism (PE)) as being objectively confirmed. Of the remainder, 3 studies34 54 69 had no objective confirmation of VTE and 11 studies22 25 33 35 39 48 53 59 61 65 68 did not report the methods for diagnosis confirmation. In terms of VTE risk period, just under half of the studies (n=23)21–26 28 35–38 40 43 44 47 49–52 56 57 59 69 used the RAMs to predict the occurrence of VTE within 3 months of the index hospitalisation. The remaining studies did not report the VTE risk period. The reported incidence of VTE ranged widely, from 0.3% in two studies32 68 to 27.9% in one study,46 depending on definition, study design and study participants (eg, medical, surgical or trauma).

RAMs

The studies included in this review evaluated 24 validated unique RAMs. The most widely evaluated models were the Caprini RAM (22 studies),21 23 29–33 35 36 38 40 42 43 45 46 48 49 59 60 63 70 71 Padua prediction score (16 studies),24 27 28 30 31 34 37 45 48 49 63 64 66 67 70 71 IMPROVE models (8 studies),27 28 31 37 47 49 50 57 the Geneva risk score (4 studies)26–28 31 and the Kucher score (4 studies).31 37 66 69 A summary of their associated characteristics and composite clinical variables is provided in online supplemental appendix S3.

Statistical methods

Statistical methods varied significantly between studies. Most studies reported the discrimination of the RAMs using a combination of the C-statistic and sensitivity or specificity. A minority reported calibration measures, such as the Hosmer-Lemeshow test.23 40 41 50
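For readers less familiar with the C-statistic, it can be read as the probability that a randomly chosen patient who developed VTE has a higher RAM score than a randomly chosen patient who did not, with ties counted as one half. The Python sketch below illustrates this definition by counting concordant pairs; the function name and the toy scores are hypothetical and are not drawn from any included study.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], events: Sequence[int]) -> float:
    """Concordance probability for a risk score (events coded 1 = VTE, 0 = no VTE)."""
    if len(scores) != len(events):
        raise ValueError("scores and events must have the same length")
    cases = [s for s, e in zip(scores, events) if e == 1]
    controls = [s for s, e in zip(scores, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one patient with and one without VTE")
    concordant = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                concordant += 1.0  # patient with VTE scored higher
            elif case == control:
                concordant += 0.5  # tied scores count as half
    return concordant / (len(cases) * len(controls))


# Hypothetical RAM scores and outcomes for six patients.
toy_scores = [1, 2, 2, 3, 4, 5]
toy_events = [0, 0, 1, 0, 1, 1]
print(round(c_statistic(toy_scores, toy_events), 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for illustration; rank-based formulas give the same value more efficiently for large cohorts.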

Risk of bias and applicability assessment

The overall methodological quality of the 51 included studies21–71 is summarised in table 2 and figure 2. The methodological quality of the included studies was variable, with most studies having high or unclear risk of bias in at least one item of the PROBAST tool. The main sources of potential bias were related to the following domains:

  1. Patient selection factors, such as retrospective data collection, incomplete patient enrolment or unclear criteria for patients receiving VTE prophylaxis.

  2. Predictor and outcome bias arising from inappropriate inclusion of predictors within RAMs, unclear methods of outcome definition, low event rates and missing predictor or outcome data.

  3. Analysis factors, such as small sample sizes, inappropriate handling of missing data and failure in reporting relevant performance measures such as calibration.

Figure 2

PROBAST (Prediction model Risk Of Bias ASsessment Tool) assessment summary graph—review authors’ judgements.

Table 2

Summary of each study’s risk of bias and applicability concern using the PROBAST (Prediction model Risk Of Bias ASsessment Tool) tool—review authors’ judgements

Assessment of applicability to the review question led to the majority of studies being classed either as high (n=35)21 22 29 30 32 34 35 38–49 52–55 59–68 70 71 or unclear (n=12)23 24 27 28 31 33 50 51 56–58 69 risk of inapplicability. These assessments were generally related to patient selection (highly selected study populations, eg, single pathologies, single site settings), predictors (inconsistency in definition, assessment or timing of predictors) and outcome determination.

Predictive performance of VTE RAMs (summary of results)

As there were a reasonable number of studies to compare, a summary of the C-statistics for studies involving medical, surgical and trauma patients respectively is presented in figure 3a–c, with the results grouped by RAM. Results of other hospital inpatients are presented in online supplemental appendix S4. C-statistics varied markedly between these studies and between models, with no RAM performing obviously better than other models. In studies evaluating a single model, C-statistics20 were sometimes weak (<0.7; 10 studies with 17 data points), often good (0.7–0.8; 17 studies with 20 data points) and a few were excellent (>0.8; 5 studies with 5 data points). There was marked heterogeneity between multiple studies evaluating the same model. Studies evaluating multiple (more than 3) models31 37 tended to report weak accuracy across all the models (C-statistic <0.7; 2 studies with 16 data points).

Figure 3

C-statistics by model for studies involving (a) medical, (b) surgical and (c) trauma inpatients. ACS NSQIP, American College of Surgeons National Surgical Quality Improvement Program; CI, confidence interval; DVT, deep vein thrombosis; NR, not reported; PE, pulmonary embolism; RAP, Risk Assessment Profile; TESS, Trauma Embolic Scoring System; VTE, venous thromboembolism.

Table 3 shows the sensitivity and specificity at various thresholds for studies involving medical, surgical and trauma patients respectively, with the results grouped by RAM. Interpretation was again limited by marked heterogeneity, which was exacerbated when different thresholds were reported by different studies evaluating the same model. Model accuracy was generally poor, with high sensitivity usually reflecting a threshold effect, as evidenced by corresponding low specificity (and vice versa).

Table 3

Sensitivity and specificity for studies involving medical, surgical and trauma inpatients
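The threshold effect described above can be illustrated with a short Python sketch that computes sensitivity and specificity for the same set of scores at several cut-offs. The function name and the data are hypothetical; patients scoring at or above the threshold are treated as high risk, which is an assumption of this example rather than the rule of any particular RAM.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], events: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is classed as high risk."""
    tp = fn = tn = fp = 0
    for score, event in zip(scores, events):
        high_risk = score >= threshold
        if event == 1 and high_risk:
            tp += 1
        elif event == 1:
            fn += 1
        elif high_risk:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without VTE")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and outcomes (1 = VTE) for eight patients.
demo_scores = [1, 2, 3, 3, 4, 5, 6, 7]
demo_events = [0, 0, 0, 1, 0, 1, 0, 1]

for cut_off in (2, 4, 7):
    sens, spec = sensitivity_specificity(demo_scores, demo_events, cut_off)
    print(f"threshold {cut_off}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data set, lowering the threshold raises sensitivity only at the expense of specificity, so two studies reporting the same model at different thresholds can appear to disagree even when the underlying discrimination is identical.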

Discussion

Summary of results

In this systematic review of 51 observational studies evaluating RAMs for predicting the risk of developing VTE in hospital inpatients, we found that VTE RAMs have generally weak predictive accuracy. The studies validating these models are heterogeneous and most have a high risk of bias. Lack of methodological clarity was common, leading to difficulty in assessing the applicability of the individual study results.

Interpretation of results

We were unable to undertake meta-analysis or statistical examination of the causes of the observed heterogeneity. Potential sources of heterogeneity include variation in study design, the study population, how RAMs are implemented, outcome definition and measurement, and the use of thromboprophylaxis. The latter point warrants further attention. Thromboprophylaxis was employed in about half (n=25) of the studies,21 22 24 26 28 30 34 36–38 40 42 44 46 49–52 54 57–59 64 70 71 with the proportion receiving thromboprophylaxis ranging from 3.8% in one study42 to 100% in two studies.46 50 It was not employed in 3 studies,32 61 63 and 23 studies23 25 27 29 31 33 35 39 41 43 45 47 48 53 55 56 60 62 65–69 did not report on thromboprophylaxis use. The use of thromboprophylaxis may lead to underestimation of predictive accuracy if a given RAM were to predict VTE events that were subsequently prevented by thromboprophylaxis. Limited reporting of thromboprophylaxis use precludes further analysis of its impact on the performance of the RAMs.
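The way thromboprophylaxis can mask the accuracy of a RAM can be shown with simple arithmetic. The Python sketch below is a deliberately simplified illustration: it assumes that every patient flagged as high risk receives prophylaxis, that no unflagged patient does, and that prophylaxis removes a fixed proportion of events in treated patients. The function name and all numbers are hypothetical.

```python
def observed_sensitivity(
    events_flagged: float, events_unflagged: float, relative_risk_reduction: float
) -> float:
    """Sensitivity seen in a study when flagged patients receive prophylaxis.

    events_flagged / events_unflagged: VTE events that would occur WITHOUT
    prophylaxis in patients the RAM flags / does not flag as high risk.
    relative_risk_reduction: assumed proportion of events prevented in treated patients.
    """
    if not 0.0 <= relative_risk_reduction <= 1.0:
        raise ValueError("relative_risk_reduction must lie between 0 and 1")
    remaining_flagged = events_flagged * (1.0 - relative_risk_reduction)
    total = remaining_flagged + events_unflagged
    if total == 0:
        raise ValueError("no observed events, sensitivity is undefined")
    return remaining_flagged / total


# Hypothetical cohort: 8 of 10 untreated events occur in flagged patients.
print(observed_sensitivity(8, 2, 0.0))            # no prophylaxis given
print(round(observed_sensitivity(8, 2, 0.5), 3))  # half of flagged events prevented
```

Under these assumptions the RAM identifies the same patients in both scenarios, yet the sensitivity observed in the study is lower when prophylaxis is given, because the events it correctly anticipated no longer occur.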

Comparison to the existing literature

The present review is the largest and most comprehensive systematic review in this field to date. It includes 18 recent studies26–31 33 42 48–50 60–63 66 67 70 published since the completion of the previous systematic reviews.10 12 13 These studies are consistent with the previous literature in that they report modest performance of the assessed RAMs, with limitations in methodology and reporting preventing further analysis. The conclusion of this review therefore concurs with previous systematic reviews: there is insufficient evidence to recommend one RAM over another.

Strengths and limitations

This systematic review has a number of strengths. The review was conducted with robust methodology in accordance with the PRISMA statement and the protocol was registered with the PROSPERO register. Clinical experts were involved throughout as checkers and to assess the validity and applicability of research during the review. We reported descriptive statistics to provide insight into the limited evidence base applicable to the subject matter, and the scientific concerns regarding validity of the data. However, there are a number of potential weaknesses. Decisions on study relevance, information gathering and validity were unblinded and could potentially have been influenced by pre-formed opinions, although masking is resource intensive and of uncertain benefit. The studies of risk prediction were a combination of prospective cohorts and retrospective health database registries. Both have significant limitations. Retrospective studies of health database registries may include large numbers of patients but may be limited by poor data quality and failure to accurately ascertain outcomes. Prospective cohorts may have better quality data but, with smaller numbers, may lack statistical power. The included studies demonstrated high levels of heterogeneity so we were unable to undertake any meta-analysis.

Implications for policy, practice and future research

Guidelines from the American College of Chest Physicians (ACCP)72 73 and the UK National Institute for Health and Care Excellence (NICE)10 suggest using a validated RAM to guide the decision on whether to prescribe thromboprophylaxis. This review identifies all relevant RAMs and their validation studies. The reported results are insufficient to recommend one RAM over another. A RAM with weak predictive accuracy may still be better than no RAM at all but it is unclear whether RAMs predict VTE risk better than unstructured clinical assessment. Further research is clearly needed but routine use of thromboprophylaxis may present an insurmountable barrier to generating accurate and precise estimates of the prognostic accuracy of RAMs. The evidence that thromboprophylaxis is effective means that it is unethical to withhold thromboprophylaxis when a significant risk of VTE is identified. This inevitably reduces the number of VTE events in any study and confounds the association between risk factors and VTE events. Further studies of RAM accuracy will add little to our review unless they can address this issue.

Alternative approaches therefore need to be considered. Decision-analytic modelling can use existing data to explore the trade-off between the benefits and harms of thromboprophylaxis and identify key uncertainties for future primary research. The data presented in our review show how well RAMs predict VTE but do not tell us the threshold score on the RAM at which thromboprophylaxis should be given to maximise prevention of VTE and minimise harm from bleeding. This may be a more important determinant of RAM effectiveness than predictive accuracy for VTE. Le et al74 suggested thromboprophylaxis is beneficial and cost-effective if a patient’s VTE risk exceeds 1%. Further work to improve RAMs to help stratify the risk of VTE in different types of hospitalised patients could focus on using decision-analytic modelling to compare the effects, harms and costs of giving thromboprophylaxis to patients with varying risk of VTE. This would allow determination of the risk threshold at which thromboprophylaxis provides optimal overall benefit.
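The idea of a risk threshold can be illustrated with a very crude expected-value calculation, far simpler than a full decision-analytic model. The Python sketch below assumes that prophylaxis reduces VTE risk by a fixed relative amount, adds a fixed absolute risk of major bleeding, and that the harms of a VTE and of a bleed can be placed on a common scale. The function name and every input value are hypothetical placeholders and are not taken from Le et al or any other study.

```python
def risk_threshold(
    relative_risk_reduction: float,
    excess_bleed_risk: float,
    harm_per_vte: float = 1.0,
    harm_per_bleed: float = 1.0,
) -> float:
    """Baseline VTE risk above which expected benefit exceeds expected harm.

    Expected benefit at baseline risk p: p * relative_risk_reduction * harm_per_vte.
    Expected harm: excess_bleed_risk * harm_per_bleed.
    The threshold is the value of p at which the two are equal.
    """
    if relative_risk_reduction <= 0 or harm_per_vte <= 0:
        raise ValueError("prophylaxis must have some benefit for a threshold to exist")
    return (excess_bleed_risk * harm_per_bleed) / (relative_risk_reduction * harm_per_vte)


# Hypothetical inputs: 50% relative reduction in VTE, 0.4% absolute excess
# bleeding risk, and equal weight given to a VTE and a major bleed.
threshold = risk_threshold(relative_risk_reduction=0.5, excess_bleed_risk=0.004)
print(f"treat when baseline VTE risk exceeds {threshold:.1%}")
```

A decision-analytic model would replace each of these placeholders with evidence-based estimates, incorporate costs and longer-term outcomes, and explore uncertainty around the resulting threshold.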

Findings from decision-analytic modelling would require validation through primary research. The limitations of undertaking accuracy studies in populations where thromboprophylaxis is routinely used mean that future research should focus on research that compares the effectiveness of different risk assessment approaches. Observational studies could draw on variation in practice to compare outcomes between different risk assessment methods. Alternatively, a controlled trial could compare risk assessment methods in low-risk patients where existing evidence (synthesised using decision-analytic modelling) suggests the benefits of thromboprophylaxis are uncertain.

Conclusions

We identified a number of validated RAMs for potential risk stratification of hospitalised inpatients. The available evidence is insufficient to recommend one over another.

Acknowledgments

The authors would like to thank all additional members of the core project group for NIHR HTA 127454 for input and commentary throughout the work. We are also indebted to Helen Shulver for assistance with logistics and administration.

Footnotes

  • Twitter @xlgriffin, @kerstindewit

  • Contributors AP coordinated the study. SG, DH, AP, XG, MH, BH and KW were responsible for conception, design and obtaining funding for the study. MC developed the search strategy, undertook searches and organised retrieval of papers. AP, KS, MT and SG were responsible for the acquisition, analysis and interpretation of data. SG, MT, DH, XG, MH, BH and KW helped interpret and provided a methodological, policy and clinical perspective on the data. AP, MT and SG were responsible for the drafting of this paper, although all authors provided comments on the drafts, read and approved the final version. AP is the guarantor for the paper.

  • Funding This study was funded by the United Kingdom National Institute for Health Research Health Technology Assessment Programme (project number 127454). The views expressed in this report are those of the authors and not necessarily those of the NIHR HTA Programme. Any errors are the responsibility of the authors. The funders had no role in the study design, in the collection, analysis and interpretation of data; in the writing of the manuscript; and in the decision to submit the manuscript for publication.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.