Risk assessment models for venous thromboembolism in pregnancy and in the puerperium: a systematic review

Objectives To assess the comparative accuracy of risk assessment models (RAMs) to identify women during pregnancy and the early postnatal period who are at increased risk of venous thromboembolism (VTE). Design Systematic review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Data sources MEDLINE, Embase, Cochrane Library and two research registers were searched until February 2021. Eligibility criteria All validation studies that examined the accuracy of a multivariable RAM (or scoring system) for predicting the risk of developing VTE in women who are pregnant or in the puerperium (within 6 weeks post-delivery). Data extraction and synthesis Two authors independently selected and extracted data. Risk of bias was appraised using PROBAST (Prediction model Risk Of Bias ASsessment Tool). Data were synthesised without meta-analysis. Results Seventeen studies, comprising 19 externally validated RAMs and 1 internally validated model, met the inclusion criteria. The most widely evaluated RAMs were the Royal College of Obstetricians and Gynaecologists guidelines (six studies), American College of Obstetricians and Gynecologists guidelines (two studies), Swedish Society of Obstetrics and Gynecology guidelines (two studies) and the Lyon score (two studies). In general, estimates of sensitivity and specificity were highly variable with sensitivity estimates ranging from 0% to 100% for RAMs that were applied to antepartum women to predict antepartum or postpartum VTE and 0% to 100% for RAMs applied postpartum to predict postpartum VTE. Specificity estimates were similarly diverse ranging from 28% to 98% and 5% to 100%, respectively. Conclusions Available data suggest that external validation studies have weak designs and limited generalisability, so estimates of prognostic accuracy are very uncertain. PROSPERO registration number CRD42020221094.


INTRODUCTION
The risks of venous thromboembolism (VTE) substantially increase during the postpartum period (6 weeks post-delivery) 5 and can be as high as 60-fold in some individuals compared with age-matched non-pregnant women. 6 Preventative treatment with low-dose anticoagulation (thromboprophylaxis) has the potential to reduce the risk of symptomatic and asymptomatic VTE in pregnancy and the postpartum period. 5 Consequently, various prominent international guidelines recommend targeted thromboprophylaxis for pregnant and puerperal women deemed to be at high risk of VTE. 5 7-13 However, these expert-based consensus guidelines vary substantially with regard to the threshold of risk (based on certain risk factors) and the timing, dose and duration of pharmacological thromboprophylaxis.
Risk assessment models (RAMs) have been developed to help stratify the risk of VTE during pregnancy and the early postnatal period. These models use clinical information from the patient's history and examination to identify those with an increased risk of developing VTE who are most likely to benefit from pharmacological thromboprophylaxis. Inappropriate use of VTE prophylaxis may not reduce VTE rates and may cause unnecessary harm, especially through bleeding and bruising. 14 While RAMs could improve the ratio of benefit to risk and benefit to cost, it is unclear which VTE RAMs are best applied to guide decision-making for thromboprophylaxis in clinical practice and thereby optimise patient care. The aim of this systematic review was to identify primary validation studies and determine the accuracy of individual RAMs that identify pregnant and postpartum women at increased risk of developing VTE who could be selected for thromboprophylaxis.

STRENGTHS AND LIMITATIONS OF THIS STUDY
⇒ A number of risk assessment models for venous thromboembolism (VTE) in pregnancy and puerperium have been developed using a variety of methods and based on a variety of predictor variables.
⇒ This systematic review provides a comprehensive review of risk assessment models for predicting the risk of developing VTE in women who are pregnant or in the puerperium (within 6 weeks post-delivery).
⇒ The newly developed PROBAST (Prediction model Risk Of Bias ASsessment Tool) was used to evaluate the risk of bias and applicability of the available evidence.
⇒ Heterogeneity in the included studies (participants, inclusion criteria, clinical condition, outcome definition and measurement) and variable reporting of items precluded meta-analysis.
⇒ Limitations of the existing evidence and areas of future research are highlighted.

METHODS
A systematic review was undertaken in accordance with the general principles recommended in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. 15 This review was part of a larger project on Thromboprophylaxis in pregnancy and after delivery 16 and was registered on the International Prospective Register of Systematic Reviews (PROSPERO) database.

Eligibility criteria
All studies evaluating the accuracy (eg, sensitivity, specificity, C-statistic) of a multivariable RAM (or scoring system) for predicting the risk of developing VTE were eligible for inclusion. We primarily sought and selected studies that included validation of the model in a group of patients who were not involved in the development of the prediction model. Although the included studies could have reported derivation of the model (for internal validation), we only used the external validation data to estimate accuracy, where appropriate. The study population of interest in our review consisted of pregnant and postpartum (within 6 weeks post-delivery) women who are at increased risk of developing a VTE and receiving care in hospital, community or primary care settings. Studies that focused on non-pregnant women were excluded as these patient groups have VTE risk profiles that differ markedly from the obstetric population.

Data sources and searches
Potentially relevant studies were identified through searches of several electronic databases and research registers. This included MEDLINE (OvidSP from 1946), Embase (OvidSP from 1974), the Cochrane Library (https://www.cochranelibrary.com from inception), ClinicalTrials.gov (US National Institutes of Health from 2000) and the International Clinical Trials Registry Platform (WHO from 1990). All searches were conducted from inception to February 2021. The search strategy used free text and thesaurus terms and combined synonyms relating to the condition (eg, VTE in pregnant and postpartum women) with risk prediction modelling terms. 17 No language or date restrictions were used. Searches were supplemented by hand-searching the reference lists of all relevant studies (including existing systematic reviews); forward citation searching of included studies; contacting key experts in the field; and undertaking targeted searches of the World Wide Web using the Google search engine. Further details on the search strategy can be found in the online supplemental appendix S1.

Study selection
All titles were examined for inclusion by one reviewer (GR) and any citations that clearly did not meet the inclusion criteria (eg, non-human, unrelated to VTE in pregnancy and the puerperium) were excluded; for quality assurance, a random subset of 20% was checked by a second reviewer (AP). All abstracts and full-text articles were then examined independently by two reviewers (GR and AP). Any disagreements in the selection process were resolved through discussion or, if necessary, arbitration by a third reviewer (JD) or the wider group (BJH, CN-P, SG), with inclusion decided by consensus.

Data extraction and quality assessment
For eligible studies, data relating to study design, methodological quality and outcomes were extracted by one reviewer (GR) into a standardised data extraction form and independently checked for accuracy by a second reviewer (AP). Any discrepancies were resolved through discussion, or if this was unsuccessful, a third reviewer's opinion was sought (JD). Where multiple publications of the same study were identified, data were extracted and reported as a single study.
The methodological quality of each included study was assessed using PROBAST (Prediction model Risk Of Bias ASsessment Tool). 18 19 This instrument includes four key domains: participants (eg, study design and patient selection), predictors (eg, differences in definition and measurement of the predictors), outcome (eg, differences related to the definition and outcome assessment) and statistical analysis (eg, sample size, choice of analysis method and handling of missing data). Each domain is assessed in terms of risk of bias and the concern regarding applicability to the review (first three domains only). Each domain includes several signalling questions to guide the overall domain-level judgement of bias and applicability concerns as high, low or unclear (the latter where the publication provides insufficient data to answer the corresponding question). The overall risk of bias for each individual study was defined as low when all domains were judged as low, and as high when one or more domains were judged as high. Studies were assigned an unclear risk of bias if one or more domains were unclear and all other domains were low.
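The overall judgement rule described above can be written as a short function. The following is a minimal sketch in Python, assuming domain-level judgements are recorded as the strings "low", "high" or "unclear"; the function name and the domain keys are illustrative and are not part of PROBAST itself.

```python
from typing import Mapping

# Illustrative keys for the four PROBAST domains (names are assumptions).
DOMAINS = ("participants", "predictors", "outcome", "analysis")
VALID = {"low", "high", "unclear"}


def overall_risk_of_bias(judgements: Mapping[str, str]) -> str:
    """Combine domain-level judgements into one study-level rating.

    high    : one or more domains judged high
    low     : all domains judged low
    unclear : one or more domains unclear and all others low
    """
    ratings = [judgements[d].lower() for d in DOMAINS]
    for rating in ratings:
        if rating not in VALID:
            raise ValueError(f"unexpected judgement: {rating!r}")
    if "high" in ratings:
        return "high"
    if all(rating == "low" for rating in ratings):
        return "low"
    return "unclear"


# Hypothetical study: one unclear domain, all others low.
example = {
    "participants": "low",
    "predictors": "unclear",
    "outcome": "low",
    "analysis": "low",
}
print(overall_risk_of_bias(example))
```

Note that a single high domain takes precedence over any number of unclear domains, which follows from the order of the checks.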

Data synthesis and analysis
Due to significant levels of heterogeneity between studies (study design, participants, inclusion criteria) and variable reporting of items, a meta-analysis was not considered possible. As a result, a prespecified narrative synthesis approach 20 21 was undertaken, with data being summarised in tables with accompanying narrative summaries that included a description of the included variables, statistical methods and performance measures (eg, sensitivity, specificity and C-statistic), where applicable. For the C-statistic, values between 0.7 and 0.8 were considered to indicate good discrimination, values >0.8 excellent discrimination and values <0.7 weak discrimination. 22 All analyses were conducted using Microsoft Excel 2010 (Microsoft Corporation, Redmond, Washington, USA).
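The C-statistic bands used in this review can be illustrated with a small helper. This is a minimal sketch in Python; the function name is hypothetical, and treating both boundary values (0.7 and 0.8) as "good" is an assumption, since the text does not state how exact boundary values were handled.

```python
def classify_c_statistic(c: float) -> str:
    """Label discrimination using the bands described in the text.

    < 0.7        : weak
    0.7 to 0.8   : good (boundaries included here by assumption)
    > 0.8        : excellent
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("C-statistic must lie between 0 and 1")
    if c < 0.7:
        return "weak"
    if c <= 0.8:
        return "good"
    return "excellent"


# Hypothetical values chosen to fall in each band.
for value in (0.65, 0.75, 0.85):
    print(value, classify_c_statistic(value))
```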

Patient and public involvement
Patients and the public were not involved in the design or conduct of this systematic review.

Study flow
Figure 1 summarises the process of identifying and selecting relevant literature. Of the 2268 citations identified, 16 studies 23-38 investigating 19 unique externally validated RAMs met the inclusion criteria. Only one of these studies 35 presented data on model development and external validation (this study used UK Clinical Practice Research Data linked to Hospital Episode Statistics to develop a risk prediction model and externally validated it using Swedish medical birth registry data). The remaining studies focused on external validation with no description of the initial derivation methodology. 23-34 36-38 Due to the lack of model derivation studies with external validation, we also identified and included one internal validation study for completeness (ie, prediction model development without external validation). 39 This study used a bootstrap validation approach to capture optimism in model performance 40 41 when applied to similar future patients. Most of the full-text articles (n=97) were excluded primarily based on not using an RAM for predicting the risk of developing VTE during pregnancy or the puerperium, having no useable or relevant outcome data or an inappropriate study design (eg, reviews, commentaries or study protocols). A full list of excluded studies with reasons for exclusion is provided in online supplemental appendix S2.

VTE definition and case ascertainment
Only a few studies 23 27 32 36 defined the VTE endpoint (deep vein thrombosis and/or pulmonary embolism) as being confirmed by objective testing. Of the remainder, 3 studies 35 37 39 had no objective confirmation of VTE and 10 studies 24-26 28-31 33 34 38 did not report the methods for diagnosis confirmation. Although 9 studies 23 24 27 29 32-34 36 39 did not report the VTE risk period, the majority of the remaining studies used the RAMs to predict the occurrence of VTE up to 3 months after delivery. 25 28 30 31 Despite differences in study design, study participants, definitions, criteria for the use of thromboprophylaxis and doses of low molecular weight heparin (LMWH), the reported overall incidence of VTE in pregnancy and the puerperium was <1.3%.

Risk of bias and applicability assessment
The overall methodological quality of the 17 included studies is summarised in table 2 and figure 2. The methodological quality of the included studies was variable, with most studies having high or unclear risk of bias in at least one item of the PROBAST. The main risk of bias limitations were related to patient selection factors (arising from retrospective data collection, 24 26 29 30 32 34 37-39 unclear exclusions/incomplete patient enrolment 24 26 27 31-34 36 38 39 or unclear criteria for patients receiving VTE prophylaxis); 23 30 35 predictor and outcome bias (due to a general lack of details on the definition 24-26 28-31 33 34 38 and methods of outcome determination 24 26 28-31 33 34 37-39 and whether all predictors were available at the model's intended time of use 23 24 29 31 32 34 36-39 or influenced by the outcome measurement); 23-28 30-39 and analysis factors (low event rates, 23-31 33-37 39 unclear handling of missing data 23-29 31-34 36-39 and failure to report relevant performance measures such as calibration and discrimination). 23-34 36-38 Assessment of applicability to the review question led to the majority of studies being classed as either unclear (n=13) 23 26-30 32 34-39 or high (n=4) 24 25 31 33 risk of inapplicability. These assessments were generally related to patient selection (highly selected study populations, for example, selected women at increased risk of VTE, caesarean delivery only, single disease pathologies, single site settings), predictors (inconsistency in definition, assessment or timing of predictors) and outcome determination.
Predictive performance of VTE RAMs (summary of results)
Tables 3 and 4 show the sensitivity and specificity of RAMs that were applied to antepartum women to predict antepartum or postpartum VTE, or applied postpartum to predict postpartum VTE, respectively, with the results grouped by RAM. However, meaningful comparison based on these estimates alone is difficult without considering the models' corresponding discrimination and calibration metrics, which were not universally reported. Only one external validation study considered model discrimination and calibration. In this study by Sultan et al, 35 their recalibrated novel risk prediction model (also known as the Maternity Clot Risk) provided good discrimination and was able to discriminate postpartum women with and without VTE in the external Swedish cohort with a C-statistic of 0.73 (95% CI: 0.71 to 0.75), and calibration, of observed and predicted VTE risk, close to ideal (calibration slope of 1.11 (95% CI: 1.01 to 1.20)). In the remaining studies, interpretation was further limited by marked heterogeneity, which was exacerbated when different thresholds were reported by different studies evaluating the same model. Model accuracy was generally poor, with high sensitivity usually reflecting a threshold effect, as indicated by corresponding low specificity values (and vice versa).
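The threshold effect described above can be illustrated with a small calculation. The following Python sketch uses entirely hypothetical scores and outcomes (they are not drawn from any included study) and an illustrative function name; it shows how lowering the score threshold for a RAM raises sensitivity at the expense of specificity.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], outcomes: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= threshold flags high risk.

    outcomes: 1 = VTE event, 0 = no event.
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        flagged = score >= threshold
        if outcome == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical RAM scores and outcomes for ten women (illustration only).
scores = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
outcomes = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]

for threshold in (2, 5):
    sens, spec = sensitivity_specificity(scores, outcomes, threshold)
    print(f"threshold >= {threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these made-up data, the lower threshold captures every event but flags most women without VTE, whereas the higher threshold reverses the trade-off. This is why sensitivity and specificity reported at a single threshold say little about a model without discrimination and calibration measures.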

Summary of results
This systematic review identified 19 externally validated RAMs (and 1 internally validated model) that aimed to predict the risk of VTE in pregnant and postpartum women who could be selected for thromboprophylaxis. Although various risk models (based on a variety of predictor variables) are being used, most of these lacked rigorous development and evaluation. The predictive accuracy of the RAMs was highly variable, and the substantial risk of bias concerns, the general lack of methodological clarity and the unclear applicability make meaningful comparisons of the evidence difficult.

Interpretation of results
Despite the development and use of various RAMs to predict the risk of developing VTE in women who are pregnant or in the puerperium (within 6 weeks post-delivery), VTE remains the leading cause of direct maternal mortality in the UK (MBRRACE-UK report 2021). Several explanations for this are possible: the risk assessment tools are inadequate; the application of these tools is incomplete or inaccurate; the underlying VTE risks of the pregnant population (increasing age, body mass index and comorbidities) have changed since the RAMs were developed; or all three problems are operating. The use of thromboprophylaxis was reported in nine studies 23 25 28-31 33 35 36 (ranging from 3% 35 to 100% 23 28 ). This may lead to underestimation of predictive accuracy if a given RAM predicted VTE events that were subsequently prevented by thromboprophylaxis. In the remaining studies (n=8), thromboprophylaxis use was not reported, so further analysis of its impact on the performance of the RAMs was not possible. This also suggests that the degree to which thromboprophylaxis reduces the risk of VTE in those who received it cannot be accurately estimated. Moreover, the lack of data on the predictive performance of weight-based LMWH dosing, dosage change throughout pregnancy and D-dimer testing in the included studies also precluded further analysis of their association with VTE.
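The way thromboprophylaxis can lead to underestimation of sensitivity can be shown with simple arithmetic. The Python sketch below is an illustration under strong assumptions that are not taken from the included studies: all women flagged by the RAM receive prophylaxis, women not flagged receive none, and prophylaxis prevents a fixed fraction of events. The event counts and the 50% relative risk reduction are hypothetical.

```python
def observed_sensitivity(
    events_in_flagged: float, events_in_unflagged: float, relative_risk_reduction: float
) -> float:
    """Sensitivity observed when flagged women receive thromboprophylaxis.

    events_in_flagged       : events that would occur in flagged women without prophylaxis
    events_in_unflagged     : events in women the RAM did not flag (assumed untreated)
    relative_risk_reduction : assumed fraction of events prevented (0 to 1)
    """
    if not 0.0 <= relative_risk_reduction <= 1.0:
        raise ValueError("relative_risk_reduction must lie between 0 and 1")
    remaining = events_in_flagged * (1.0 - relative_risk_reduction)
    total = remaining + events_in_unflagged
    if total == 0:
        raise ValueError("no observed events")
    return remaining / total


# Hypothetical numbers: the RAM would identify 80 of 100 events without prophylaxis.
print(observed_sensitivity(80, 20, 0.0))  # no prophylaxis given
print(observed_sensitivity(80, 20, 0.5))  # half of the flagged events prevented
```

Under these assumptions, a RAM that would capture 80 of 100 events appears to capture only 40 of the 60 events that are actually observed, so its apparent sensitivity falls even though its ability to identify high-risk women is unchanged.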

Comparison to the existing literature
To our knowledge, there are no previous systematic reviews on this topic. However, several large registries have recently been interrogated in an attempt to derive robust prediction rules for this population, although with some methodological concerns. Sultan et al 35 developed (using a large English-based registry database covering 6% of the population) and validated (using a Swedish national database registry) a risk prediction tool to estimate the absolute risk of VTE in postpartum women according to their individual risk factor combinations. Despite the low incidence of VTE in both cohorts (<0.08%), their model showed good discrimination in the external cohort but poor sensitivity at predicting those at risk of experiencing VTE. In addition, their model lacked some important VTE risk factors (eg, thrombophilia, antepartum immobilisation), and possibly underestimated the risks due to diagnosis limited to diagnostic coding (eg, varicose veins, severity of comorbidities) and the use of thromboprophylaxis in both cohorts. 42 Ellis-Kahana et al 39 also derived (using a large national database from the USA) a risk prediction model for VTE in obese pregnant women and reported strong discrimination. However, this model still requires external validation.

Strengths and limitations
This systematic review has several strengths. It is the first systematic review to evaluate RAMs for predicting the risk of developing VTE in women during pregnancy and the puerperium. It was conducted with robust methodology in accordance with the PRISMA statement, 15 and the protocol was registered with PROSPERO. Clinical experts, in addition to the core review team, were involved and consulted throughout as advisors and to assess the validity and applicability of research findings during the review processes.
The main limitations of this study related to the observational nature of the studies reviewed and their own limitations. Most of the included risk prediction studies were retrospective cohorts. Retrospective cohort studies of large health database registries are limited by poor data quality and failure to accurately ascertain outcomes, and case-control designs are prone to bias including uncontrolled confounding, temporal and selection bias. 43 Conversely, better quality data may be obtained with prospective cohorts, but smaller sample sizes will lack statistical power. In addition, most of the external validation studies evaluated predictive performance of risk models that were not statistically derived (ie, without model development and internal validation). This process is vital, as risk models with only external validation may be subject to overfitting and optimism. 40 Similarly, the absence of model performance measures such as calibration or discrimination hinders the full appraisal of models. 41 Because of the high levels of heterogeneity between studies and the small number of external validation studies per risk model, we were unable to undertake any meta-analysis or statistical examination of the causes of heterogeneity. Potential sources of heterogeneity include variation in study design, the study population, risk model implementation, outcome definition and measurement and the use of thromboprophylaxis. As a result, we reported descriptive statistics to provide a better understanding of the evidence base applicable to the subject matter, and shortcomings regarding reliability and validity of the data. Finally, assessments on study relevance, information gathering and validity of articles were unblinded and could potentially have been influenced by preformed opinions. However, masking is resource intensive with uncertain benefits in protecting against biased decisions. 44

Implications for policy, practice and future research
VTE risk assessment is challenging for numerous reasons. Many risk factors for VTE are pre-existing and non-modifiable (such as parity and inherited thrombophilia). These are then often combined with evolving risk factors which can change over the course of a pregnancy or postnatal period. Despite wide-scale awareness of VTE being a major contributor to maternal mortality, numerous challenges with VTE risk stratification have been highlighted. In the UK, the MBRRACE-UK report (Saving Lives, Improving Mothers' Care 2018) 45 shows that doctors and midwives find existing risk scoring systems difficult to apply consistently in clinical practice. There is a need for development of an RAM that is simpler and more reproducible. National Institute for Health and Care Excellence guidelines on the use of thromboprophylaxis (NG89) 46 concluded that the tool described by Sultan et al 35 showed poor sensitivity compared with their prespecified target of 90% sensitivity. However, this high level of sensitivity may not be realistic because there is evidence that only 70% of women having antenatal pulmonary embolism had any identifiable classic risk factors, suggesting that sensitivity rates above 70% may not be achievable. 47 In addition, a high sensitivity rate is usually associated with a lower specificity rate and the overall balance of benefits and harms may be undesirable if that means exposing a high proportion of women to thromboprophylaxis. Despite lack of evidence, many guidelines and clinical care bundles include the use of RAMs to guide VTE prophylaxis. Recently published ACOG guidelines state that most RAMs have not been validated prospectively in the obstetrical population and that current usage of such models is based on extrapolations from non-pregnant women, who differ biologically from pregnant women. The practice bulletin emphasises the need for more research to identify optimal models. 37 Although further research is clearly needed, the routine use of thromboprophylaxis may present a barrier to generating accurate and precise estimates of the prognostic accuracy of RAMs. Further work to improve RAMs to help stratify the risk of VTE in women who are pregnant or in the puerperium could focus on using decision-analytical modelling to compare the effects, harms and costs of giving thromboprophylaxis to patients with varying risks of VTE. This would allow determination of the risk threshold at which thromboprophylaxis provides optimal overall benefit. Subsequent work to validate these findings would require primary research. Despite the limitations of undertaking accuracy studies in populations where thromboprophylaxis is routinely used, future research could focus on selected higher risk groups who are more likely to benefit from prophylaxis and, with a higher prevalence of VTE, are more amenable to an appropriately powered prospective study. However, given the uncertain benefits and harms of VTE thromboprophylaxis during pregnancy and the postpartum period, 14 48 risk prediction studies should be undertaken alongside (or as a part of) randomised trials of prophylaxis in targeted groups deemed to be at higher risk of VTE.
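The idea of a risk threshold at which thromboprophylaxis provides overall benefit can be sketched with a deliberately simplified expected-value calculation. The Python example below is not the decision-analytic model proposed for future work; it assumes a single harm (bleeding), a fixed relative risk reduction, and disutilities expressed on a common arbitrary scale. All parameter names and values are hypothetical.

```python
def treatment_threshold(
    relative_risk_reduction: float,
    vte_disutility: float,
    bleed_risk: float,
    bleed_disutility: float,
) -> float:
    """Baseline VTE risk above which expected benefit exceeds expected harm.

    Treating a woman with baseline risk p prevents p * RRR events, so
    expected benefit = p * RRR * vte_disutility and
    expected harm    = bleed_risk * bleed_disutility.
    The threshold is the value of p at which the two are equal.
    """
    if relative_risk_reduction <= 0 or vte_disutility <= 0:
        raise ValueError("risk reduction and VTE disutility must be positive")
    return (bleed_risk * bleed_disutility) / (relative_risk_reduction * vte_disutility)


def net_benefit(baseline_risk: float, relative_risk_reduction: float,
                vte_disutility: float, bleed_risk: float,
                bleed_disutility: float) -> float:
    """Expected benefit minus expected harm of treating at a given baseline risk."""
    benefit = baseline_risk * relative_risk_reduction * vte_disutility
    harm = bleed_risk * bleed_disutility
    return benefit - harm


# Hypothetical inputs for illustration only.
params = dict(relative_risk_reduction=0.5, vte_disutility=1.0,
              bleed_risk=0.01, bleed_disutility=0.5)
print("threshold risk:", treatment_threshold(**params))
for risk in (0.001, 0.05):
    print(risk, net_benefit(risk, **params))
```

A full decision-analytic model would also need to account for costs, time horizons, fatal and non-fatal events and uncertainty in each input, none of which are represented in this sketch.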

CONCLUSIONS
Currently, there are a number of risk assessment models for assessing risk of VTE in pregnancy and the puerperium. Our review has shown that none of these models has been adequately validated and they have limited abilities to detect those at risk of VTE.

Figure 2 PROBAST (Prediction model Risk Of Bias ASsessment Tool) assessment summary graph: review authors' judgements.

Table 1 Study and population characteristics
*Retrospective case-control study of pregnant and postpartum women, but data reported for antepartum period only due to low number of postpartum VTE events (n=2).
†Internal validation study (ie, prediction model development without external validation).
‡Prospective cohort study with retrospective analysis, thus classified as retrospective cohort study.
§RCOG was applied to an English derivation cohort, n=433 353, incidence, 0.07% (312 events).
ACCP, American College of Chest Physicians; ACOG, American College of Obstetricians and Gynecologists; ASH, American Society of Hematology; BMI, body mass index; CC, case-control; CD, caesarean delivery; CS, cohort study; EThIG, Efficacy of Thromboprophylaxis as an Intervention during Gravidity Investigators; NR, not reported; NRS, non-randomised study; P, prospective; R, retrospective; RAM, risk assessment model; RCOG, Royal College of Obstetricians and Gynaecologists; SFOG, Swedish Society of Obstetrics and Gynecology; VTE, venous thromboembolism.
Pandor A, et al. BMJ Open 2022;12:e065892. doi:10.1136/bmjopen-2022-065892