Original research
Vitamin-K-antagonist phenprocoumon versus direct oral anticoagulants in patients with atrial fibrillation: a real-world analysis of German claims data
  1. Lisette Warkentin1,
  2. Florian Klohn2,
  3. Barthold Deiters2,
  4. Thomas Kühlein1,
  5. Susann Hueber1
  1. 1Institute of General Practice, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  2. 2GWQ ServicePlus AG, Düsseldorf, Germany
  1. Correspondence to Dr Lisette Warkentin; lisette.warkentin{at}


Abstract

Objectives Direct oral anticoagulants (DOACs) were introduced based on randomised controlled trials (RCTs) comparing them to the vitamin-K-antagonist (VKA) warfarin. In Germany, phenprocoumon is used almost exclusively as the VKA. As RCTs with phenprocoumon are absent, we analysed the benefits and harms of DOACs and phenprocoumon for patients with atrial fibrillation (AF) in a real-world setting.

Design In a retrospective observational cohort study, claims data covering inpatient and outpatient care from 2015 to 2019 were analysed by Cox regression and propensity score matching (PSM).

Setting Data from a group of small-sized to medium-sized health insurance companies in Germany.

Participants We analysed datasets of 71 961 patients with AF and first prescription of phenprocoumon (n=20 179) or DOAC in standard dose (n=51 782). Patients with reduced dose of DOACs were excluded (n=21 724).

Outcome measures Outcomes were thromboembolic events, major bleeding and death during a 12-month follow-up period.

Results The regression analysis largely showed similarity between phenprocoumon and standard dose DOACs regarding effectiveness and safety. There were only three statistically significant differences: a lower bleeding risk with composite DOACs and apixaban (HR (95% CI) = 0.67 (0.59 to 0.76) and 0.54 (0.46 to 0.63), respectively) and a higher risk of death with rivaroxaban (1.21 (1.10 to 1.34)). The analysis after PSM was consistent with the first two results regarding composite DOACs and apixaban (number needed to treat, NNT 101 and 78) and showed a lower bleeding risk with rivaroxaban (NNT 156). Absolute differences were small.

Conclusions The small superiority or non-inferiority of DOACs over warfarin seen in the RCTs might not translate into relevant advantages of DOACs over phenprocoumon. To confirm this hypothesis, an RCT with phenprocoumon is needed. Besides the safety and effectiveness assessments, other factors might also play a substantial role in the decision on the right OAC for stroke prevention.

  • Cardiology
  • Stroke
  • Anticoagulation
  • GENERAL MEDICINE (see Internal Medicine)

Data availability statement

Data may be obtained from a third party and are not publicly available. Because of the confidential nature of inpatient and outpatient claims data, permission for public availability of the data is not possible. Permission to access the data is restricted to research and is subject to the consent of the health insurance funds.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • In comparison to most randomised controlled trials, a wider selection of patients was analysed by using claims data.

  • Analyses were performed with two different methods: Cox regression and propensity score matching.

  • We adjusted for several clinical and sociodemographic factors available in the data; however, residual confounding is possible.

  • Due to the data structure, some clinical information is missing.


Introduction

Atrial fibrillation (AF) is the most common arrhythmia.1 Traditionally, vitamin-K-antagonists (VKA) have been the standard for stroke prevention in patients with AF. The trend towards prescribing direct oral anticoagulants (DOACs) instead started with the approval of the first DOAC, dabigatran etexilate. In 2016, there were already more prescriptions for DOACs than for VKA in Germany.2 To date, four different DOACs are approved in Germany: dabigatran, rivaroxaban, apixaban and edoxaban. According to the pivotal trials, DOACs are associated with either statistically significant but small risk reductions or non-inferiority as compared with warfarin.3–6

When applying these results to decisions in daily clinical practice, several limitations hold true. The study populations in randomised controlled trials (RCTs) are often not representative of the real-world population in terms of age, comorbidities and therapy adherence.7 In all pivotal RCTs of DOACs, warfarin was used as the VKA comparator drug. However, in some countries, other VKA are predominant. For example, in Germany phenprocoumon is used almost exclusively as the VKA.8 In principle, it seems inappropriate to transfer minor advantages of new drugs like DOACs over warfarin to a third drug like phenprocoumon, which has different pharmacological properties such as a longer half-life.9 In observational studies with claims data from Germany, the small differences in favour of DOACs over warfarin partially diminished or even disappeared when DOACs were compared with phenprocoumon instead of warfarin.10–12 However, with slightly different study designs (ie, differences in statistical analyses, outcome definition and/or data source), the results of these real-world studies were also ambiguous. A recent analysis comparing phenprocoumon to low-dose DOACs revealed that phenprocoumon was associated with fewer thromboembolic events and deaths and a non-significantly higher bleeding risk than low-dose DOACs.13 Low-dose therapy is mainly used for patients with certain pre-existing medical conditions or risks. As RCTs comparing DOACs with phenprocoumon are largely absent, real-world studies form an even more important complement to the evidence. A comparison of all four DOACs in standard dose with phenprocoumon has not yet been made. Therefore, the aim of this study was to add to this heterogeneous evidence by analysing the effectiveness and safety of oral anticoagulation therapy regarding the prevention of thromboembolic and bleeding events in patients with AF treated with phenprocoumon as compared with standard dose DOACs in a real-life setting.


Methods

In a retrospective observational cohort study, data of several small-sized to medium-sized German health insurance companies were analysed. The data were provided by the Corporation for Efficiency and Quality in Health Insurance (GWQ ServicePlus AG - Gesellschaft für Wirtschaftlichkeit und Qualität bei Krankenkassen). All methods were carried out in accordance with relevant guidelines and regulations. The reporting of the study is based on the German Good Practice Secondary Data Analysis14 and the REporting of studies Conducted using Observational Routinely-collected health Data Statement.15

Data and study population

Data from the years 2014 to 2019 were analysed. The dataset included information (eg, age, sex, diagnoses and diagnostic/therapeutic procedures) from outpatient and inpatient care. Patients aged 18 years and older with a first prescription of an oral anticoagulant (OAC) in 2015–2018, defined as the index date, and a diagnosis of AF during the 12-month preindex period were included in the analysis. In Germany, International Classification of Diseases, version 10 (ICD-10) codes have an additional modifier which indicates whether the diagnosis is assured, suspected, ruled out or status post. Outpatient diagnoses were only considered if they were coded as assured. Patients with an OAC prescription in the 12 months before the observation period were excluded. Also excluded were patients receiving more than one OAC or different doses at the index date, patients with less than 12 months of follow-up time (continuous insurance status), patients with undefined age or sex and patients receiving warfarin as the VKA. VKA other than warfarin and phenprocoumon were not prescribed. In case of death during the observation period, there was no minimum follow-up time. For the survivors, continuous insurance status was defined as being insured at the beginning and end of the observation period and having at least one observable insurance day in each observable quarter. Claims data that could not be linked to patients because of coding errors were corrected as far as possible with an internal mapping algorithm. The whole selection process is shown in figure 1.
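The selection criteria above can be read as a single eligibility filter. The following Python sketch is illustrative only: the record layout and all field names are hypothetical, since the structure of the claims dataset is not published, and the flags are assumed to have been derived from the raw claims beforehand.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical; the paper does not publish its schema.
    age_at_index: Optional[int]
    sex: Optional[str]
    index_date: date                 # date of first OAC prescription
    index_drugs: frozenset           # OAC substances dispensed at the index date
    index_doses: frozenset           # dose levels dispensed at the index date
    af_diagnosis_preindex: bool      # assured AF diagnosis in the 12 months before index
    oac_in_prior_12_months: bool     # any OAC prescription in the 12 months before index
    continuously_insured_12m: bool   # continuous insurance status over the follow-up
    died_during_follow_up: bool


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    if p.age_at_index is None or p.sex is None:
        return False                 # undefined age or sex
    if p.age_at_index < 18:
        return False
    if not (2015 <= p.index_date.year <= 2018):
        return False
    if not p.af_diagnosis_preindex or p.oac_in_prior_12_months:
        return False
    if len(p.index_drugs) != 1 or len(p.index_doses) != 1:
        return False                 # more than one OAC or different doses at index
    if "warfarin" in p.index_drugs:
        return False
    # Patients who died need no minimum follow-up; survivors need 12 months.
    return p.died_during_follow_up or p.continuously_insured_12m
```

Restricting the DOAC arm to standard dose, as described in the next paragraphs, would be a further step on top of this filter.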

Figure 1

Data selection process and sample sizes of DOAC subgroups. ATC, Anatomical Therapeutic Chemical; DOAC, direct oral anticoagulant; ICD, International Classification of Diseases, version 10.

Only patients with DOACs in standard dose were considered. Standard dose was defined according to the standard dose suggested for stroke prevention in patients with AF by the respective Summaries of Product Information of the DOACs.16–19 Patients with a lower dose, for example, due to renal impairment, were excluded and analysed separately.13 Dose assignment for DOACs was based on the Pharmacy-Central-Number (identification number for pharmaceutical products in Germany) (online supplemental table S1).

Outcome measures

Effectiveness outcomes were hospitalisations due to thromboembolic events, including ischaemic stroke, non-specified stroke, transient ischaemic attack and mesenteric ischaemia. Safety outcomes were defined as hospitalisation due to bleeding in critical areas or organs, such as intracranial bleeding, and other bleeding that led to blood transfusion (ICD-10 codes in online supplemental table S2). Selection was based on the definition of major bleeding by the International Society on Thrombosis and Haemostasis (ISTH).20 The third outcome was death from any cause, operationalised as death being the reason for deregistration from health insurance. The observation period was 12 months, beginning with the date of first prescription. In addition to the ISTH definition, the definition of outcomes was based on inpatient diagnoses or diagnoses in combination with treatment to attain internal validity.21

Statistical analysis

Phenprocoumon was compared with five DOAC groups (composite of all DOACs, apixaban, dabigatran, edoxaban, rivaroxaban). In the following analyses, event rates were censored for death and switch in medication and/or dose. Only the first event per patient was considered for each outcome when calculating the event rates. Rates are reported per 100 patient-years.
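As an illustration of how such a rate could be computed, the sketch below counts only the first event per patient and stops accumulating person-time at the event or at censoring, whichever comes first. The input layout (a pair of day counts per patient) and the 365.25-day year are assumptions made for this example, not details taken from the study.

```python
def event_rate_per_100_py(patients):
    """Event rate per 100 patient-years.

    `patients` is an iterable of (days_to_first_event, days_to_censoring)
    pairs. `days_to_first_event` is None if no event was observed;
    `days_to_censoring` is the time until death, a switch in medication
    or dose, or the end of the 12-month follow-up (hypothetical layout).
    """
    events = 0
    person_days = 0.0
    for days_to_event, days_to_censoring in patients:
        if days_to_event is not None and days_to_event <= days_to_censoring:
            events += 1                       # only the first event counts
            person_days += days_to_event      # time at risk ends at the event
        else:
            person_days += days_to_censoring  # censored without an event
    if person_days <= 0:
        raise ValueError("no person-time observed")
    return 100.0 * events / (person_days / 365.25)


# Hypothetical example: three patient-years in total, one event.
cohort = [(None, 365.25), (None, 365.25), (182.625, 365.25), (None, 182.625)]
rate = event_rate_per_100_py(cohort)
```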

To address confounding broadly, comparisons were performed independently using Cox regression and propensity score matching (PSM), so that each method could serve as a cross-check on the other.

Cox regression models were applied to estimate effectiveness and safety of treatments with adjusted cause-specific hazard ratios. The following covariates were considered to reduce confounding: (1) age and sex; (2) comorbidities, for example, arterial hypertension, cachexia and renal failure; (3) comedication, for example, antiarrhythmic medication, antiplatelets (ICD-10 codes/Anatomical Therapeutic Chemical codes in online supplemental table S3); (4) CHA2DS2-VASc-Scores, calculated based on the information in the data (sex not included, as it is considered separately); (5) Charlson Comorbidity Index;22 (6) effectiveness and safety outcomes that occurred before the index date; (7) dummy variables for each year/quarter of the index date. To test whether the treatment can be properly distinguished from the confounders, multicollinearity tests were applied.
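To make covariate (4) concrete, the following sketch computes a CHA2DS2-VASc score without the sex category, using the standard point weights of the score. How the individual flags were derived from ICD-10 codes in the claims data is not reproduced here; the function and parameter names are hypothetical.

```python
def cha2ds2_vasc_without_sex(age, heart_failure, hypertension, diabetes,
                             prior_stroke_tia_te, vascular_disease):
    """CHA2DS2-VASc score with the sex category left out.

    Standard weights: congestive heart failure 1, hypertension 1,
    age >= 75 years 2, diabetes 1, prior stroke/TIA/thromboembolism 2,
    vascular disease 1, age 65-74 years 1. The sex category (1 point) is
    omitted because sex enters the model as a separate covariate.
    """
    score = 0
    score += 1 if heart_failure else 0
    score += 1 if hypertension else 0
    if age >= 75:
        score += 2
    elif age >= 65:
        score += 1
    score += 1 if diabetes else 0
    score += 2 if prior_stroke_tia_te else 0
    score += 1 if vascular_disease else 0
    return score


# Hypothetical patient: 80 years old, hypertension, prior stroke.
example_score = cha2ds2_vasc_without_sex(80, False, True, False, True, False)
```

Leaving out the sex point caps this variant at 8 instead of 9.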

Sensitivity analysis

DOACs and phenprocoumon were compared regarding effectiveness and safety after PSM as a sensitivity analysis. For PSM, the same covariates were used as in the Cox regressions, except for the dummy variables for each year and quarter. In the first stage of the matching process, propensity scores were estimated via logistic regression with DOAC treatment as the binary dependent variable. A 1:1 nearest neighbour matching without replacement was then performed under the condition that the maximum standardised mean difference (SMD) between the DOAC and the phenprocoumon group had to be <0.1 for all confounders. In this way, five cohort pairs were formed. The SMDs for each covariate, a description of the cohorts after matching and the number of patients for whom no matching partner was found are depicted in online supplemental table S4.
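A minimal sketch of 1:1 nearest neighbour matching without replacement on an already estimated propensity score is given below. It is a simple greedy variant; the processing order, the tie-breaking and the optional caliper are assumptions made for this illustration, as the text does not specify them, and the function name is hypothetical.

```python
def match_nearest_neighbour(group_a, group_b, caliper=None):
    """Greedy 1:1 nearest neighbour matching without replacement.

    `group_a` and `group_b` map patient ids to propensity scores.
    Returns a list of (id_a, id_b) pairs; every member of `group_b`
    is used at most once. Members of `group_a` without a partner
    (pool exhausted or caliper exceeded) are left unmatched.
    """
    available = dict(group_b)
    pairs = []
    # Assumption: process group_a in ascending order of propensity score.
    for a_id, a_score in sorted(group_a.items(), key=lambda kv: kv[1]):
        if not available:
            break
        b_id = min(available, key=lambda bid: abs(available[bid] - a_score))
        if caliper is not None and abs(available[b_id] - a_score) > caliper:
            continue
        pairs.append((a_id, b_id))
        del available[b_id]          # without replacement
    return pairs


# Hypothetical propensity scores.
pairs = match_nearest_neighbour({"a": 0.30, "b": 0.70},
                                {"x": 0.32, "y": 0.68, "z": 0.10})
```

After matching, covariate balance would be checked by recomputing the SMD of every covariate in the matched cohorts and comparing it with the 0.1 limit.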

After PSM, two-sample tests for equality of proportions with continuity correction were performed for each outcome. Absolute and relative risk reductions as well as numbers needed to treat were calculated. P values were adjusted with Bonferroni correction (n=15 tests) separately for Cox regression and sensitivity analysis to counteract the problem of multiple testing. Therefore, an adjusted two-sided p<0.003 was considered significant. Statistical analyses were performed using the R Statistical Software (V.3.6.1, R Foundation for Statistical Computing, Vienna, Austria).
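The calculations in this paragraph can be sketched with the Python standard library alone. The test below is a chi-square test for equality of two proportions with Yates continuity correction, which is assumed here to correspond to the two-sample test named in the text; the event counts in the example are hypothetical and not taken from the study.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def two_proportion_test(events_a, n_a, events_b, n_b):
    """Chi-square test (1 df) with continuity correction; returns (chi2, p)."""
    observed = [[events_a, n_a - events_a], [events_b, n_b - events_b]]
    rows = [n_a, n_b]
    cols = [events_a + events_b, n_a + n_b - events_a - events_b]
    total = n_a + n_b
    if min(rows) <= 0 or min(cols) <= 0:
        raise ValueError("test undefined: empty group or no events/non-events")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            # Continuity correction, capped so it never overshoots zero.
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return chi2, p_value


def risk_measures(events_ref, n_ref, events_new, n_new):
    """Absolute risk reduction, relative risk reduction and NNT."""
    risk_ref = events_ref / n_ref
    risk_new = events_new / n_new
    arr = risk_ref - risk_new
    rrr = arr / risk_ref if risk_ref > 0 else float("nan")
    nnt = math.inf if arr == 0 else 1.0 / arr
    return arr, rrr, nnt


# Hypothetical matched cohorts of 10 000 patients each.
threshold = bonferroni_threshold(0.05, 15)          # about 0.0033
chi2, p = two_proportion_test(300, 10_000, 200, 10_000)
arr, rrr, nnt = risk_measures(300, 10_000, 200, 10_000)
```

The NNT is left unrounded here because the rounding rule used for the reported values is not stated.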

Patient and public involvement

Patients and the public were not involved in the project.


Results

Baseline characteristics and unadjusted outcome rates

After exclusions according to the predefined criteria, a total of 71 961 datasets were included in the analysis. The phenprocoumon cohort comprised 20 179 patients, the DOAC cohort 51 782 patients. The baseline characteristics are reported as proportion or mean with SD in table 1.

Table 1

Baseline characteristics of phenprocoumon, composite DOAC cohort and the single DOAC subgroups with standardised mean differences (SMD) before matching

The SMDs of the comparison of the phenprocoumon and DOAC cohorts before matching indicate differences between the groups in almost all variables. Prominent differences were seen regarding severe chronic renal impairment and intake of heparin in the comparison of the phenprocoumon and DOAC composite cohort (as well as apixaban). Other variables with noticeable SMD values in the comparison with the phenprocoumon cohort are age (dabigatran, rivaroxaban), CHA2DS2-VASc-Score (edoxaban, rivaroxaban), moderate chronic renal impairment (dabigatran) and renal impairment (total) (dabigatran, edoxaban, rivaroxaban). All of the SMDs named above are negative. This means that in the phenprocoumon group the mean age and CHA2DS2-VASc-Score are higher, patients receive more heparin and renal impairment is more frequent than in the respective DOAC (sub)groups.
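For orientation, a common way to compute the SMD for continuous and binary covariates is sketched below. The sign convention (DOAC group minus phenprocoumon group, so that a negative value means a higher value in the phenprocoumon group) is inferred from the paragraph above and is an assumption, as are the function names and the example values.

```python
import math


def smd_continuous(mean_doac, sd_doac, mean_vka, sd_vka):
    """SMD for a continuous covariate using the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_doac ** 2 + sd_vka ** 2) / 2.0)
    return (mean_doac - mean_vka) / pooled_sd


def smd_binary(prop_doac, prop_vka):
    """SMD for a binary covariate given the proportion in each group."""
    pooled_sd = math.sqrt(
        (prop_doac * (1.0 - prop_doac) + prop_vka * (1.0 - prop_vka)) / 2.0
    )
    return (prop_doac - prop_vka) / pooled_sd


# Hypothetical values: the phenprocoumon group is older and more often
# has the binary characteristic, so both SMDs come out negative.
age_smd = smd_continuous(70.0, 10.0, 75.0, 10.0)
flag_smd = smd_binary(0.10, 0.20)
```

An absolute SMD below 0.1 is the balance criterion applied in the matching step.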

Before matching, event rates per 100 patient-years showed more deaths and bleeding in the phenprocoumon cohort than in the DOAC composite cohort as well as in the subgroups. Regarding thromboembolic events, the differences were less noticeable (table 2). The mean follow-up times are depicted in online supplemental table S5.

Table 2

Event rates per 100 patient-years (py) for the outcomes before and after propensity-score matching

Effectiveness and safety

Regarding the comparison of phenprocoumon with the composite DOAC group, the Cox regression showed similarity in risk of death, a statistically significant lower bleeding risk and a non-significant trend of more thromboembolic events with DOACs (HR with 95% CI for thromboembolic events: 1.13 (0.99 to 1.28), p=0.064; death: 1.04 (0.95 to 1.13), p=0.384; bleeding: 0.67 (0.59 to 0.76), p<0.001).

Among the single DOACs, the largest difference towards a lower risk of thromboembolic events with phenprocoumon was seen in the comparison with dabigatran; however, this difference was not statistically significant after Bonferroni correction.

Rivaroxaban was associated with a statistically significant higher risk of death than phenprocoumon (HR with 95% CI=1.21 (1.10 to 1.34), p<0.001).

Apixaban was associated with a statistically significant lower bleeding risk as compared with phenprocoumon (HR with 95% CI=0.54 (0.46 to 0.63), p<0.001). For the other cohorts a non-significant trend was shown. The results are depicted in figure 2. HRs for all covariates are depicted in online supplemental figure S1.
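Readers who want to relate a reported HR and its 95% CI to a p value can use the rough check sketched below. It assumes a Wald-type interval that is symmetric on the log scale; whether the reported p values were obtained in exactly this way is not stated, so the result is only an approximation. The function name is hypothetical, and the example uses the composite DOAC estimates quoted above.

```python
import math


def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p value from an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE), i.e. symmetric
    on the log scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Composite DOACs versus phenprocoumon, values as reported in the text.
z_te, p_te = wald_p_from_hr(1.13, 0.99, 1.28)        # thromboembolic events
z_bleed, p_bleed = wald_p_from_hr(0.67, 0.59, 0.76)  # bleeding
```

Under this approximation, the thromboembolic comparison stays above the conventional 0.05 level and the bleeding comparison falls well below the Bonferroni-adjusted level, in line with the reported results.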

Figure 2

Cox proportional hazard regression model for the comparison of DOAC and phenprocoumon regarding outcomes thromboembolic events, death and bleeding. Adjusted HRs with 95% CI and p value adjusted with Bonferroni correction (adjusted p<0.003, n=15). DOAC, direct oral anticoagulant.

Sensitivity analysis

After PSM, the comparisons of phenprocoumon to all DOAC subgroups showed no significant differences regarding thromboembolic events and deaths. Composite DOACs, apixaban and rivaroxaban were associated with a statistically significant lower bleeding risk (numbers needed to treat: 101, 78 and 156, respectively). A statistically significant difference between rivaroxaban and phenprocoumon concerning death was shown only in the Cox regression analysis and for bleeding only after PSM. All other results showed consistency in analysis with the PSM and Cox regression (table 3). The event rates per 100 patient-years after matching are shown in table 2.

Table 3

Results of analysis of effectiveness and safety of direct oral anticoagulants (DOAC) versus phenprocoumon


Discussion

Overall, only minor differences between the outcomes of patients with AF treated with DOACs and phenprocoumon for stroke prevention were seen. Statistically significant differences between the composite DOAC group and phenprocoumon regarding safety and effectiveness were found only for the outcome major bleeding, favouring DOACs. When comparing the other DOAC subgroups with phenprocoumon, the only result that was statistically significant in both analyses (Cox regression and after PSM) was the association of apixaban with a lower bleeding risk (absolute risk reduction, ARR 1.3%, number needed to treat, NNT 78). The associations of a higher risk of death and a lower risk of bleeding with rivaroxaban were only significant in one of the two analyses performed. However, though partly statistically significant, differences in effectiveness and safety between phenprocoumon and DOACs seem to be very small when it comes to absolute risk differences.

The comparison of the results of RCTs with DOACs and warfarin and real-world studies with DOACs and phenprocoumon is problematic. Therefore, in the following, our results will first be discussed according to the results of four other real-world studies comparing DOACs with phenprocoumon.10–12 23 In a second step, we cautiously discuss the question why the results of RCTs with warfarin might deliver partially different results than real-world studies with phenprocoumon.

Comparison to other real-world studies

Effectiveness: In line with our results, three other real-world studies reported no difference, or even a beneficial effect favouring phenprocoumon, regarding thromboembolic events.10–12 One exception was seen in Hohnloser et al, who reported a marginally significant risk reduction of ischaemic stroke with apixaban.23 The composite effectiveness outcome in that study indicated an even more pronounced benefit of DOACs. One explanation might be that Hohnloser et al included intracranial haemorrhage, an outcome that seems to occur less frequently in patients with DOACs, in the composite effectiveness outcome. The major RCTs comparing DOACs to warfarin handled this the same way. Our study and the other three observational studies included intracranial bleedings only in the safety outcomes, which we perceive as more suitable.

Safety: All studies but Mueller et al reported a (small) benefit of DOACs in terms of major bleeding.10–12 23 Mueller et al indicated a superiority of phenprocoumon regarding bleeding which they defined as a composite of gastrointestinal and intracranial bleeding.12 An explanation for this difference in safety evaluation might be that the other studies included other forms of extracranial bleedings in the composite outcome. The main benefit of DOACs seems to be in intracranial haemorrhage.11 23 However, when analysing all major bleeding events, the difference between DOAC and phenprocoumon with an ARR of 1% and an NNT of 101 seems very small and its clinical relevance might be questioned.

Regarding death, the four studies showed ambiguous results10–12 23 and only one was similar to our results.11 Although all of these studies are observational studies analysing routine healthcare data, the study populations differ in terms of, for example, age, gender and CHA2DS2-VASc-Score. Also, some real-world studies included DOAC patients with standard-dose and low-dose regimens in their study population and did not analyse those groups separately. These factors could have had an impact on the results.

Comparison to RCTs with DOACs and warfarin

The results of RCTs comparing DOACs with warfarin partially differ from observational studies comparing DOACs with phenprocoumon.3–6 10–12 23 Next to differences in study designs, the fact that different VKA are being compared could lead to these differences. Although the two VKA warfarin and phenprocoumon resemble each other in their mechanism of action, they differ in many pharmacological properties.9 To achieve a safe and efficient OAC therapy with VKA, the patients’ time in therapeutic range (TTR) is essential. While the median TTR in the pivotal DOAC vs warfarin studies varied between 58% and 68.4%,3–6 data from the PREFER IN AF study showed that patients with phenprocoumon were more often in therapeutic range (79%), possibly due to a longer half-life and/or to more intensive supervision.8 Therefore, when comparing phenprocoumon instead of warfarin with DOACs, it might not be surprising that the small differences in favour of DOACs regarding safety and efficacy partially diminish or even reverse.

Limitations and strengths

When interpreting the results of real-world studies, several limitations have to be considered. Regression analysis and matching have to be based on the information available in the data. As claims data are not perfectly accurate and the coding of diagnoses or other essential variables (for example, smoking or obesity) can be missing,24 residual confounding remains. Thus, despite successful PSM, VKA and DOAC cohorts may differ. Laboratory results like the international normalised ratio were not available, so the effect of the TTR on safety and effectiveness of VKA therapy could not be calculated and cannot be taken into consideration when interpreting the results. By using claims data of different insurance companies, a large sample could be generated, which is probably representative of Germany and reflects the actual healthcare situation. Of course, the scientific rigour and quality of an RCT are higher than those of a retrospective cohort study based on claims data. Both study types will probably be needed to evaluate effectiveness and safety of new drugs as compared with older ones. Real-world studies have other advantages, as they avoid the restrictive selection of patients and the highly controlled setting of an RCT, which differ from the real world.7 25 By design, our analysis includes patients independently of factors like TTR, grade of adherence or appropriate dosing. Factors like these can have an impact on patients’ outcomes.26 27 However, real-world data from observational studies provide relevant information on the effectiveness of medication, complementing the data generated in clinical trials. Our objective was to evaluate safety and effectiveness in a real-world setting with all deviations from standard or recommended care. By including all patients, our study results provide evidence on the safety and effectiveness of OAC in actual healthcare provision in Germany.


Conclusion

This study adds to the real-world evidence that there might be only small, if any, differences between phenprocoumon and DOACs. The clinical relevance of the few differences seems questionable. Given the limited scientific rigour of real-world studies and the growing prescribing rates of DOACs, an RCT comparing DOACs to phenprocoumon is urgently needed. As phenprocoumon and DOACs might not differ substantially in terms of safety and effectiveness, further research should also focus on the impact of other factors like human and financial resources, patients’ preferences and comorbidities on the decision.

Ethics statements

Patient consent for publication

Ethics approval

As German law allows for analysing anonymous data for research purposes without patients’ consent, no ethical approval was required.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • TK and SH are joint senior authors.

  • Contributors All authors designed the study. Researchers at the Institute of General Practice at the Universitätsklinikum Erlangen (LW, TK and SH) defined the outcomes. FK and BD (GWQ ServicePlus AG) provided and analysed the data. FK acted as guarantor. The statistical methodology was determined after discussion among all researchers. Calculations were conducted by the researchers of the GWQ ServicePlus AG. The researchers of the Institute of General Practice drafted the manuscript. All authors revised it critically and approved the final version for publication.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests BD and FK are employees of the GWQ ServicePlus AG, which is owned by insurance companies. There are no other conflicts of interest to declare. There has been no financial interaction whatsoever between the GWQ ServicePlus AG and the Institute of General Practice, nor will there be.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Author note The present work was performed in (partial) fulfillment of the requirements for obtaining the degree "Dr. rer. biol. hum."

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.