Introduction Clinical prediction rules have been validated and widely used in patients with atrial fibrillation (AF) to predict stroke and major bleeding. However, these prediction rules were not developed in the same population, and do not provide the key information that patients and prescribers need at the time anticoagulants are being considered—what is the individual patient-specific risk of both benefit (decreased stroke) and harm (increased major bleeding). In this study, our primary objective is to develop and validate a prediction model for patients’ individual combined benefit and harm outcomes (stroke, major bleeding and neither event) with and without warfarin therapy. Our secondary outcome is all-cause mortality.
Methods and analysis We will use data from the Kaiser Permanente Colorado (KPCO) anticoagulation management databases and electronic medical records. Patients with a primary or secondary diagnosis of AF during an ambulatory KPCO medical office visit, emergency department visit, or inpatient stay between 1 January 2005 and 31 December 2012, with no AF diagnosis in the previous 180 days, will be included. Patients’ demographic characteristics, laboratory data, comorbidities, warfarin medication data and concurrent use of medication will be used to construct the prediction model. For primary outcomes (stroke with no major bleeding, and major bleeding with no stroke), we will perform polytomous logistic regression to develop a prediction model for patients’ individual combined benefit and harm outcomes, taking the neither event group as the reference group. As regards death, we will use Cox proportional hazards regression analysis to build a prediction model for all-cause mortality.
Ethics and dissemination This study has been approved by the KPCO Institutional Review Board and the Hamilton Integrated Research Ethics Board. Results from this study will be published in a peer-reviewed journal electronically and in print. The prediction models may aid in patient-physician shared decision-making when they are considering warfarin therapy.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
The prediction model can provide comprehensive information on the individual combined benefit and harm with and without warfarin for each patient with atrial fibrillation.
The prediction model may aid in patient-physician shared decision-making when they are considering warfarin therapy.
Rigorous statistical analyses are performed for model construction and assessment.
A potential limitation is the accuracy of the data in the administrative databases used in this study.
Atrial fibrillation (AF) is the most common sustained cardiac dysrhythmia. The presence of AF is a strong and independent risk factor for stroke1 with an approximate fivefold excess risk,2 and for mortality with a doubled death rate.3 Antithrombotic therapies such as oral warfarin are now the mainstay for stroke prevention and recommended in guidelines for patients with AF.1 ,3–5
Warfarin is impressively efficacious in preventing stroke and death. A recent meta-analysis of randomised controlled trials concluded that warfarin reduced the risk of stroke and mortality by 64% and 26%, respectively, compared with placebo or no treatment.6 Several new oral anticoagulants including dabigatran, rivaroxaban and apixaban are now available. Their use is increasing, primarily because they do not require routine anticoagulation intensity monitoring such as the international normalised ratio (INR).7 At present, while the evidence, especially real-world clinical practice evidence, is evolving for these new drugs, warfarin remains the dominant oral anticoagulant.8 For all anticoagulants, the most important and worrisome adverse event is major bleeding, especially intracranial haemorrhage (ICH).5 ,9 ,10 Fear of bleeding, risk proclivity regarding stroke, and antipathy towards taking an additional medication and undergoing blood tests have led to underuse of warfarin by patients with AF who are qualified candidates for therapy.11–15
Clinical prediction rules have been validated in patients with AF to predict stroke and bleeding (JA Pereira. Methods to predict individualized combined benefit/harm patient profiles for warfarin. Graduate Department of Pharmaceutical Sciences [unpublished doctoral dissertation] Toronto, Canada: University of Toronto, 2008). The CHADS2 score (Congestive heart failure; Hypertension; Age ≥75 years; Diabetes; Previous Stroke (2 points)) and the CHA2DS2-VASc score (Congestive heart failure; Hypertension; Age ≥75 years (2 points); Diabetes mellitus; Stroke (2 points); Vascular disease; Age 65–74 years; and Sex (female)) are recommended to predict the risk of stroke.3 ,5 ,16 ,17 For prediction of major bleeding with warfarin, the HAS-BLED score (Hypertension; Abnormal renal/liver function; Stroke; Bleeding history or predisposition; Labile INR; Elderly (>65 years); Drugs/alcohol concomitantly) has been validated.3 ,5 ,18–21
Despite their usefulness as guides, the classification schemes above were not developed in the same population, and do not provide the key information that patients and prescribers need at the time anticoagulants are being considered: what is the individual patient-specific risk of both benefit (decreased stroke) and harm (increased bleeding)? A prediction rule to assess the probabilities of both stroke and major bleeding simultaneously in the same population is required. Some studies have focused on combined benefit and harm profiles of warfarin versus no warfarin for individual patients with newly diagnosed AF. For example, several studies have used a ‘net benefit’ approach for warfarin which took into account the main benefit (reduced risk of stroke) and the main harm (increased risk of bleeding) in the same population.22–24 However, the weighting factor reflecting relative impact in the calculation, ie, $\text{net benefit} = (\text{TE rate}_{\text{off warfarin}} - \text{TE rate}_{\text{on warfarin}}) - \text{weight} \times (\text{ICH rate}_{\text{on warfarin}} - \text{ICH rate}_{\text{off warfarin}})$, where TE denotes ischaemic stroke or systemic embolism, was chosen arbitrarily.22–24 The chosen weight of 1.5 for ICH assumes that patients with AF would weigh ICH as 50% worse than TE. These studies failed to give careful consideration to actual data on patient values and preferences,25–27 and did not consider the much more common gastrointestinal (GI) bleeding.
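To make the role of the weighting factor concrete, the net benefit formula quoted above can be written as a short function. This is only an illustrative sketch: the function name and the event rates below are hypothetical values chosen for the example and are not taken from the cited studies or from KPCO data.

```python
def net_benefit(te_off, te_on, ich_on, ich_off, weight=1.5):
    """Net benefit of warfarin, in the same units as the input rates.

    te_off, te_on   : TE (ischaemic stroke or systemic embolism) rates
                      off and on warfarin
    ich_on, ich_off : intracranial haemorrhage rates on and off warfarin
    weight          : relative impact of ICH versus TE (1.5 in the
                      'net benefit' studies discussed above)
    """
    return (te_off - te_on) - weight * (ich_on - ich_off)


# Hypothetical rates per 100 person-years, for illustration only.
rates = dict(te_off=4.0, te_on=1.5, ich_on=0.5, ich_off=0.25)
with_default_weight = net_benefit(**rates)            # weight = 1.5
with_heavier_weight = net_benefit(**rates, weight=2.0)
```

With the same rates, raising the weight from 1.5 to 2.0 lowers the computed net benefit, which illustrates why an arbitrarily chosen weight can change the apparent balance of benefit and harm.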
Assessments of the rates of benefit and harm in patients with AF must also take mortality into account, since these are typically older patients and mortality is high. For instance, in an American community-based cohort, the death rate in a population with AF was over 60% during a mean follow-up of 5.3 years.28 Therefore, death is essentially a competing risk for stroke and major bleeding. However, the existing risk stratification schemes do not deal with death as a competing risk.16–18 ,29 Given that the three non-lethal outcomes associated with AF (‘no major bleeding and no stroke’, ‘stroke’ and ‘major bleeding’) are not mutually exclusive, and patients deserve to have this critical information on their individual risk of benefit and harm at the time they are considering whether to take warfarin or not, it is imperative to derive valid predictions for and calculate the probabilities of individualised combined benefit and harm outcomes of warfarin. Furthermore, methods of considering mortality as a competing risk with stroke and major bleeding are needed.
In this study, we will complete the development and external validation of a prediction model for each patient's individual combined benefit and harm outcomes (stroke with no major bleeding, major bleeding with no stroke and neither event) with and without warfarin therapy, based on the Kaiser Permanente Colorado (KPCO) anticoagulation management databases and electronic medical records. Our secondary objective is to devise a model to predict all-cause mortality.
Patients and settings
KPCO is a geographic section of a large national non-profit group model Health Maintenance Organization. KPCO is an integrated healthcare delivery system providing medical care to approximately 550 000 patients in the Denver-Boulder metropolitan area of Colorado in the USA.30 Patients who are members of KPCO are offered anticoagulation services by a centralised Clinical Pharmacy Anticoagulation Service (CPAS) (JA Pereira. Unpublished doctoral dissertation, 2008). CPAS clinical pharmacists initiate, adjust and refill warfarin and order relevant laboratory tests for patients, working in collaboration with the referring physicians and applying standardised dosing algorithms.30 In this study, data will also be obtained for patients with AF in KPCO who are not taking warfarin or managed by CPAS.
New diagnoses of AF will be determined using International Classification of Disease, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes 427.31 and 427.32 from the KPCO Virtual Data Warehouse (VDW) Diagnosis Database. The codes can be recorded either as primary or secondary diagnosis during an ambulatory KPCO medical office visit, emergency department (ED) visit or inpatient stay between 1 January 2005 and 31 December 2012 with no AF diagnosis in the previous 180 days. Patients with their continuous KPCO membership <180 days prior to AF diagnosis, or aged <18 years, will be excluded. Warfarin non-users will be excluded if they die before their assigned index date (defined below), and warfarin users will be excluded if they have a purchase of warfarin during the 180 days prior to their AF diagnosis or a supply from a warfarin purchase that overlaps into the 180 days.
Patients newly diagnosed with AF from 1 January 2005 to 31 December 2008 (KPCO-I, derivation cohort) will be used to construct the prediction models, while patients with a new AF diagnosis from 1 January 2009 to 31 December 2012 (KPCO-II, validation cohort) will be used for external validation of the prediction models.31 Each period is a 4-year block of time.
We define study start date as the date of AF diagnosis for each patient. Study diagnosis end date is defined as 31 December 2008 and 31 December 2012 for the derivation and validation cohort, respectively. Study outcome end date is 30 June 2009 and 30 June 2013 for the derivation and validation cohort, respectively. During the time period from study start date to study outcome end date, patients who have no less than one purchase of warfarin will be categorised into warfarin users, while patients with no purchase of warfarin will be considered warfarin non-users.
Given that warfarin users may not take warfarin immediately after the diagnosis of AF (ie, study start date), there is potential for immortal time bias in favour of warfarin.32 That is, warfarin users who do not take warfarin immediately after being diagnosed with AF have to be ‘immortal’ before their inception of warfarin, which would provide warfarin users with an artificial survival advantage over those never on warfarin and thus overestimate the benefit of warfarin.32 ,33 To control for immortal time bias, the study index date for warfarin users will be the first date of warfarin purchase after their AF diagnosis. Subsequently, warfarin non-users will be assigned an index date that corresponds to the length of time from the study start date to the index date for their randomly matched (on year of AF diagnosis) warfarin user.34 Warfarin non-users who die before their assigned index date will be excluded from the analysis. The length of time from study start date to the index date for warfarin users will be restricted to 180 days, in order to control the skewness of the distribution and achieve maximum matching.34 Therefore, warfarin users whose length of time from study start date to index date exceeds 180 days will also be excluded.
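The index date assignment described above can be sketched as follows. This is a minimal illustration, not the study's actual implementation: the record layout (dictionaries with `af_date`, `first_warfarin` and `death_date` keys), the function name and the choice to sample matched users with replacement are all assumptions made for the example.

```python
import random
from datetime import date, timedelta

MAX_LAG_DAYS = 180  # users with a longer lag to first purchase are excluded


def assign_index_dates(users, non_users, seed=0):
    """Return (kept_users, kept_non_users), each record with 'index_date'.

    users     : dicts with 'af_date' and 'first_warfarin' (datetime.date)
    non_users : dicts with 'af_date' and, optionally, 'death_date'
    Matching is on calendar year of AF diagnosis; sampling a matched user
    with replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    kept_users, lags_by_year = [], {}
    for u in users:
        lag = (u["first_warfarin"] - u["af_date"]).days
        if 0 <= lag <= MAX_LAG_DAYS:
            # Index date for a user is the first warfarin purchase.
            kept_users.append({**u, "index_date": u["first_warfarin"]})
            lags_by_year.setdefault(u["af_date"].year, []).append(lag)

    kept_non_users = []
    for n in non_users:
        lags = lags_by_year.get(n["af_date"].year)
        if not lags:
            continue  # no warfarin user diagnosed in the same year
        # Non-user inherits the lag of a randomly matched user.
        index_date = n["af_date"] + timedelta(days=rng.choice(lags))
        death = n.get("death_date")
        if death is not None and death < index_date:
            continue  # died before the assigned index date: excluded
        kept_non_users.append({**n, "index_date": index_date})
    return kept_users, kept_non_users
```

Because follow-up for both groups starts at the index date, the interval between diagnosis and first warfarin purchase no longer counts as event-free time for users only.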
All included patients will be followed up after the index date until death, termination from the KPCO system or the study outcome end date (30 June 2009 for KPCO-I, and 30 June 2013 for KPCO-II), whichever occurs first.
The events of interest, including stroke, major bleeding and death, will be identified after the index date until study outcome end date. For the primary objective, all the patients will be categorised as one of the three combined outcome groups: stroke with no major bleeding, major bleeding with no stroke or neither event. For the secondary objective, patients will be grouped into either survival group or non-survival group.
Some patients may have a diagnosis of stroke and/or major bleeding predating their index date. Stroke and/or bleeding before the index date will be considered as a risk factor to reflect comorbidity (ie, prior stroke, bleeding) rather than as an end point event of interest in this study. Meanwhile, there may be some patients experiencing both a stroke and major bleeding chronologically during follow-up. We choose the event that happens first as our outcome, and categorise the patient into the corresponding group, in order to include as many outcome events as possible.
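The first-event rule above can be expressed as a small classification function. This is a sketch under stated assumptions: the function name and inputs are hypothetical, events are counted only if they fall strictly after the index date and on or before the end of follow-up, and a same-day stroke and bleed is assigned to stroke purely as a placeholder because the protocol does not specify how ties are handled.

```python
from datetime import date


def classify_outcome(index_date, follow_up_end, stroke_dates, bleed_dates):
    """Return 'stroke', 'major_bleeding' or 'neither' for one patient."""

    def first_in_window(dates):
        # Events on or before the index date are risk factors, not outcomes.
        eligible = [d for d in dates if index_date < d <= follow_up_end]
        return min(eligible) if eligible else None

    first_stroke = first_in_window(stroke_dates)
    first_bleed = first_in_window(bleed_dates)
    if first_stroke is None and first_bleed is None:
        return "neither"
    if first_bleed is None:
        return "stroke"
    if first_stroke is None:
        return "major_bleeding"
    # Both occurred: the earlier event defines the group.
    # Same-day ties go to stroke (placeholder assumption, see lead-in).
    return "stroke" if first_stroke <= first_bleed else "major_bleeding"
```

A patient whose only stroke predates the index date is therefore classified as 'neither', with the prior stroke entering the model as a comorbidity.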
All stroke and major bleeding events will be administratively identified from the VDW Diagnosis Database recorded in an ambulatory KPCO medical office visit, ED visit or inpatient stay after the index date until outcome end date using ICD-9-CM codes. Table 1 shows the ICD-9-CM codes for stroke and major bleeding used in this study. Stroke events will be identified based on ICD-9-CM codes 433.xx, 434.xx, 436.xx in this study, in which the codes have been validated in a KPCO study with high positive predictive values.35 Bleeding which results in an ED visit requiring a transfusion, or an admission to hospital will be considered as major bleeding.36 We will not identify the major bleeding that causes a fall in haemoglobin level of at least 20 g/L but does not require a transfusion,37 because no data on haemoglobin for the patients during follow-up are available in this study. All ICH will be counted as major bleeding rather than strokes. Information on deaths for included patients will be obtained from the VDW Death Database.
All patients’ demographic factors, laboratory data, comorbidities, warfarin medication data and concurrent use of medication will be retrieved from the administrative KPCO databases and patients’ electronic medical record. The KPCO databases link patients’ pharmacy profiles, such that patients’ medications information including warfarin-related data can be accessed. Approximately 94% of KPCO prescriptions are purchased at in-house pharmacies.
Specifically, in this study, demographic data include patients’ gender and baseline age. Laboratory data include INR, haemoglobin, serum creatinine and albumin, where all the measures are most proximal to but before the index date.
For comorbidities, data recorded in an ambulatory KPCO medical office visit in the 180 days prior to the index date will be obtained. We will retrieve the comorbidity data including the components of the CHA2DS2-VASc and HAS-BLED schemes, as well as the comorbidities included in the Charlson Comorbidity Index,38 in order to obtain a comprehensive list of comorbidities as potential independent variables. Concretely, data on comorbidities include the presence of congestive heart failure, hypertension, diabetes, prior stroke/transient ischemic attack, myocardial infarction, peripheral vascular disease, renal disease, liver disease, prior major bleeding (including GI bleeding and ICH), concomitant use of antiplatelets or non-steroidal anti-inflammatory drugs (NSAIDs), alcohol abuse, other cerebrovascular disease, dementia, peptic ulcer disease, chronic pulmonary disease, rheumatic disease, AIDS, hemiplegia or paraplegia, and any malignancy (including lymphoma and leukaemia and metastatic solid tumour, except malignant neoplasm of skin). Table 2 shows the ICD-9-CM codes for the comorbidities analysed in this study.
Warfarin medication data include the presence of sold prescriptions for warfarin from a KPCO pharmacy between the index date and outcome end date. Length of time in days from study start date to the purchase date will be recorded. In addition, data on each sold warfarin prescription's length of time from index date and days of medication supplied will be obtained.
For concurrent medication data, the presence of a sold prescription for the purchases during the 90 days after the index date will be recorded. Our analysis will be restricted to those concomitant medications for which there is evidence of an interaction that potentiates or inhibits the effect of warfarin, based on the systematic review of interactions of warfarin with other drugs.39 ,40 Table 3 displays the complete medication list used in this study, which includes other anticoagulants, antiplatelets, NSAIDs and selected representatives of other families including anti-infective agents, cardiac drugs, central nervous system drugs, GI drugs, etc.
All data will be examined on a descriptive basis and presented as the mean±SD for continuous variables, and frequency and percentages for categorical variables. Student t test will be used to compare continuous variables, and χ2 test will be applied for categorical variables between warfarin users and non-users. Unless otherwise specified, all statistical tests will be two-sided using an α level of 0.05.
Since there are three multinomial levels for the primary outcomes (stroke with no major bleeding, major bleeding with no stroke or neither event), we will use polytomous logistic regression (PLR) to develop a prediction model for patients’ individual combined benefit and harm outcomes. Using neither event group as the reference group, two models will be constructed to predict stroke with no major bleeding, and major bleeding with no stroke, respectively. As regards death, we will use Cox proportional hazards regression analysis to build a prediction model for all-cause mortality.
For logistic regression, a fitted model is likely to be reliable and stable when the number of participants with the outcome (ie, either stroke with no bleeding, or bleeding with no stroke, or bleeding with stroke) is 10–15 times the number of predictor variables.41 ,42 We anticipate that about 10 predictors will be included into the PLR model maximally; therefore 100 stroke with no major bleeding, and 100 major bleeding with no stroke will be required to devise the PLR models in the derivation cohort.
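The arithmetic behind this sample size requirement is simple enough to write out. The helper below is a hypothetical illustration of the events-per-variable rule cited above, evaluated at both ends of the 10–15 range.

```python
def required_events(n_predictors, events_per_variable=10):
    """Minimum number of outcome events under an events-per-variable rule."""
    return n_predictors * events_per_variable


# About 10 predictors, at 10 and at 15 events per predictor.
lower_bound = required_events(10, events_per_variable=10)
upper_bound = required_events(10, events_per_variable=15)
```

The figure of 100 events per outcome group stated above corresponds to the lower end of the range; the stricter end of the rule would call for 150 events per group.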
Face validity of the KPCO data
We will show the trend in the incidence rates of stroke in KPCO-I and KPCO-II stratified by CHA2DS2-VASc score, and in the incidence rates of major bleeding stratified by HAS-BLED score. This will allow us to judge the validity of the KPCO cohort for patients with AF by checking that it conforms to previously validated clinical prediction rules.
For primary outcomes in patients with AF in the KPCO-I cohort, candidate predictors will first be pruned, because predictors that are highly correlated with others contribute little independent information.43 Multicollinearity between predictors inflates the SEs of the coefficients in the model, which may drive some predictors away from statistical significance. To avoid this, a variance inflation factor threshold of 4 will be used to determine whether predictors are redundant and highly correlated.44 Subsequently, for each pairwise comparison of outcome groups (ie, stroke with no major bleeding vs neither event, major bleeding with no stroke vs neither event), univariate logistic regression will be performed to select variables for multivariable regression, using a liberal α level of 0.20 for the χ2 tests so that potentially important predictors are not excluded. After taking multicollinearity into account and selecting significant variables based on univariate logistic regression, PLR will be used to build the prediction models.
To investigate whether warfarin can modify the effect of other predictors in the PLR models on stroke and major bleeding, all the two-way interactions between warfarin status (ie, with or without warfarin) and other predictors will be tested. Significant interactions with a priori α value of ≤0.05 between warfarin and other predictors will be retained and added into the prediction models. Moreover, given the importance and potential interactions of the predictors composing the CHA2DS2-VASc and HAS-BLED schemes (eg, sex, age, hypertension), we will also evaluate the two-way interactions along with their main effect terms if they are kept in our PLR models.45 Then the significant interactions with an α value of ≤0.05 will be included to update and finalise the PLR models. For instance, if hypertension and age are included in our PLR models, we will also test the significance of their two-way interaction (hypertension×age) before choosing this interaction into the finalised PLR models.
For the secondary outcome, a Cox regression model will be used to build the prediction model for death. Procedures similar to those for the primary outcomes will be followed: variables without multicollinearity that are significant predictors in univariate analysis will make up the model, and significant interactions (warfarin×other predictors, two-way interactions of the predictors composing the CHA2DS2-VASc and HAS-BLED schemes) will then be included to finalise the model to predict death. A statistical test and a graphical examination using Schoenfeld residuals will be carried out to assess the proportional hazards assumption.46
For missing data, if <10% of observations on a variable are missing, the mean or median of the variable in its group will be used for imputation. If no less than 10% of data are missing, assuming they are missing at random, multiple imputations will be performed using clinical judgement to identify factors to be included in the imputation model.47 ,48 If multiple imputations are used, as a sensitivity analysis, the obtained PLR results will be compared with the original PLR models with missing data.
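The 10% decision rule for missing data can be sketched as below. This is an illustrative assumption about how the rule might be coded, not the study's procedure: missing values are represented as `None`, the function name is hypothetical, and the function only handles the simple-imputation branch, returning `None` to signal that multiple imputation should be used instead.

```python
from statistics import mean, median


def impute_simple(values, how="mean", threshold=0.10):
    """Fill missing entries (None) within one group of patients.

    Returns the completed list when fewer than `threshold` of the values
    are missing; returns None when the missing fraction is at or above
    the threshold, where the protocol calls for multiple imputation.
    """
    n_missing = sum(v is None for v in values)
    if n_missing / len(values) >= threshold:
        return None  # hand over to a multiple imputation procedure
    observed = [v for v in values if v is not None]
    fill = mean(observed) if how == "mean" else median(observed)
    return [fill if v is None else v for v in values]
```

The function would be applied separately to each variable within each group, so that the fill value reflects the group mean or median as described above.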
Since there may be gaps in the consumption of warfarin for the patients during follow-up, another sensitivity analysis will be conducted using warfarin as a time-dependent covariate, to investigate whether the effect of warfarin is robust on stroke and major bleeding in the PLR model, and death in the Cox model, respectively.49 We will use the gap of >30 days between the last day when a previous purchase is expected to run out and the first day of the next purchase, to indicate warfarin discontinuation for warfarin users.
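The 30-day gap rule for warfarin discontinuation could be operationalised roughly as follows. The sketch makes several assumptions that the protocol does not state: each purchase is a (purchase date, days supplied) pair, the run-out day is taken as the last day covered by the supply, and early refills are not carried forward as stockpiled medication.

```python
from datetime import date, timedelta


def discontinuation_dates(purchases, max_gap_days=30):
    """Return run-out dates that are followed by a gap of > max_gap_days.

    purchases : list of (purchase_date, days_supplied) tuples, any order
    """
    ordered = sorted(purchases)
    stops = []
    for (bought, supply), (next_bought, _) in zip(ordered, ordered[1:]):
        # Last day covered by this purchase (assumption: supply starts
        # on the purchase date itself).
        run_out = bought + timedelta(days=supply - 1)
        if (next_bought - run_out).days > max_gap_days:
            stops.append(run_out)
    return stops
```

Each returned date marks the point at which a time-dependent warfarin indicator would switch from exposed to unexposed, until the date of the next purchase.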
Moreover, to investigate whether the predictors are sensitive after taking all-cause death (as a competing risk of stroke and major bleeding) into account, we will perform a competing risk analysis to obtain the hazard functions for stroke and major bleeding separately. Two proportional subdistribution hazards models of the Fine and Gray method50 will be constructed for stroke and major bleeding, respectively. In the competing risk analysis, patients who die ahead of an event of stroke or major bleeding will be left in the risk set with decreasing weight to account for declining observability, rather than being treated as simple censoring.50 Predictors and their coefficients in the proportional subdistribution hazards models will be used to compare with those in the PLR models.
To assess calibration of the PLR models for primary outcomes, we will compare the predicted risks of stroke with no major bleeding, and major bleeding with no stroke to the observed event rates in different deciles of predicted risks.51 ,52 Differences between predicted and observed event rates will be used to calculate a Hosmer-Lemeshow statistic, where a non-significant result indicates no evidence of lack of fit to the data. To assess discriminability, we will calculate the area under the two receiver operating characteristic curves (AUC) for each pair of comparison: stroke with no major bleeding versus neither event, and major bleeding with no stroke versus neither event.
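As a rough illustration of the two measures named above, the following sketch computes a Hosmer-Lemeshow statistic over risk groups and an AUC for a binary comparison. The function names are hypothetical, and the code assumes the data have already been restricted to one pairwise comparison (for example, stroke with no major bleeding coded 1 and neither event coded 0). The p value, which requires a χ2 distribution with (groups − 2) degrees of freedom, is not computed here.

```python
def hosmer_lemeshow(probs, labels, groups=10):
    """Hosmer-Lemeshow statistic over equal-sized groups of predicted risk."""
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # predicted number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / size
        # Assumes 0 < mean_p < 1 in every group.
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat


def auc(scores, labels):
    """Probability that a random case scores higher than a random non-case."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

A statistic close to zero indicates that predicted and observed event counts agree within each risk group, while an AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.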
Regarding the Cox model for all-cause death, we will evaluate the model calibration by comparing the predicted risk of death and observed rates across each 10th of the observed risk,51 ,52 where the observed risk will be calculated using the Kaplan-Meier product-limit estimate. Goodness-of-fit of the model will be investigated using a Gronnesby and Borgan test with 10 groups according to the predicted risk score, in which a non-significant result implies the model is a good fit.53 Harrell’s C index will be calculated to assess the discriminability of the model.54
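Harrell's C index can be illustrated with a direct pairwise count. This is a simplified sketch: the function name is hypothetical, pairs with tied follow-up times are ignored, and tied risk scores count as half concordant, which are common conventions but assumptions as far as this protocol is concerned.

```python
def harrell_c(times, events, risk_scores):
    """Concordance between predicted risk and observed survival times.

    times       : follow-up time for each patient
    events      : 1 if the patient died, 0 if censored
    risk_scores : higher score means higher predicted risk of death
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is usable only if the earlier time is a death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

In a Cox model the linear predictor can serve as the risk score, since a higher value corresponds to a higher hazard of death.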
For a typical newly diagnosed patient with AF, we will input his/her individual information into the PLR models and calculate the probability of stroke with no major bleeding, major bleeding with no stroke and neither event, respectively.55 We can also calculate his/her probability of death using the Cox regression model.
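Given fitted PLR coefficients, the three outcome probabilities follow from the standard multinomial logit form with 'neither event' as the reference category. The sketch below shows the calculation only; the function name, the covariate layout and the coefficient values are hypothetical placeholders, not estimates from this study.

```python
import math


def plr_probabilities(x, beta_stroke, beta_bleed):
    """Predicted probabilities for one patient from a fitted PLR model.

    x           : covariate values, with a leading 1 for the intercept
    beta_stroke : coefficients for stroke with no major bleeding vs neither
    beta_bleed  : coefficients for major bleeding with no stroke vs neither
    """
    eta_stroke = sum(b * v for b, v in zip(beta_stroke, x))
    eta_bleed = sum(b * v for b, v in zip(beta_bleed, x))
    denom = 1.0 + math.exp(eta_stroke) + math.exp(eta_bleed)
    return {
        "stroke": math.exp(eta_stroke) / denom,
        "major_bleeding": math.exp(eta_bleed) / denom,
        "neither": 1.0 / denom,
    }


# Hypothetical coefficients: intercept, warfarin (1 = user), age per decade.
beta_s = [-4.0, -0.8, 0.3]
beta_b = [-4.5, 0.6, 0.2]
on_warfarin = plr_probabilities([1, 1, 7.5], beta_s, beta_b)
off_warfarin = plr_probabilities([1, 0, 7.5], beta_s, beta_b)
```

Evaluating the same patient with the warfarin indicator set to 1 and to 0 gives the paired 'with and without warfarin' probabilities that the model is intended to supply.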
As internal validation of the PLR models, a 10-fold cross-validation and a bootstrap analysis with 1000 resamples drawn with replacement will be conducted to assess the models’ validity. The AUCs of the original PLR models will be compared with those of cross-validation, while coefficients from the original PLR models will be contrasted with those from bootstrap models.
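The two resampling schemes can be sketched in terms of patient indices alone. Both helpers are hypothetical illustrations: they only generate the index sets, and the model fitting and AUC or coefficient comparison on each set would be done with whatever statistical software is used for the PLR models.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Split patient indices 0..n-1 into k disjoint, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def bootstrap_indices(n, n_resamples=1000, seed=0):
    """Yield n_resamples index lists, each drawn with replacement."""
    rng = random.Random(seed)
    for _ in range(n_resamples):
        yield [rng.randrange(n) for _ in range(n)]
```

In cross-validation each fold is held out once while the model is refitted on the remaining nine; in the bootstrap the model is refitted on each resample and the spread of the coefficients is compared with the original estimates.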
KPCO-II cohort will be used for external validation of the PLR models. Because the incidences of stroke and major bleeding in KPCO-I and KPCO-II cohort may be different, we will update the original models for the validation cohort.55–57 Then the assessment of calibration, goodness-of-fit and discriminability will be again performed in KPCO-II cohort. We will also use KPCO-II cohort to externally validate the prediction model for death.
Ethics and dissemination
Results from this study will be published in a peer-reviewed journal electronically and in print. The prediction models can provide comprehensive information on the individual combined benefit and harm with and without warfarin for patients with AF, which may aid in patient-physician shared decision-making when they are considering warfarin therapy. For warfarin users, the models may also help enhance medication adherence after warfarin therapy is initiated in the real world, once patients are clear about their individual predicted risk of outcomes.
We thank Dr Jennifer A Pereira for her substantial preparatory contribution in her unpublished doctoral thesis (JA Pereira. Unpublished doctoral dissertation, 2008) to this study.
Contributors GL, AH, TD, DMW and LT were responsible for the study conception and design. GL, AH and LT were responsible for the drafting of the manuscript. AH, TD, DMW, MAHL and LT made critical revisions and provided professional and statistical support. All authors approved the final version of the manuscript.
Funding GL received a Father Sean O’Sullivan Research Award, the Research Institute of St. Joe’s Hamilton, and a doctoral award from the CSC.
Competing interests None declared.
Ethics approval The KPCO Institutional Review Board (No.: CO-14-2093) and the Hamilton Integrated Research Ethics Board (No.: 14–394-C).
Provenance and peer review Not commissioned; externally peer reviewed.