Objective The majority of cardiovascular diagnoses in the Danish National Patient Registry (DNPR) remain to be validated despite extensive use in epidemiological research. We therefore examined the positive predictive value (PPV) of cardiovascular diagnoses in the DNPR.
Design Population-based validation study.
Setting One university hospital and two regional hospitals in the Central Denmark Region, 2010–2012.
Participants For each cardiovascular diagnosis, up to 100 patients from participating hospitals were randomly sampled during the study period using the DNPR.
Main outcome measure Using medical record review as the reference standard, we examined the PPV for cardiovascular diagnoses in the DNPR, coded according to the International Classification of Diseases, 10th Revision.
Results A total of 2153 medical records (97% of the total sample) were available for review. The PPVs ranged from 64% to 100%, with a mean PPV of 88%. The PPVs were ≥90% for first-time myocardial infarction, stent thrombosis, stable angina pectoris, hypertrophic cardiomyopathy, arrhythmogenic right ventricular cardiomyopathy, takotsubo cardiomyopathy, arterial hypertension, atrial fibrillation or flutter, cardiac arrest, mitral valve regurgitation or stenosis, aortic valve regurgitation or stenosis, pericarditis, hypercholesterolaemia, aortic dissection, aortic aneurysm/dilation and arterial claudication. The PPVs were between 80% and 90% for recurrent myocardial infarction, first-time unstable angina pectoris, pulmonary hypertension, bradycardia, ventricular tachycardia/fibrillation, endocarditis, cardiac tumours and first-time venous thromboembolism, and between 70% and 80% for first-time and recurrent admission due to heart failure, first-time dilated cardiomyopathy, restrictive cardiomyopathy and recurrent venous thromboembolism. The PPV for first-time myocarditis was 64%. The PPVs were consistent across age, sex, calendar year and hospital categories.
Conclusions The validity of cardiovascular diagnoses recorded in the DNPR since 2010 is high overall and sufficient for use in research.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
This is the first validation study to include all major cardiovascular diagnoses in the Danish National Patient Registry.
We sampled patients only from hospitals in the Central Denmark Region. However, our results are most likely generalisable to other parts of the country as the Danish healthcare system is homogeneous in structure and practice.
We validated only diagnoses recorded during 2010–2012 and therefore cannot extrapolate our results to earlier periods.
Remarkable improvements have occurred in the prevention and treatment of cardiovascular diseases during recent decades.1–4 Still, cardiovascular diseases remain a leading cause of death worldwide,5 underscoring the need for further research. Registries constitute an important source of data for cardiovascular research in Denmark. The key registry is the Danish National Patient Registry (DNPR),6 which contains long-term longitudinal data, prospectively collected since 1977. The registry has nationwide coverage of a homogeneous healthcare system with free and equal access and holds the possibility of individual-level data linkage with other registries.7,8 However, the quality of registry-based research largely depends on the validity of the diagnostic codes used. Existing validation studies for cardiovascular diagnoses in the DNPR have been limited to relatively few diagnoses.6 We therefore conducted a validation study to examine the positive predictive value (PPV) of diagnoses in the DNPR for all major cardiovascular diseases.
Denmark is divided into five regions, each of which is representative of the Danish population with respect to demographic and socioeconomic characteristics as well as healthcare usage and medication use.9 Each region typically has one major university hospital (including a high volume cardiac centre) and several smaller regional hospitals. The Danish National Health Service provides free universal tax-supported healthcare, guaranteeing unfettered access to general practitioners and hospitals.6
We used the DNPR to randomly sample inpatient and outpatient hospital diagnoses from the Central Denmark Region between 1 January 2010 and 31 December 2012. The Central Denmark Region has a source population of 1.2 million inhabitants. Within the Central Denmark Region, we sampled specifically from the university hospital (Aarhus University Hospital) and two regional hospitals (Regional Hospitals of Randers and Herning).8 The DNPR has recorded data on dates of admission and discharge from all Danish non-psychiatric hospitals since 1977 and on dates of emergency room and outpatient clinic visits since 1995.6 Each hospital discharge or outpatient visit is recorded with one primary diagnosis and one or more secondary diagnoses classified according to the International Classification of Diseases, 8th Revision (ICD-8) until the end of 1993 and 10th Revision (ICD-10) thereafter.6
Our study population consisted of patients discharged with a primary or secondary first-time diagnosis from departments of cardiology, internal medicine, acute medicine and neurology in the three hospitals. For myocardial infarction, heart failure and venous thromboembolism, we also validated recurrent events. For most diseases, both inpatient and outpatient diagnoses were included (see online supplementary table S1). However, for diseases expected only to be diagnosed at inpatient admission (eg, myocardial infarction, aortic dissection, cardiac arrest), we only sampled inpatient diagnoses to avoid potential misclassification. Up to 100 patients were sampled from the DNPR for each of the diagnoses, which included first-time acute myocardial infarction (subsequently stratified by ST-elevation (STEMI) and non-ST-elevation myocardial infarction (NSTEMI)), recurrent myocardial infarction, stent thrombosis, stable angina pectoris, unstable angina pectoris, first-time heart failure, heart failure readmission, arterial hypertension, pulmonary hypertension, atrial fibrillation or flutter, bradycardia, ventricular tachycardia or fibrillation, cardiac arrest with indication for resuscitation, endocarditis, myocarditis, pericarditis, first-time venous thromboembolism (subsequently stratified by deep venous thrombosis and pulmonary embolism), recurrent venous thromboembolism (subsequently stratified by deep venous thrombosis and pulmonary embolism), arterial claudication, hypercholesterolaemia and cardiac tumours. We sampled up to 100 cases for cardiomyopathy (by sampling 20 diagnoses each for dilated, hypertrophic, restrictive, arrhythmogenic right ventricular and takotsubo cardiomyopathy), valvular heart disease (sampling 50 diagnoses each for mitral valve regurgitation or stenosis, and aortic valve regurgitation or stenosis) and aortic diseases (sampling 50 diagnoses each for aortic dissection and aneurysm/dilation).
Recurrent myocardial infarction and readmission due to heart failure were defined as the first readmission after the initial diagnosis. Sampling of first-time and recurrent events was independent. Hence, recurrent events could potentially include patients also included in the random sample for validation of first-time events. To avoid situations in which a transfer from one department to another was registered as a new diagnosis, we required that patients had been discharged for >24 hours before a readmission could be registered as a true recurrent event. Bradycardia was defined as sinus node dysfunction or atrioventricular block. For venous thromboembolism, we defined recurrent events as admissions occurring >3 months after the initial diagnosis, as guidelines recommend at least 3 months of anticoagulant therapy following venous thromboembolism.10 All ICD codes used in the study are provided in online supplementary table S1. The patients were sampled using SAS V.9.2 (SAS Institute, Cary, North Carolina, USA).
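The sampling cap and the time-gap rules for recurrent events described above can be sketched in code. The following is an illustrative Python sketch only, not the SAS program used in the study; the function names, the fixed random seed and the use of 90 days as a stand-in for "3 months" are assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

# From the text: a readmission counts only after >24 hours out of hospital.
READMISSION_MIN_GAP = timedelta(hours=24)
# Assumption: "3 months" is approximated as 90 days for illustration only.
VTE_MIN_GAP = timedelta(days=90)


def sample_up_to(patient_ids, n=100, seed=0):
    """Randomly sample up to n distinct patients for one diagnosis.

    Hypothetical helper: if fewer than n patients exist (rare diagnoses),
    all of them are returned.
    """
    ids = sorted(set(patient_ids))
    if len(ids) <= n:
        return ids
    return sorted(random.Random(seed).sample(ids, n))


def exceeds_gap(earlier, later, min_gap):
    """Return True if `later` falls strictly more than `min_gap` after `earlier`."""
    return later - earlier > min_gap
```

For heart failure or myocardial infarction, `earlier` would be the previous discharge time and `min_gap` the 24-hour rule; for venous thromboembolism, `earlier` would be the date of the initial diagnosis and `min_gap` the 3-month rule.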
Medical record review
Medical record review was used as the reference standard. We did not have access to ECGs or other paraclinical recordings that supported the clinician's decision. However, descriptions of such recordings were available in the medical records and included in the review process. Three physicians (JS, KA and TM) reviewed the medical records and judged whether they confirmed the cardiovascular diagnosis coded in the DNPR. If the diagnosis was not described in the discharge summary or if the discharge summary was not available, the full medical record was reviewed to examine whether the diagnosis code could be confirmed. Review of the discharge summary/medical records began with confirmation of the Civil Personal Register number (unique personal identifier) and discharge date for each hospital contact retrieved from the DNPR. The diagnoses from the discharge summary and/or medical records were then compared with the diagnoses in the DNPR. Events coded in the DNPR as recurrent were considered correct if they were truly new events (for myocardial infarction and venous thromboembolism) and for heart failure if the readmission was due to a heart failure exacerbation. If the reviewing physician was uncertain whether the discharge summary or medical record agreed with the ICD-10 code, a second independent review was performed by one of the two other physicians. In case of disagreement, a consensus agreement was reached.
For each diagnosis, we computed the PPV with 95% CIs according to the Wilson score method.11 The PPV was computed as the proportion of diagnoses retrieved from the DNPR that could be confirmed in the discharge summary or medical record. For venous thromboembolism (including deep venous thrombosis and pulmonary embolism), we recalculated the PPVs for patients having an ultrasound and/or CT scan recorded in the registry during the index admission and for those who had neither of these registered. To calculate the mean PPV for all cardiovascular diseases, we divided the total number of correct cases by the total number of validated cases. We stratified the analyses by age group (<60 years, 60–80 years and >80 years), sex, calendar year (2010, 2011 and 2012), hospital type (regional or university hospital), type of diagnosis (primary or secondary) and type of hospital contact (inpatient or outpatient). Furthermore, we performed subgroup analyses for myocardial infarction (STEMI and NSTEMI diagnoses) and first-time and recurrent venous thromboembolism (deep venous thrombosis and pulmonary embolism diagnoses).
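As an illustration of the calculations described above, the following Python sketch computes a PPV with a Wilson score 95% CI, a pooled (mean) PPV as total confirmed cases divided by total validated cases, and PPVs within strata. It is a minimal sketch under stated assumptions, not the authors' analysis code: the function names and the record layout (dictionaries with a `confirmed` flag) are hypothetical.

```python
import math
from collections import defaultdict


def wilson_ci(confirmed, total, z=1.959964):
    """Wilson score interval for a proportion (z=1.959964 gives a 95% CI)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = confirmed / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def pooled_ppv(counts):
    """Pooled PPV from (confirmed, validated) pairs, one pair per diagnosis."""
    counts = list(counts)
    confirmed = sum(c for c, _ in counts)
    validated = sum(n for _, n in counts)
    return confirmed / validated


def stratified_ppv(records, key):
    """PPV per stratum; each record is a dict with `key` and a boolean 'confirmed'."""
    tallies = defaultdict(lambda: [0, 0])  # stratum -> [confirmed, validated]
    for rec in records:
        tallies[rec[key]][1] += 1
        if rec["confirmed"]:
            tallies[rec[key]][0] += 1
    return {s: (c, n, c / n) for s, (c, n) in tallies.items()}
```

With hypothetical counts of 45 confirmed among 50 validated diagnoses, `wilson_ci(45, 50)` gives an interval of roughly 0.79 to 0.96 around the point estimate of 0.90. Unlike the simple normal approximation, the Wilson interval stays within 0 to 1 even when every sampled diagnosis is confirmed.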
We identified 2212 patients from the DNPR with cardiovascular diagnoses during 2010–2012. Medical records were available for 2153 patients (97% of the total sample). For the most common diseases, 100 patients were sampled; for rare diseases, fewer patients were available for sampling (figure 1). PPVs ranged between 64% and 100% with a mean PPV of 88%. PPVs were ≥90% for first-time myocardial infarction (including STEMI and NSTEMI), stent thrombosis, stable angina pectoris, hypertrophic cardiomyopathy, arrhythmogenic right ventricular cardiomyopathy, takotsubo cardiomyopathy, arterial hypertension, atrial fibrillation or flutter, cardiac arrest with indication for resuscitation, mitral valve regurgitation or stenosis, aortic valve regurgitation or stenosis, pericarditis, hypercholesterolaemia, aortic dissection, aortic aneurysm/dilation and arterial claudication (figure 1). Of the cardiac arrests, 57% occurred out of hospital, 30% occurred in hospital and 13% were of undetermined location. The remaining PPVs were between 80% and 90% for recurrent myocardial infarction, unstable angina pectoris, pulmonary hypertension, bradycardia, ventricular tachycardia or fibrillation, endocarditis, cardiac tumours and first-time venous thromboembolism; between 70% and 80% for first-time and recurrent admission for heart failure, dilated cardiomyopathy, restrictive cardiomyopathy and recurrent venous thromboembolism; and 64% for myocarditis. The PPV for venous thromboembolism improved when the following additional criteria were applied: receipt of CT or ultrasound scan during hospitalisation (PPV=91%), and receipt of both a CT and ultrasound scan during hospitalisation (PPV=100%; table 1).
The PPVs were consistent within age, sex, calendar year and hospital categories (tables 2 and 3). The stratified analyses by type of diagnosis and type of hospital contact revealed that the main results were driven by primary diagnoses from inpatient admissions. Thus, primary and inpatient diagnoses occurred most frequently, and the PPVs associated with these diagnosis types overall tended to be higher than for secondary and outpatient diagnoses (table 4).
The DNPR accurately recorded diagnoses of the most common cardiovascular diseases during 2010–2012, with the PPV exceeding 90% for myocardial infarction, arterial hypertension, atrial fibrillation or flutter, valvular heart disease, aortic diseases and first-time venous thromboembolism. As an exception among the most frequent diseases, the PPV for heart failure was lower. For less common conditions, the PPV varied from 64% for myocarditis to 100% for takotsubo cardiomyopathy. The PPV for recurrent myocardial infarction was 88%, but somewhat lower for readmission for heart failure (76%) and recurrent venous thromboembolism (72%). The lower PPVs for recurrent events are most likely influenced by secondary recordings of the initial event as part of follow-up visits or during successive hospital contacts without the occurrence of a truly new event. The results were consistent across age, sex and calendar year categories.
This is the first validation study to include all major cardiovascular diagnoses in the DNPR. Comparing our results with previous Danish validation studies, it is apparent that the PPVs have improved over time for many cardiovascular diagnoses in the DNPR.6 This may be explained by increased awareness of correct coding, implementation of clear guidelines and definitions of individual diseases, and improved availability of diagnostic modalities.6 Thus, the PPV of coding has improved for myocardial infarction (PPV=100% during 1996–2009,12 98% during 1998–2007,13 92% during 1993–200314 and 93% during 1982–199115), arterial hypertension (PPV=88% during 1977–201016 and ≈50% during 1990–199317) and first-time venous thromboembolism (PPV=90% during 2004–201218 and 75% during 1994–200619). The PPVs were overall in line with previous studies for heart failure (PPV=78% during 2005–200720 and 100% during 1998–200713), atrial fibrillation or flutter (PPV=94% during 1993–2009,21 99% during 1980–200222 and 97% during 1980–200223) and recurrent venous thromboembolism (PPV=79% during 2004–2012 with CT or ultrasound scan during admission and anticoagulant treatment 30 days after admission18). Previous studies reported markedly lower PPVs than our findings for unstable angina pectoris (PPV=42% during 1993–200314) and cardiac arrest (PPV=50% during 1993–200314). The finding of lower PPV for cardiac arrest in the previous study14 may be explained by a small sample size (n=42) and their inclusion of emergency department and outpatient diagnoses, whereas we restricted to inpatient diagnoses. Moreover, the previous study is more than 10 years old and changes in coding practice may also account for part of the difference.
For unstable angina pectoris,14 the study period of the previous study ended in 2003, that is, shortly after the redefinition of myocardial infarction in 2000, which included troponin release as an absolute criterion.24 This made the discrimination between unstable angina pectoris and myocardial infarction easier and most likely explains the higher PPV found in our study. To the best of our knowledge, the validity of the remaining diagnoses included in our study has not been assessed before.
Several limitations should be considered. Cautious interpretation of the PPV is warranted for diagnoses with sample sizes below 100. These include subgroups of an overall diagnosis and rare diagnoses with fewer than 100 cases diagnosed during the study period. Original recordings from diagnostic modalities such as ECG and echocardiography were not available. Therefore, the confirmation of the diagnoses was based solely on descriptions of such recordings included in the discharge summary or medical record. This limits the rigour of case validation and could also lead to different interpretations between reviewers. We examined patients admitted to hospitals only in the Central Denmark Region. However, our results are most likely generalisable to other parts of the country as the Danish healthcare system is homogeneous in structure and practice.9 Although some diagnoses (eg, myocardial infarction) have shown consistently high validity across countries despite different registry types and coding systems,6,12,25 it should be noted that our findings may not per se be generalisable to all countries where coding systems, coding practice, disease definitions and diagnostics differ.
In this study, we chose the PPV as the measure of validity. The PPV is correlated with disease prevalence and is dependent on specificity. However, sensitivity, specificity and negative predictive value could not be calculated because the data were sampled from the codes pertinent to the diagnosis of interest. The importance of the different measures of data quality depends on the study question and thus the design. A high PPV is important, for example, when identifying patient cohorts in prognosis studies, but cannot stand alone, for example, when identifying disease incidence. Future studies identifying cardiovascular diseases from diagnoses in the DNPR should consider the possibility that differential misclassification may occur between exposure groups (eg, if the exposure is diabetes, these patients may be more prone to have a given outcome registered due to detection bias and hence have a falsely increased risk of the outcome). Also, since diagnoses were only validated during 2010–2012, we cannot necessarily extrapolate our results to previous periods due to potential temporal differences in PPVs as exemplified above.
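The dependence of the PPV on prevalence and specificity noted above follows from Bayes' theorem. As a reminder of the standard relationship (the symbols are our notation for illustration, not taken from the study): with TP and FP the numbers of true-positive and false-positive registered diagnoses, Se the sensitivity, Sp the specificity and π the disease prevalence in the source population,

```latex
\mathrm{PPV}
  = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}
  = \frac{\mathrm{Se}\,\pi}{\mathrm{Se}\,\pi + (1 - \mathrm{Sp})\,(1 - \pi)}
```

Because sampling started from registered diagnosis codes, only TP and FP are observed here; Se, Sp and π would require information on patients without the code, which is why only the left-hand form could be estimated.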
The validity of cardiovascular diagnoses recorded in the DNPR since 2010 is high overall and, for the vast majority of diseases, sufficient for use in research.
The authors thank Hanne Moeslund Madsen and Henriette Kristoffersen for practical assistance.
Contributors MS, HTS, JS and KA conceived the study idea and designed the study. TF sampled the patients. JS, KA and TM reviewed all medical records. JS performed the statistical analysis. All authors analysed and interpreted the data. JS wrote the initial draft. All authors critically revised the manuscript for important intellectual content and approved the final version.
Funding This work was supported by the Department of Clinical Epidemiology and the Programme for Clinical Research Infrastructure (PROCRIN) established by the Lundbeck Foundation and the Novo Nordisk Foundation.
Competing interests None declared.
Ethics approval In accordance with Danish law governing analysis of registry data, no Ethics Committee approval was required. The study was approved by the Danish Data Protection Agency (record number: 1-16-02-1-08) and the chief physicians of participating departments, as part of quality control.
Data sharing statement No additional data are available.