## Abstract

**Objective** This research proposes a model-based method to facilitate the selection of disease case definitions from validation studies for administrative health data. The method is demonstrated for a rheumatoid arthritis (RA) validation study.

**Study design and setting** Data were from 148 definitions to ascertain cases of RA in hospital, physician and prescription medication administrative data. We considered: (A) separate univariate models for sensitivity and specificity, (B) univariate model for Youden’s summary index and (C) bivariate (ie, joint) mixed-effects model for sensitivity and specificity. Model covariates included the number of diagnoses in physician, hospital and emergency department records, physician diagnosis observation time, duration of time between physician diagnoses and number of RA-related prescription medication records.

**Results** The most common case definition attributes were: 1+ hospital diagnosis (65%), 2+ physician diagnoses (43%), 1+ specialist physician diagnosis (51%) and 2+ years of physician diagnosis observation time (27%). Statistically significant improvements in sensitivity and/or specificity for separate univariate models were associated with (all p values <0.01): 2+ and 3+ physician diagnoses, unlimited physician diagnosis observation time, 1+ specialist physician diagnosis and 1+ RA-related prescription medication records (65+ years only). The bivariate model produced similar results. Youden’s index was associated with these same case definition criteria, except for the length of the physician diagnosis observation time.

**Conclusion** A model-based method provides valuable empirical evidence to aid in selecting a definition(s) for ascertaining diagnosed disease cases from administrative health data. The choice between univariate and bivariate models depends on the goals of the validation study and number of case definitions.

- administrative health data
- chronic disease
- diagnosis
- regression
- rheumatoid arthritis
- validation

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

### Strengths and limitations of this study

Studies about the validity (ie, sensitivity and specificity) of disease case definitions for administrative health data typically rely on descriptive methods to select one or more case definitions for use. Our study proposes and demonstrates a model-based method that provides empirical evidence to support case definition development.

Our method can be applied to diseases that are ascertained from diagnoses and/or prescription medication information in administrative health data, for which one or more validity measures are produced: sensitivity, specificity, positive predictive value, negative predictive value and summary measures such as Youden’s index.

A limitation of our method is that it cannot be applied to validation studies with a small number of case definitions (ie, <50 case definitions).

## Introduction

Administrative health data are widely used for research and surveillance studies because they are relatively inexpensive to access, cover entire populations and can be linked to create longitudinal patient-specific records of healthcare use. However, one limitation of administrative health data is their potentially low sensitivity and specificity for ascertaining patients with chronic diseases.1–4 Therefore, validation studies are an essential tool for assessing data quality. A validation study compares cases ascertained from administrative health data with clinically confirmed cases and produces accuracy estimates (eg, sensitivity and specificity) for one or more case definitions.4 5 Many studies routinely test multiple case definitions. For example, in the rheumatoid arthritis (RA) validation literature, several studies reported more than 40 case definitions.6–8

Selecting a single case definition from among the many that may be tested in a single study is not a straightforward process; sensitivity and specificity estimates often vary with case definition criteria such as the number of diagnosis codes, number of years of data used to ascertain cases and patient characteristics (eg, age and sex). Published guidelines recommend selecting a case definition by prioritising a single validity measure.9 Moreover, these guidelines recommend that validation studies report all case definitions and at least four validity measures.10 11 Thus, a single validation study may result in a large volume of case definition information. Many researchers rely solely on descriptive analyses to summarise these data and select a case definition from among those that are tested.1 3 6 However, the case definition with the highest diagnostic validity estimate may not be more accurate than the case definition with the next highest diagnostic validity estimate, due to sampling error in the estimates. Inferential methods can be used to support the selection of a case definition; they can provide valuable empirical evidence about the case definition criteria associated with validity estimates. However, to date, published guidelines have made no recommendations on the use of inferential methods to analyse diagnostic validity estimates.10 11

We propose a model-based method to facilitate the selection of disease case definitions from validation studies for administrative health data. The objectives are to: (A) test administrative health data criteria associated with the validity of case definitions, and (B) compare competing models applied to case definition validity estimates. The model-based method is demonstrated for an RA validation study.

## Methods

### Data source

Study data were from an RA validation study6 conducted using administrative data from 1 April 1991 to 31 March 2011 for patients from Ontario, Canada. Case definitions for administrative health data were developed using medical records for 450 patients from 18 rheumatology clinics as the gold standard. Physician billing claims, hospital discharge abstracts and emergency room (ER) records were used to develop case definitions for all patients; in addition, pharmacy data were used to develop case definitions for patients aged 65+ years.

The published study data reported on validity estimates for 61 case definitions. Validity estimates for an additional 87 case definitions not reported in the publication were provided by the first author. Thus, a total of 148 case definitions were available for analysis. Of this number, 57 case definitions (38.5%) were tested for individuals 20+ years and 91 case definitions (61.5%) were tested for individuals 65+ years. All case definitions tested in the 20+ years age group were also tested in the 65+ years age group. The remaining 34 case definitions for the 65+ years age group included prescription medication criteria.

### Study variables

The case definitions were described using the following criteria (table 1): age group, number of diagnoses in hospital discharge records, number of diagnoses in ER records, number of diagnoses in physician records, number of specialist physician diagnoses, length of physician diagnosis observation time, ≥60 days of separation between physician diagnoses, exclusion criteria A, exclusion criteria B, number of RA-related medications including steroids and number of RA-related medications excluding steroids. Diagnoses in hospital discharge records and diagnoses in ER records were ascertained using an unlimited observation period. A case definition with no physician diagnoses corresponds with having no physician diagnosis observation time. Exclusion criteria A was defined by the authors as follows: exclude individuals with at least two physician diagnosis codes for a rheumatology diagnosis other than RA; this includes osteoarthritis, gout, polymyalgia rheumatica, other seronegative spondyloarthropathy, ankylosing spondylitis, psoriasis, synovitis/tenosynovitis/bursitis, connective tissue disorder, vasculitis and others. Exclusion criteria B was defined by the authors as follows: exclude individuals who had an RA diagnosis code not confirmed by a specialist. The RA-related medication criteria were set to missing for the case definitions applied to the 20+ age group, because medication data were not available for this age group.

The study dataset included each case definition as an observation. The case definition criteria and estimates of sensitivity and specificity were included as variables. Youden’s index (ie, sensitivity + specificity −1)12 was calculated from the estimates of sensitivity and specificity.
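As a minimal sketch of this calculation, Youden's index can be computed from a pair of estimates as below. The function name, the 0 to 1 proportion scale and the example values are illustrative assumptions; they are not taken from the study's SAS code or data.

```python
def youden_index(sensitivity, specificity):
    """Youden's index J = sensitivity + specificity - 1, for inputs on the 0-1 scale."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return sensitivity + specificity - 1.0


# Hypothetical estimates for two case definitions with opposite trade-offs
# between sensitivity and specificity.
j_first = youden_index(0.90, 0.80)
j_second = youden_index(0.80, 0.90)
```

Both hypothetical definitions yield the same index (about 0.70), because the index weights sensitivity and specificity equally.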

### Statistical analyses

Descriptive analyses of the case definition attributes and the estimates of sensitivity, specificity and Youden’s index were conducted using frequencies, percentages and means to inform the model fitting process. All criteria were treated as ordinal measures. Spearman correlation coefficients were used to identify potential collinearity (defined as a correlation of 0.70 or greater13) among the case definition criteria.
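The collinearity screen described above could be sketched as follows. This is a plain-Python illustration, not the authors' SAS procedure: the helper names and the toy criteria are hypothetical, and applying the 0.70 threshold to the absolute value of the coefficient is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def collinear_pairs(criteria, threshold=0.70):
    """List criterion pairs whose |rho| meets the threshold (assumed absolute value)."""
    names = list(criteria)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            rho = spearman(criteria[first], criteria[second])
            if abs(rho) >= threshold:
                flagged.append((first, second, rho))
    return flagged


# Hypothetical ordinal codings of three criteria across four case definitions.
toy_criteria = {
    "exclusion_a": [0, 0, 1, 1],
    "exclusion_b": [0, 0, 1, 1],
    "specialist_dx": [0, 1, 0, 1],
}
flagged = collinear_pairs(toy_criteria)
```

In this toy input only the two exclusion criteria are flagged; pairs flagged this way would not be entered into the same model.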

The following models were fit to the data based on previous research14: (A) separate univariate fixed-effects models for sensitivity and specificity, (B) univariate fixed-effects model for Youden’s index and (C) bivariate (ie, joint) mixed-effects model for sensitivity and specificity. For the univariate models, sensitivity, specificity or Youden’s index were the outcome variables, and the case definition criteria were covariates. The bivariate mixed-effects model jointly modelled sensitivity and specificity as the outcome variables, and the case definition criteria were covariates. In the bivariate model, estimates of sensitivity and specificity were treated as repeated measures to account for their dependence.
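Treating sensitivity and specificity as repeated measures implies a dataset with two rows per case definition. The authors' data layout is not described, so the sketch below shows only one common way to stack the estimates; all field names and values are hypothetical.

```python
def stack_for_bivariate_model(case_definitions):
    """Turn one row per case definition into two rows, one per validity measure."""
    long_rows = []
    for definition in case_definitions:
        # Everything except the two outcomes is carried over as a covariate.
        criteria = {key: value for key, value in definition.items()
                    if key not in ("sensitivity", "specificity")}
        for measure in ("sensitivity", "specificity"):
            long_rows.append(
                {**criteria, "measure": measure, "estimate": definition[measure]}
            )
    return long_rows


# Hypothetical case definitions with one criterion each.
example = [
    {"definition_id": 1, "physician_diagnoses": 1,
     "sensitivity": 0.95, "specificity": 0.70},
    {"definition_id": 2, "physician_diagnoses": 2,
     "sensitivity": 0.90, "specificity": 0.85},
]
long_rows = stack_for_bivariate_model(example)
```

In such a layout, a random effect shared within `definition_id` is what lets a mixed model account for the dependence between the two estimates from the same case definition.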

Each covariate (ie, case definition criteria) was first tested in unadjusted models. Multivariable models were subsequently fit to the data; only the covariates that were statistically significant at *α*=0.01 and explained more than 1% of the variation in the unadjusted models (based on the pseudo *R*² statistic15) were retained. The pseudo *R*² statistic was calculated using the likelihood statistics from the unadjusted model and the null model (ie, model with no covariates) using McFadden’s method.15 A nominal *α*=0.01, based on the Bonferroni correction, was used to evaluate statistical significance in the multivariable model to limit the overall probability of a Type I error.16 The adjusted models were fit to the data for all age groups (n=148) and then fit to the data only for the 65+ age group (n=91). Univariate models were reported with the percent of explained variance. All model estimates were reported with 99% CIs.
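The covariate screening rule can be sketched as below, assuming the standard form of McFadden's statistic (one minus the ratio of the model and null log-likelihoods). The function names and log-likelihood values are hypothetical. The formula as written presumes negative log-likelihoods; how the statistic was handled for the continuous beta likelihood is not stated in the text.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null)."""
    if loglik_null == 0:
        raise ValueError("null log-likelihood must be non-zero")
    return 1.0 - loglik_model / loglik_null


def retain_covariate(p_value, pseudo_r2, alpha=0.01, min_r2=0.01):
    """Screening rule from the text: significant at alpha and more than 1% explained."""
    return p_value < alpha and pseudo_r2 > min_r2


# Hypothetical log-likelihoods for a null model and one unadjusted model.
r2 = mcfadden_pseudo_r2(loglik_model=-90.0, loglik_null=-120.0)
keep = retain_covariate(p_value=0.004, pseudo_r2=r2)
```

With these made-up inputs the covariate explains 25% of the variation and passes both screening conditions.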

The data were modelled using a beta distribution with a logit link function as recommended in previous research.14 The mixed-effects bivariate model used the Cholesky decomposition to ensure that the estimated variance–covariance matrix of the random effects was positive semidefinite and the model converged.17 All analyses were conducted using SAS V.9.3.18
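To make the two modelling ingredients concrete, the sketch below evaluates a beta log-density under a logit link and builds a 2×2 random-effects covariance matrix from a Cholesky factor. It assumes the common mean-precision parameterisation of the beta distribution, which the text does not specify; the function and parameter names are illustrative and this is not the SAS implementation.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_loglik(y, eta, phi):
    """Log-density of y (strictly between 0 and 1) for a beta distribution with
    mean inv_logit(eta) and precision phi (assumed parameterisation)."""
    mu = inv_logit(eta)
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def covariance_from_cholesky(l11, l21, l22):
    """Sigma = L L^T for lower-triangular L = [[l11, 0], [l21, l22]].

    Any real values of l11, l21 and l22 give a positive semidefinite Sigma,
    so an optimiser can search over them without constraints.
    """
    return [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]


# With eta = 0 and phi = 2 the beta distribution reduces to the uniform
# distribution on (0, 1), whose log-density is 0 everywhere.
uniform_check = beta_loglik(0.3, eta=0.0, phi=2.0)
sigma = covariance_from_cholesky(1.0, 0.5, 2.0)
```

Summing `beta_loglik` over observations, with `eta` built from the case definition criteria, gives the kind of objective a beta regression maximises.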

## Results

### Descriptive analyses

As shown in table 2, two-thirds (64.9%) of the case definitions that were applied to the data for all age groups used 1+ hospital discharge record in an unlimited observation period as a criterion. At least one diagnosis in ER records in an unlimited observation period was a criterion of 6.8% of the case definitions. Physician claims were used in 94.6% of the case definitions. The case definitions used physician diagnosis observation periods of never (ie, when physician claims were not used, 5.4%), ≤1 year (25.7%), ≤2 years (27.0%), ≤5 years (29.7%) and an unlimited physician diagnosis observation period (12.2%) to ascertain physician diagnoses. At least one specialist diagnosis was included as a criterion in half of the case definitions (51.4%). A time separation of ≥60 days between two physician diagnoses was used in 14 (9.5%) case definitions. Exclusion criteria A and B were infrequently used in the case definitions (6.8% and 5.4%, respectively). Of the 91 case definitions for the 65+ age group, 11.5% required 1+ RA-related medication including steroids to ascertain cases and 11.5% of case definitions used 1+ RA-related medication excluding steroids to ascertain cases. Compared with the case definitions for the 65+ years age group, the case definitions for the 20+ age group had slightly lower average estimates of sensitivity (20+ years: 90.9 and 65+ years: 91.3), specificity (20+ years: 82.2 and 65+ years: 86.1) and Youden’s index (20+ years: 73.1 and 65+ years: 77.4).

The following case definition criteria were highly correlated (data not shown): exclusion criteria A and B (r=0.89; p<0.0001), exclusion criteria A and ≥60 days of separation between physician claims (r=0.83; p<0.0001) and exclusion criteria B and ≥60 days of separation between physician claims (r=0.74; p<0.0001). These combinations of case definition criteria were not included in the same model; rather, one criterion from each pair was used in a model at a time.

### Inferential analyses

Table 3 reports the percent pseudo *R*² statistics for the univariate and bivariate unadjusted models for all case definitions. The number of physician diagnoses and physician diagnosis observation time were the two criteria that explained the most variation in all models (19.2%–82.6% and 16.0%–77.4%, respectively). In the unadjusted univariate models, sensitivity was significantly associated (p<0.01) with the number of physician diagnoses, length of physician diagnosis observation time, ≥60 days of separation between physician diagnoses, exclusion criteria A and B and number of RA-related medications excluding steroids. Specificity was significantly associated (p<0.01) with the number of physician diagnoses, number of specialist diagnoses, length of physician diagnosis observation time, ≥60 days of separation between physician diagnoses and number of RA-related medications excluding steroids. The unadjusted univariate models also revealed that Youden’s index was significantly associated (p<0.01) with the number of diagnoses in ER records, number of physician diagnoses, number of specialist diagnoses, length of physician diagnosis observation time, ≥60 days of separation between physician diagnoses and exclusion criteria A and B. The joint estimates of sensitivity and specificity were significantly associated (p<0.01) with the number of hospital diagnoses, number of physician diagnoses, number of specialist diagnoses, length of physician diagnosis observation time, ≥60 days of separation between physician diagnoses, exclusion criteria A and B and number of RA-related medications excluding steroids in the bivariate models.

When all case definitions (n=148) were considered, the adjusted univariate model of sensitivity showed that increasing the length of physician diagnosis observation time from 1 year to unlimited was associated with a statistically significant increase in sensitivity, and ≥60 days of separation between physician diagnoses significantly decreased sensitivity (figure 1). Similar relationships were found for the models for the case definitions from the 65+ years age group, where prescription medication data were available (n=91).

When all case definitions (n=148) were considered, the univariate model of specificity showed that using 2+ and 3+ physician diagnoses was associated with a statistically significant increase in specificity compared with one physician diagnosis; increasing the physician diagnosis observation time from 1 year to ≤2 years and ≤5 years significantly decreased specificity; and 1+ specialist diagnosis significantly increased specificity (figure 1). When only the case definitions for the 65+ age group (n=91) were considered, the results showed similar relationships. Also, the number of RA-related medications excluding steroids was associated with a statistically significant increase in specificity.

Based on the unadjusted models, the univariate model with all case definitions (n=148) for Youden’s index included the case definition criteria of number of diagnoses in ER records, number of physician diagnoses, physician diagnosis observation time, number of specialist visits and ≥60 days of separation between physician diagnoses. However, the number of ER records and physician diagnosis observation time criteria were not statistically significant and model fit improved when they were removed (data not shown). The adjusted univariate model of Youden’s index showed that using 2+ and 3+ physician diagnoses to ascertain cases significantly increased Youden’s index compared with 1+ physician diagnosis, 1+ specialist diagnosis significantly increased Youden’s index and a time separation ≥60 days between diagnoses significantly decreased Youden’s index (figure 2). When the case definitions for the 65+ population (n=91) were considered, a similar pattern emerged. Using the number of RA-related medications excluding steroids to ascertain RA cases resulted in a statistically significant increase in Youden’s index.

When all case definitions were analysed using the adjusted bivariate model, 2+ and 3+ physician diagnoses were associated with a statistically significant increase in specificity and no association with sensitivity compared with 1+ physician diagnosis (figure 3). Increasing the number of physician diagnosis observation years from 1 year to ≤2 years, ≤5 years and an unlimited observation period was associated with a statistically significant increase in sensitivity. Increasing the number of physician diagnosis observation years from 1 year to ≤2 years and ≤5 years was associated with a statistically significant decrease in specificity. Using ≥60 days of separation between physician diagnoses was associated with a statistically significant decrease in sensitivity but no significant change in specificity. Increasing the number of specialist diagnoses significantly increased specificity. When only the case definitions applied to the 65+ age group (n=91) were analysed, the relationships in the all age groups model remained statistically significant. Including 1+ RA-related medications excluding steroids decreased sensitivity and increased specificity.

## Discussion

This study applied regression models in a secondary analysis of administrative health data to identify case definition criteria associated with validity estimates from a study about RA case definitions. Based on the results of the adjusted univariate model, sensitivity was associated with the number of physician diagnoses, physician diagnosis observation time and length of time between physician diagnoses. Based on the results of the univariate model, specificity was associated with the number of physician diagnoses, physician diagnosis observation time, specialist diagnoses and RA-related medications excluding steroids. Based on the univariate model results, Youden’s index was associated with the number of physician diagnoses, specialist diagnoses, length of time between physician diagnoses and number of RA-related medications excluding steroids. For the bivariate model, sensitivity was associated with the number of physician diagnoses, physician diagnosis observation time, amount of time between physician diagnoses and number of RA-related medications excluding steroids. In this same model, specificity was associated with the number of physician diagnoses, physician diagnosis observation time, specialist diagnoses and RA-related medications excluding steroids.

All of the models resulted in similar performance in our numeric example, but this may not always be the case. Selection of one model over competing alternatives depends on the study goals and the number of case definitions. Overall, however, the bivariate model is recommended when the number of case definitions is large and sensitivity and specificity are moderately or highly correlated. The univariate model applied to Youden’s index is recommended when the researcher places equal weight on maximising sensitivity and specificity.12 19–21 However, Youden’s index can result in the same estimate for different combinations of sensitivity and specificity. Thus, univariate models applied separately to sensitivity and specificity are recommended when the researcher does not place equal weight on these validity measures.22

A validation study is typically used to produce recommendations about selecting one or more case definitions for maximum accuracy in ascertaining disease cases. Using our model-based method, one can identify the case definition criteria associated with one or more measures of validity and use them to construct a case definition. Based on the univariate models for all age groups, the recommended RA case definition has the following attributes: (A) 2+ physician diagnoses, (B) unlimited physician diagnosis observation time and (C) 1+ specialist diagnosis. At least two physician diagnoses and 1+ specialist diagnosis were associated with specificity but not with sensitivity. An unlimited observation time was significantly associated with improvements in sensitivity. For the 65+ age group, the recommended RA case definition had the same case definition attributes and also included 1+ RA-related medication excluding steroids. The univariate model of Youden’s index and bivariate model of sensitivity and specificity produced similar results.

The recommended case definition based on the univariate sensitivity and specificity models is simpler than the recommended case definition from Widdifield *et al*;6 however, both case definitions produce similar diagnostic accuracy estimates. The primary difference between the two recommended case definitions is that Widdifield *et al* recommended using one diagnosis in hospital discharge records to ascertain cases, while our model-based approach did not support this. Our recommendation derived from a model-based approach might lead to subsequent reanalysis of the original validation data to produce estimates of sensitivity and specificity for the model-supported case definition.

The case definition with the highest sensitivity or specificity estimates may not be significantly more accurate than other case definitions. A model-based approach provides empirical evidence about the case definition criteria that are associated with significant increases/decreases in validity estimates.

While this research focused on inferential techniques for diagnostic validation studies, design of such studies is also an important consideration to ensure that the effects of the criteria can be accurately estimated. Ensuring that all possible combinations of criteria are investigated is an important consideration.23 When a criterion is included in only a small number of case definitions, the power to detect the effect of the criterion on diagnostic validity estimates may be low.

This study has some limitations. Other parametric and non-parametric models have been proposed to combine estimates from a single case definition across multiple studies, such as copula techniques,14 24 mixture models25 26 and mixed models of summary receiver operating characteristic curves.27 28 However, the models selected for this study have many applications and are most likely to be familiar to researchers. A beta distribution may not always be an appropriate choice for Youden’s index, because this index can, in theory, range from −1 to +1. However, in practice, values of Youden’s index less than zero are rare.

Inferential methods cannot be applied to the estimates from all diagnostic validation studies; overfitting the data may be a problem when the number of case definitions is small, or when the number of case definition criteria is large relative to the total number of definitions. Green29 suggested a minimum of 50 observations plus eight observations for every parameter estimated from a multiple regression model. Based on this, our model-based approach would require a minimum of 50 case definitions, and preferably more, in order to be implemented.29 When fewer case definitions are available, descriptive analyses must be relied on to select a case definition. Lastly, the validation study design may limit the ability to test interactions between criteria. Testing interaction effects would require as many combinations as possible of the criteria to be included in the validation study to allow for reasonable model power.
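The rule of thumb attributed to Green can be written as a one-line check. The function names and the parameter counts in the example are hypothetical and are not the counts from the fitted models.

```python
def green_minimum_observations(n_parameters):
    """Minimum sample size under the rule quoted in the text: 50 + 8 per parameter."""
    if n_parameters < 0:
        raise ValueError("n_parameters must be non-negative")
    return 50 + 8 * n_parameters


def enough_case_definitions(n_definitions, n_parameters):
    """True when the number of case definitions meets the rule of thumb."""
    return n_definitions >= green_minimum_observations(n_parameters)


# Hypothetical check: 148 case definitions and a model with 5 parameters.
required = green_minimum_observations(5)
adequate = enough_case_definitions(148, 5)
```

For a model with five parameters the rule asks for at least 90 case definitions, so a study with 148 definitions would meet it.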

This study has a number of strengths. First, the models applied to the case definition data from a single study have previously been used in meta-analyses to combine diagnostic validity estimates for a single case definition across multiple studies.14 30 31 Second, these methods enable modelling of case definitions from published validation studies, even when the individual-level administrative health data are not available. Another strength of this study was that the effects of publication bias on the study results were minimised by analysing all case definition data provided by the study authors. Finally, the methods used in this study can be applied to other chronic diseases and other diagnostic validity measures such as positive predictive value and/or negative predictive value.

## Conclusion

This study applied model-based methods to a single validation study to select case definition criteria associated with validity measures such as sensitivity and specificity. The model-based results can be used by researchers to empirically guide the selection of a case definition for implementation in subsequent cohort studies and surveillance initiatives.32 33 Empirical methods can be used to quantify the magnitude of change in estimates of accuracy associated with different case definition criteria. This research contributes to accurate methods for using administrative health data to study chronic diseases such as RA.

## References

## Footnotes

Contributors KK led the conception and design of the study, analysis and interpretation of the data and drafted the article. LML provided guidance on the conception and design of the study, assisted in the analysis and interpretation of the data and was involved in revising the article. JW provided access to the study data, assisted in interpretation of the data and was involved in revising the article. DJ and SM provided guidance on the conception and design of the study, assisted in the analysis and interpretation of the data and were involved in revising the article. All authors read and approved the final manuscript.

Funding The first author was supported by the Canadian Institutes of Health Research (CIHR) Drug Safety and Effectiveness Network grant TD3.137716 through the scholarship from the Canadian Network for Advanced Interdisciplinary Methods for comparative effectiveness research (CAN.AIM) team. This work was supported by the Canadian Institutes of Health Research (CIHR) (www.cihr.irsc.gc.ca) through Canadian Master’s Scholarship funding from CIHR to the first author.

Competing interests None declared.

Patient consent This study does not involve human subjects.

Provenance and peer review Not commissioned; externally peer reviewed.

Data sharing statement The datasets used and analysed during the current study are available from JW on reasonable request.

Correction notice This paper has been amended since it was published Online First. Owing to a scripting error, some of the publisher names in the references were replaced with 'BMJ Publishing Group'. This only affected the full text version, not the PDF. We have since corrected these errors and the correct publishers have been inserted into the references.