Objectives To evaluate an algorithm developed for identifying non-small cell lung cancer (NSCLC) candidates among patients with lung cancer identified by an International Classification of Diseases, ninth revision (ICD-9) 162.x diagnosis code in administrative databases. The algorithm could then be applied to identify the NSCLC population in order to assess the appropriateness and quality of the NSCLC care pathway.
Design The algorithm’s capacity to discriminate between NSCLC and non-NSCLC was evaluated on a sample for which an electronic health record (EHR) diagnosis was available. A bivariate frequency distribution and other measures were used to evaluate the algorithm’s performance. Associations between algorithm accuracy and factors potentially affecting it were investigated.
Setting Administrative databases used in a specific geographical area of Emilia-Romagna region, Italy.
Participants The algorithm was evaluated on patients aged >18 years, resident in the Emilia-Romagna region, with a lung cancer diagnosis from January to December 2017, who had been hospitalised at IRST or in one of the hospitals in the Forlì-Cesena area and for whom EHR diagnosis data were available.
Outcome measures Overall accuracy, positive (PPV) and negative (NPV) predictive values, sensitivity and specificity, positive and negative likelihood ratios and diagnostic OR were calculated.
Results A total of 430 patients were identified as lung cancer cases based on ICD-9 diagnosis. Focusing on the total incident cases (n=314), the algorithm had an overall accuracy of 82.8% with a sensitivity of 88.8%. The analysis confirmed a high level of PPV (90.2%), but lower specificity (53.7%) and NPV (50%). Higher length of stay seemed to be associated with a correct classification. Hospitalisation regimen and a supply of antiblastic therapy seemed to increase the level of PPV.
Conclusion The algorithm demonstrated a strong validity for identifying NSCLC among patients with lung cancer in hospital administrative databases and can be used to investigate the quality of cancer care for this population.
Trial registration number NCT04676321.
- health informatics
- respiratory tract tumours
- information technology
Data availability statement
Data are available upon reasonable request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
The algorithm covers a medical need referring to a specific category within lung cancer: the non-small cell lung cancer histotype.
The algorithm’s discrimination capacity is assessed at an individual level by verifying the diagnosis in the electronic health record (EHR) and is based on rigorous statistical procedures.
Incident and prevalent conditions are assumed as correctly identified, without a countercheck verification in EHR.
Generalisability of the algorithm is limited by the variability of healthcare delivery settings among different international contexts and by different local laws, regulations or customs.
Lung cancer is a highly complex disease that places a heavy burden on the healthcare system because of its frequency, the complexity of its clinical management and its poor prognosis. In recent years, the management of patients with non-small cell lung cancer (NSCLC) has been changing rapidly thanks to the availability of new drugs, which challenge the sustainability of national healthcare systems. In such a context, tools that measure quality of care within the care pathway are crucial to understanding the value of the care provided, in terms of health outcomes achieved at the population level per amount of expenditure. With this in mind, indicators must be built and applied to the healthcare pathway in order to monitor it continuously, and the first step is to identify the correct population to be measured within the pathway. Hospital administrative databases, which were designed primarily for accounting and management purposes, can be helpful here. These databases include a combination of information such as hospital discharge cards, medical and diagnostic procedures, drug prescriptions and laboratory data. They can easily be used to identify diagnoses, treatments and outcomes, as they provide timely and easy access to an inexpensive and large source of knowledge regarding subjects in a defined geographical area. For this reason, administrative data have been widely exploited in different types of epidemiological, postmarketing surveillance and outcome research studies.1 However, using administrative data to unambiguously identify patient characteristics is still challenging, since administrative data are not as rich in clinical detail as electronic health records (EHR). The International Classification of Diseases, ninth revision, Clinical Modification (ICD-9-CM) codes in hospital discharge abstracts (HDA) are used to identify subjects with lung cancer.
However, administrative databases are not suitable for identifying NSCLC cases, as histological classification cannot be deduced from ICD-9-CM codes.2 Precise NSCLC case identification therefore remains challenging, and validated algorithms are needed in order to detect these patients in administrative databases.3 4
Currently, to our knowledge, only two published Italian studies have developed and validated an algorithm for identifying incident lung cancer cases, and neither distinguished between types of lung cancer.5 6 At the European level, an incidence study estimating the cost of NSCLC treatments has been completed in France, Germany and the UK.7 Unfortunately, patients with NSCLC were selected without validating the procedure. Conversely, in the USA, several studies have aimed to detect NSCLC cases from medical claims databases. Ramsey’s study examined the sensitivity of different administrative claims data in identifying NSCLC incident cases, with estimates ranging from 51.1% to 99.4%.8 Unfortunately, specificity could not be estimated because false positives were not captured in their data source (insurer data). Duh et al developed an algorithm to identify subjects with small cell lung cancer (SCLC); however, this study considered only stage IV cancer cases.9 To identify NSCLC, a modified version that reverses the inclusion and exclusion criteria (only patients with metastatic cancer aged >65 years) was suggested.10 The algorithm developed by Turner et al had a reported accuracy of 92.1%, a sensitivity of 94.8% and a specificity of 81.1%. Despite the excellent performance of the algorithm, the authors acknowledged the poor external validity of their results, mainly attributable to the commercial nature of the health insurance plans.11 In general, the inclusion criteria of these algorithms contain procedures and chemotherapies recommended for patients with NSCLC, whereas the exclusion criteria consist of treatment regimens applied to patients with SCLC.7 11 12
As part of the ‘KIND NSCLC study: Key Performance Indicators for the assessment of diagnostic and therapeutic pathway of NSCLC patients: a multicenter study’ (Protocol Code: IRST162.13), a selection algorithm for patients with NSCLC was highly recommended. The KIND study (ClinicalTrials.gov NCT04676321) aimed to assess the appropriateness and quality of care in patients with NSCLC identified through the administrative health data of three sites (hospitals of Modena, Reggio-Emilia and Forlì-Cesena provinces) located in the Emilia-Romagna region. Using the EHR as the gold standard, the primary aim of our study was to evaluate an algorithm for identifying NSCLC incident cases from among a pool of patients with a primary or secondary diagnosis of ICD-9-CM code 162.x reported in the discharge cards. In addition, we tried to identify possible factors (ie, not directly implied by the algorithm) potentially affecting algorithm accuracy.
NSCLC KIND study population
The study population of the NSCLC KIND study consisted of adult patients (aged ≥18 years) residing in the Emilia-Romagna region, with a new diagnosis of NSCLC between January and December 2017 identified in the hospital discharge cards (HDC), who had been discharged from any one of the participating sites (hospitals of Modena, Reggio-Emilia and Forlì-Cesena provinces) (figure 1).
Setting and data source
Administrative databases of three specific geographical areas of the Emilia-Romagna region were queried by an algorithm to identify subjects eligible for the NSCLC KIND study. An anonymised patient identification code, assigned to all residents independently of the type of admission (inpatient or outpatient), allows deterministic individual-level linkage across the different databases.
Data were retrieved from:
HDC for case selection and algorithm classification based on ICD-9-CM code. The HDC summarises information from clinical charts regarding type of discharge, primary diagnosis, up to five secondary diagnoses and eleven surgical, diagnostic or therapeutic procedures, codified according to the ICD-9-CM.
Pharmaceutical data (FED and AFT, direct and territorial distribution), such as Anatomical Therapeutic Chemical (ATC) drug classification and supply date, for identifying histological diagnoses other than NSCLC based on the treatment received.
Algorithm specification to identify eligible patients for NSCLC KIND study
Candidate patients with NSCLC were identified using the ICD-9-CM codes for malignant tumours of the trachea, bronchi and lung (codes 162.x), as primary or secondary diagnosis in the HDC (figure 1). For patients with more than one ICD-9-CM 162.x code, the patient index date was defined as the earliest date in 2017 on which a 162.x code appeared. Specific criteria on the patient’s cancer history and chemotherapy regimens were applied both to discriminate incident from prevalent cases and to identify other malignancies (non-NSCLC) (figure 1):
Patients who had the same ICD-9-CM diagnosis code 162.x recorded in the 3 years before the year under study were considered prevalent cases.
Patients who had another malignancy diagnosis (ICD-9-CM 140.x-161.x, 163.x-195.x, 200.x-208.x or V10.xx except V10.11 and V10.12) recorded in the 3 years before the year under study were likewise not considered incident cases.
Patients with at least one therapy administration of Etoposide (code L01CB01 according to ATC classification system) in the following 180 days from the index date were classified as non-NSCLC.
Patients with at least one therapy administration of Lanreotide (ATC code H01CB03) and/or Octreotide (ATC code H01CB02) in the following 180 days from the index date were also identified as non-NSCLC.
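The four criteria above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the authors' implementation: the `Patient` structure, the dotted code format (eg, `162.9`, `V10.11`), the treatment of the 3-year look-back as the three calendar years before the study year, and the inclusion of the index date itself in the 180-day window are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# ATC codes named in the text: etoposide, lanreotide, octreotide.
NON_NSCLC_ATC_CODES = {"L01CB01", "H01CB03", "H01CB02"}
LOOKBACK_YEARS = 3                 # calendar years before the study year (assumption)
FOLLOW_UP = timedelta(days=180)    # window after the index date


@dataclass
class Patient:
    """Hypothetical linked record: HDC diagnoses plus pharmaceutical supplies."""
    patient_id: str
    diagnoses: list                                     # (discharge date, ICD-9-CM code)
    drug_supplies: list = field(default_factory=list)   # (supply date, ATC code)


def is_lung_cancer(code):
    return code.startswith("162")


def is_other_malignancy(code):
    """140-161, 163-195, 200-208, or V10.xx except V10.11 and V10.12."""
    if code.startswith("V10"):
        return code not in {"V10.11", "V10.12"}
    prefix = code[:3]
    if not prefix.isdigit():
        return False
    n = int(prefix)
    return 140 <= n <= 161 or 163 <= n <= 195 or 200 <= n <= 208


def classify(patient, study_year=2017):
    """Return (case status, algorithm label), or None if no 162.x code in the study year."""
    index_dates = [d for d, code in patient.diagnoses
                   if d.year == study_year and is_lung_cancer(code)]
    if not index_dates:
        return None
    index_date = min(index_dates)  # earliest 162.x date in the study year

    lookback = range(study_year - LOOKBACK_YEARS, study_year)
    prevalent = any(
        d.year in lookback and (is_lung_cancer(code) or is_other_malignancy(code))
        for d, code in patient.diagnoses
    )
    non_nsclc = any(
        atc in NON_NSCLC_ATC_CODES and index_date <= d <= index_date + FOLLOW_UP
        for d, atc in patient.drug_supplies
    )
    return ("prevalent" if prevalent else "incident",
            "non-NSCLC" if non_nsclc else "NSCLC")


example = Patient("A001", [(date(2017, 3, 1), "162.9")],
                  [(date(2017, 4, 1), "L01CB01")])
print(classify(example))
```

Returning the case status and the histology label separately mirrors the two roles of the criteria in the text: the look-back rules separate incident from prevalent cases, while the drug rules separate NSCLC from non-NSCLC.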
Algorithm evaluation population
The algorithm’s discrimination capacity was evaluated only on a restricted sample of patients for whom an EHR was available. Evaluation of the algorithm was carried out only on patients hospitalised (at least once) at IRST or in one of the hospitals located in the Forlì-Cesena area (M. Bufalini of Cesena, G.B. Morgagni-L. Pierantoni of Forlì) with a diagnosis verifiable through the EHR (figure 1). The algorithm was evaluated at an individual level by linking the cases identified by the algorithm (both NSCLC and non-NSCLC) to the cases in the EHR.
Bivariate frequency distributions of algorithm-classified cases (test) and EHR-verified cases (gold standard) were produced for each analysis set and presented as 2×2 tables reporting the number of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). The analysis estimated sensitivity and specificity, with their corresponding 95% CIs. Positive predictive value (PPV) and negative predictive value (NPV) were also determined, along with their 95% CIs. The overall accuracy, expressed as the proportion of correctly classified subjects (TP+TN) among all subjects, was also calculated. ROC curves were drawn and the area under the curve (AUC) was estimated as a further measure of accuracy. Moreover, both positive (LR+) and negative (LR−) likelihood ratios, defined as the ratio of the probability of a given test result in subjects with the disease to the probability of the same result in subjects without the disease, were calculated. Lastly, we determined the ratio of the odds of positivity in subjects with the disease to the odds in subjects without the disease, known as the diagnostic OR (DOR). In case of a doubtful diagnosis in the EHR, we adopted a conservative approach and considered the case as non-NSCLC. Univariate multinomial logistic regression models were developed to further explore the associations between HDC information and the correctness of the algorithm classification; each record was classified into one of four categories: correctly or wrongly detected as NSCLC and correctly or wrongly identified as non-NSCLC. ORs with their 95% CIs were calculated separately for false NSCLC, false non-NSCLC (other) and true other, and compared with true NSCLC (reference group). A stepwise approach was used for variable selection in the multivariable analyses. Among the factors included in the final multivariable model, both correlation and variance inflation factor (VIF) were calculated to assess the presence of multicollinearity issues.
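For reference, the measures named above can be written in terms of the 2×2 counts. These are the standard textbook definitions, given here as an aid to the reader rather than as a transcription of the authors' code; the symbol π for the prevalence of NSCLC in the evaluated sample is introduced only for this illustration. The AUC identity applies to a classifier with a single operating point, which is the situation of a yes/no algorithm, and the last line shows why PPV depends on prevalence.

```latex
\begin{align}
\text{Se} &= \frac{TP}{TP+FN}, &
\text{Sp} &= \frac{TN}{TN+FP}, \\
\text{PPV} &= \frac{TP}{TP+FP}, &
\text{NPV} &= \frac{TN}{TN+FN}, \\
\text{Accuracy} &= \frac{TP+TN}{TP+FP+FN+TN}, &
\text{AUC} &= \frac{\text{Se}+\text{Sp}}{2}, \\
\text{LR}^{+} &= \frac{\text{Se}}{1-\text{Sp}}, &
\text{LR}^{-} &= \frac{1-\text{Se}}{\text{Sp}}, \\
\text{DOR} &= \frac{\text{LR}^{+}}{\text{LR}^{-}} = \frac{TP \cdot TN}{FP \cdot FN}, &
\text{PPV} &= \frac{\text{Se}\,\pi}{\text{Se}\,\pi + (1-\text{Sp})(1-\pi)}
\end{align}
```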
Analysis of data was performed using R statistical software (www.r-project.org) V.3.6.3.
Patient and public involvement
No patients were involved.
Data were collected from 430 patients who had an HDC with an ICD-9-CM diagnosis code 162.x and for whom verification of the diagnosis in IRST’s EHR was available; these patients formed the sample population. Among the overall sample, 314 patients were identified as incident cases during 2017 (no malignancy diagnosis during the previous 3 years) and considered for the main evaluation. A secondary analysis on broader populations (eg, including a further 116 prevalent cases) was also performed. Focusing on the total incident cases (N=314), as shown in table 1, among the 256 cases classified by the algorithm as NSCLC, 231 were confirmed to have NSCLC (TP), whereas 25 had different diagnoses (FP), resulting in a PPV of 90.2% (95% CI 86.6% to 93.9%). Of the remaining 58 cases, classified by the algorithm as non-NSCLC, 29 (TN) were correctly classified, leading to an NPV of 50.0% (95% CI 37.1% to 62.9%). Since NSCLC represents the vast majority of lung cancers, and both PPV and NPV depend strongly on the disease prevalence in the study population, the high PPV contributed strongly to the overall accuracy of 82.8%. The algorithm reached a high sensitivity of 88.8% (95% CI 85.0% to 92.7%); in contrast, a lower specificity was observed (53.7%; 95% CI 40.4% to 67.0%), leading to an AUC estimate of 71.3% (95% CI 64.3% to 78.3%). The likelihood ratios are also informative: LR+ was 1.92 (95% CI 1.43 to 2.57), while LR− was 0.21 (95% CI 0.14 to 0.32), resulting in a DOR of 9.23 (95% CI 4.53 to 18.87).
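The headline figures can be re-derived from the 2×2 counts with a short Python sketch. TP=231 and FP=25 are stated in the text; FN=29 and TN=29 are inferred here from the 314 incident cases, the 256 algorithm-positive cases and the 50.0% NPV. The normal-approximation (Wald) interval is an assumption about how the CIs were computed; it is used because it is the simplest method and it matches the reported bounds for these four proportions.

```python
from math import sqrt
from statistics import NormalDist

# 2x2 counts for the 314 incident cases (FN and TN inferred, see lead-in).
TP, FP, FN, TN = 231, 25, 29, 29

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-sided 95% interval


def proportion_with_wald_ci(successes, total):
    """Return (estimate, lower, upper) using the normal approximation."""
    p = successes / total
    half_width = Z_95 * sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


sensitivity = proportion_with_wald_ci(TP, TP + FN)
specificity = proportion_with_wald_ci(TN, TN + FP)
ppv = proportion_with_wald_ci(TP, TP + FP)
npv = proportion_with_wald_ci(TN, TN + FN)
accuracy = (TP + TN) / (TP + FP + FN + TN)

# A yes/no classifier has a single operating point on the ROC curve, so the
# trapezoidal AUC reduces to the mean of sensitivity and specificity.
auc = (sensitivity[0] + specificity[0]) / 2

lr_positive = sensitivity[0] / (1 - specificity[0])
lr_negative = (1 - sensitivity[0]) / specificity[0]

for name, (est, low, high) in [("Sensitivity", sensitivity),
                               ("Specificity", specificity),
                               ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {est:.1%} (95% CI {low:.1%} to {high:.1%})")
print(f"Accuracy: {accuracy:.1%}, AUC: {auc:.1%}")
print(f"LR+: {lr_positive:.2f}, LR-: {lr_negative:.2f}")
```

The intervals for the AUC, the likelihood ratios and the DOR need other methods and are not reproduced in this sketch.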
To identify factors affecting algorithm accuracy, we developed univariate multinomial logistic regression models for the available variables collected in the HDC forms (table 2). Among EHR-confirmed NSCLC cases (TP and FN), a longer length of stay seemed to be associated with correct classification (OR 0.538; 95% CI 0.288 to 1.005), although the association was of only borderline statistical significance (p=0.052), while patients older than 75 years at hospital admission showed a greater risk of misclassification (OR 2.285; 95% CI 1.045 to 4.993; p=0.038). Among HDC forms that the algorithm classified as NSCLC, waiting at least 1 day for admission (OR 0.385; 95% CI 0.167 to 0.887; p=0.025), ordinary regime hospitalisation vs day hospital (OR 0.354; 95% CI 0.142 to 0.879; p=0.025), a lung disease-specific or oncological treatment-related DRG (OR 0.384; 95% CI 0.159 to 0.928; p=0.034) and oncological treatment (ie, ATC L0) during 2017 (OR 0.124; 95% CI 0.036 to 0.426; p=0.001) were protective factors against misclassification (ie, being false NSCLC, FP). Lastly, looking at the correctly assigned records (TP and TN), patients discharged home were more likely to be true NSCLC, while patients with at least one antiblastic therapy prescribed during the year were more likely to be correctly identified as non-NSCLC. We also developed a multivariable model to understand which are the most important predictors of algorithm reliability (table 3). Length of stay, ordinary regime hospitalisation, discharge home and oncological treatment during 2017 were the variables selected for the model using a stepwise approach (based on AIC), and no multicollinearity issues were found (see online supplemental tables 1 and 2).
Lastly, to account for the different proportion of surgical DRGs in the day hospital and in the inpatient regime, we conducted a multinomial regression analysis on the ordinary regime only (inpatient), which showed borderline statistical significance (OR=0.509; p=0.056) for the effect of length of stay (LoS) on increasing the sensitivity of the algorithm (see online supplemental tables 3 and 4). Based on these results, we looked at the correctness of case identification among patients hospitalised for at least 5 days within an ordinary regime. We considered neither the discharge type, which was associated with a reduced risk of true negativity and has to be considered in the context of correct selection, nor the prescription of antiblastic therapy, which was a discordant factor, being protective against false positivity and at the same time associated with true negativity. An accuracy of 93.2% was achieved in 44 cases, rising to 100% after excluding a further 30 patients hospitalised for exactly 5 days.
Since the algorithm can be considered a composite process of multiple steps, we also assessed its performance when skipping the prevalent case removal phase (table 1, panel B): less than 3 percentage points of overall accuracy were lost (80.0%) when the further 116 prevalent cases were included (AUC: 68.1%; 95% CI 62.3% to 74.0%). The most relevant aspect is perhaps the poor ability of the algorithm to correctly identify the other lung tumours when the prevalent cases are also considered in the analysis (specificity: 49.4%; 95% CI 38.3% to 60.4% and NPV: 45.9%; 95% CI 35.3% to 56.5%).
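The relationship between an OR, its 95% CI and the p-value in this paragraph can be illustrated with a short Python sketch. It assumes that the intervals are Wald intervals, symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated in the text; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_95 = STD_NORMAL.inv_cdf(0.975)  # about 1.96


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover the two-sided Wald p-value from an OR and its 95% CI.

    Assumes the interval is symmetric around log(OR) on the log scale.
    """
    standard_error = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z_score = log(odds_ratio) / standard_error
    return 2 * (1 - STD_NORMAL.cdf(abs(z_score)))


# Values taken from the text above.
length_of_stay_p = wald_p_from_or_ci(0.538, 0.288, 1.005)
age_over_75_p = wald_p_from_or_ci(2.285, 1.045, 4.993)
print(f"Length of stay: p={length_of_stay_p:.3f}")
print(f"Age >75 years: p={age_over_75_p:.3f}")
```

Under this assumption, the two calls return values that round to the reported p=0.052 and p=0.038. This also shows why the length-of-stay result is borderline: its upper confidence limit (1.005) only just crosses 1.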
In order to perform the KIND NSCLC study, whose objective is to assess the diagnostic and therapeutic pathway of patients with NSCLC, we developed an algorithm designed to identify NSCLC in hospital administrative databases and tested its performance using EHR review as the gold standard. While previous Italian studies focused on patients with lung cancer in general, this research covers a medical need referring to a specific category within lung cancer: the NSCLC histotype. The results showed a high level of accuracy (82.8%) and sensitivity (88.8%) for the algorithm. The findings confirmed that the algorithm reached a high PPV for identifying NSCLC (90.2%), but modest specificity (53.7%) and NPV (50%). The main reason for the modest specificity is misclassification of clinical diagnoses and coding errors in the HDC. Furthermore, it should be noted that the AUC is only slightly above 0.7, the cut-off usually reported in the literature as indicating good algorithm performance.13 Similar considerations apply to the likelihood ratios. Our findings are consistent with previous studies, in which sensitivity ranged between 51.1% and 99.4%,8 12 14 and PPV was 95.3%.12 This slight variability could be explained by the fact that patient identification is affected by the data quality of the database used (which is likely to be more complete and accurate in the US payer insurance databases) and by the different criteria used for the algorithm (often based on pharmaceutical claims). Several studies reported evaluations of NSCLC case selection algorithms based only on drug prescription databases, which, however, do not allow untreated cases to be identified and therefore provide an unrealistic picture of the NSCLC population.12 This is a fundamental limitation that we tried to overcome with the multiple data source method assessed in this study.
The study by Turner et al reported an accuracy of 94.8% with a PPV of 95.3%, even though patients generally ineligible for antiblastic therapy, namely those with early-stage NSCLC and unfit patients, are unlikely to be selected.12 Our choice to select patients starting from discharge cards cost some accuracy, which nevertheless remained well over 80%, while maintaining very high levels of both PPV (90.2%) and sensitivity (88.8%), mostly at the expense of specificity and NPV; this, however, is counterbalanced by the low prevalence of other malignancies among those of the lung. Moreover, in accordance with the regulations of Italy and of the Emilia-Romagna region, drugs in off-label use or antiblastic therapies administered to patients participating in clinical trials are not tracked in the administrative data flows, causing a potential loss of many cases. An additional objective of this study was to identify the factors influencing the accuracy of the algorithm and to provide a clinical (and/or administrative) interpretation of these factors. These results can be transposed to other contexts to give an a priori estimate of the algorithm’s accuracy based on the characteristics of their setting. When evaluating the hospitalisation regimen (ordinary vs day hospital), we can deduce its influence on PPV; in particular, we observe that the day hospital setting increases the risk of false positivity. In the Emilia-Romagna region, chemotherapy is administered on an outpatient basis (day service), while day hospital is used for some invasive diagnostic services (eg, biopsy). In such cases, it is not uncommon for secondary lesions to be investigated, both to assess the presence of metastases and to investigate the presence of any other primary malignancy. The associated diagnostic code may therefore have been incorrect.
However, misclassification attributable to loss to follow-up was the most frequent: four of the seven cases wrongly classified as NSCLC with a day hospital (DH) regimen hospitalisation were patients who decided to be followed at other institutions and for whom the last available and verifiable diagnosis in the EHR was not yet certain; doubtful diagnoses in the EHR were considered as non-NSCLC. The presence of oncological treatment (ie, ATC L0) led to similar results: excluding patients with specific treatments for other neoplasms (eg, neuroendocrine tumour and SCLC), who were classified as non-NSCLC (and for whom such treatment strongly increases the likelihood of correct classification as true non-NSCLC), the PPV seems to increase for patients with a supply of antiblastic therapy during the year. The arrival of new therapies in the treatment landscape of lung cancer could influence the algorithm’s performance: the introduction of new drugs specific for the treatment of SCLC or neuroendocrine cancer would allow the algorithm to better discriminate NSCLC from other lung malignancies. Conversely, new drugs with therapeutic indications for both NSCLC and non-NSCLC could negatively affect the performance of the algorithm. This analysis, moreover, suggests that a longer length of stay (>5 days) may be an important factor associated with correct classification. This is probably linked to the timing of histological diagnosis: longer hospitalisations are more likely to have a histological diagnosis available before HDC completion, resulting in fewer coding errors. The reduced sensitivity of the algorithm among patients aged >75 years (univariate logistic regression model) also deserves reflection: elderly patients may receive less specific treatments, which makes administrative data less precise for the algorithm’s purpose. Additionally, elderly patients are more likely to have comorbidities than their younger counterparts.
Synchronous pathologies may worsen the clinical picture and require additional treatments, which must be reported in the HDC (as they often absorb many resources). As a result, HDCs may lack information that would be useful for histology discrimination.
The main limitation of the study, as for all other studies conducted on administrative databases (including pharmaceutical claims databases), is the variability of healthcare delivery settings among different international contexts and, sometimes, differences in local laws, regulations or customs: in some contexts, the delivery of drugs, as well as some diagnostic procedures, is allowed only in an outpatient setting (resulting in fewer hospital discharge cards), generating heterogeneity in the data sources of the algorithm. Indeed, the intent of the multinomial logistic models was precisely to give an idea of the levels of accuracy that can be achieved in application contexts other than ours. Although we only collected data from a few geographical areas of the Emilia-Romagna region, the same administrative data are available nationwide. The algorithm may therefore be used to estimate the national incidence of NSCLC.
Another limitation of this study is the assumption that incident and prevalent cases were correctly identified (no countercheck with the EHR). Nevertheless, we are reasonably confident that prevalent cases were correctly classified, because they had an oncological diagnosis in the previous 3 years. This confidence is lower for incident cases, although it is highly unlikely that a patient with a malignant disease had no hospitalisation in the previous 3 years.
In summary, the results of this study demonstrate that our algorithm may be useful for identifying newly diagnosed patients with NSCLC in hospital administrative databases. Thanks to the widespread use of these databases, assessing the performance of the NSCLC care pathway by applying a set of key performance indicators (KPIs) to the identified population could be feasible everywhere in Italy, not only for direct measurement of the patient journey but also for benchmarking between hospitals, which could help to improve quality of care for patients.
Patient consent for publication
This retrospective study was approved by the Scientific and Medical Committee and the Ethics Committee (EC) of IRST–IRCCS Area Vasta Romagna (CEROM) and of Area Vasta Emilia Nord (AVEN). The approval number/ID was Prot. 1383/2020 I.5/233 for the CEROM and Prot. 2020/0104403 and AOU 0019762/20 for the AVEN.
WB and AR contributed equally.
Contributors WB, AR, VD, IM and MA contributed to the design of the algorithm, assisted by AD and LC. WB developed the algorithm and contributed to the collection and assembly of data. AR led the statistical analysis. NG contributed to the data analysis. WB, AR, VD, IM, AD, LC and MA contributed to the interpretation of findings. SM verified the cases in the EHR. AR and VD drafted the manuscript, supervised by IM, and all authors critically revised the work and approved the final manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.