
Original research
Incidence, duration and risk factors associated with delayed and missed diagnostic opportunities related to tuberculosis: a population-based longitudinal study
  1. Aaron C Miller1,
  2. Alan T Arakkal1,
  3. Scott Koeneman2,
  4. Joseph E Cavanaugh2,
  5. Alicia K Gerke3,
  6. Douglas B Hornick3,
  7. Philip M Polgreen1,3
  1. 1Epidemiology, University of Iowa, Iowa City, Iowa, USA
  2. 2Biostatistics, The University of Iowa, Iowa City, Iowa, USA
  3. 3Internal Medicine, The University of Iowa, Iowa City, Iowa, USA
  1. Correspondence to Dr Aaron C Miller; aaron-miller{at}


Abstract

Objectives Missed opportunities to diagnose tuberculosis are costly to patients and society. In this study, we (1) estimate the frequency and duration of diagnostic delays among patients with active pulmonary tuberculosis and (2) determine the risk factors for experiencing a diagnostic delay.

Design A retrospective cohort study of patients with tuberculosis using longitudinal healthcare encounters prior to diagnosis.

Setting Commercially insured enrollees from the Commercial Claims and Encounters or Medicare Supplemental IBM Marketscan Research Databases, 2001–2017.

Participants All patients diagnosed with, and receiving treatment for, pulmonary tuberculosis, enrolled at least 365 days prior to diagnosis.

Primary and secondary outcome measures We estimated the number of visits with tuberculosis-related symptoms prior to diagnosis that would be expected to occur in the absence of delays and compared this estimate to the observed pattern. We computed the number of visits representing a delay and used a simulation-based approach to estimate the number of patients experiencing a delay, number of missed opportunities per patient and duration of delays (ie, time between diagnosis and earliest missed opportunity). We also explored risk factors for missed opportunities.

Results We identified 3371 patients diagnosed with and treated for active tuberculosis who could be followed up for 1 year prior to diagnosis. We estimated that 77.2% (95% CI 75.6% to 78.7%) of patients experienced at least one missed opportunity; among these patients, an average of 3.89 (95% CI 3.65 to 4.14) visits represented a missed opportunity, and the mean duration of delay was 31.66 days (95% CI 28.51 to 35.11). Risk factors for delays included outpatient or emergency department settings, weekend visits, patient age, influenza season presentation, history of chronic respiratory symptoms and prior fluoroquinolone use.

Conclusions Many patients with tuberculosis experience multiple missed diagnostic opportunities prior to diagnosis. Missed opportunities occur most commonly in outpatient settings and numerous patient-specific, environment-specific and setting-specific factors increase risk for delays.

  • tuberculosis
  • epidemiology
  • general medicine (see internal medicine)
  • respiratory infections

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • This study reviewed longitudinal healthcare records for a large population of insured enrollees (over 195 million represented) spanning an extensive time period (2001–2017) and covering a range of healthcare settings (inpatient, outpatient and emergency department).

  • A simulation-based analysis was conducted to identify visits most likely to represent a diagnostic delay, while excluding coincidental visits that may appear to be missed opportunities.

  • This study relied on International Classification of Disease Version 9 and 10 (ICD-9/ICD-10) diagnostic codes to identify index cases of tuberculosis, and such codes may lack specificity for identifying active tuberculosis. Medication claims were used to help validate diagnosis codes by identifying patients receiving medications used to treat active tuberculosis.

  • This study also relied on diagnostic codes to identify signs and symptoms of tuberculosis prior to diagnosis. Such records may not capture all visits where symptoms occurred (eg, symptoms recorded in clinic notes). We conducted a sensitivity analysis to evaluate the potential sensitivity of our findings to visits without related symptom codes.

  • Without more granular patient data, we cannot confirm that all patient visits we identify represent diagnostic errors.


Introduction

The incidence of tuberculosis has been decreasing in the USA during the past several decades,1 2 but recently the rate of decrease has slowed.1 3 To further reduce the incidence of tuberculosis, the rapid identification and treatment of new cases is essential.3 However, as the incidence of tuberculosis decreases, so may familiarity with the disease among clinicians,4 resulting in an increase in diagnostic delays.5 6 Because these delays are, in part, a function of the familiarity and experience of clinicians with a particular disease,5–8 as the disease becomes less common, diagnostic delays for tuberculosis may become more common.5 7

Diagnostic delays of tuberculosis are important to consider for several reasons. First, delays are common in the USA6 7 9 and other lower-prevalence countries.8 10–12 Second, delays may contribute to worse clinical outcomes13–15 and increased healthcare costs.16 Third, diagnostic delays for tuberculosis are especially important because delays contribute to additional exposures and thus additional cases of tuberculosis.17 18 Substantial diagnostic delays contributing to increased transmission have occurred in both community19–22 and healthcare settings.10 23–25

Traditional approaches to investigating diagnostic delays have focused on single centres, most commonly hospitals, or alternatively have depended on public health registries that rely on patient recall.8 26 Although diagnostic errors occur in hospitals, opportunities to understand and reduce diagnostic delays may frequently occur in ambulatory settings where patients often first present with signs and symptoms of a disease. Multiple investigations focusing on emergency department (ED) visits have highlighted missed opportunities to diagnose tuberculosis.6 27–29 Thus, a more complete understanding of diagnostic delays requires consideration of sequential healthcare visits across outpatient clinic visits, ED visits and hospitalisations. Also, when diagnostic delays are detected, it may be difficult to learn about risk factors for diagnostic delays if patients present in multiple different settings before the diagnosis is made.

Before interventions to decrease diagnostic delays can be designed and implemented, a better understanding of the incidence of and risk factors for diagnostic delays is needed, especially in lower-incidence countries. Thus, the goal of this study is to propose a population-based approach for estimating the incidence and duration of diagnostic delays associated with tuberculosis, and also to describe the risk factors associated with patients experiencing a diagnostic delay.


Methods

Data source

We used longitudinal insurance claims data from the IBM Marketscan Commercial Claims and Encounters and Medicare Supplemental databases from 2001 through 2017. The Commercial Claims data contain information for individuals with employer-sponsored health plans (employees, retirees, dependents and spouses) from participating large employers, health plans and government organisations. The Medicare Supplemental databases contain information for Medicare-eligible individuals with employer-sponsored Medicare Supplemental plans. Together, these databases contain claims for over 195 million enrollees across the USA, representing over 6 billion enrolment months. Claims from outpatient, emergency and inpatient visits are provided along with outpatient medications.

Permission to use these data was granted to our research team by IBM. This research used deidentified claims data; studies of this type are deemed non-human subjects research by the University of Iowa Institutional Review Board.

Study population

We identified all patients diagnosed with primary, pulmonary, respiratory or miliary tuberculosis using the ICD-9-CM diagnosis codes 010.X, 011.X, 012.X and 018.X, and the ICD-10-CM codes A15.X and A19.X. Because non-pulmonary tuberculosis presents with different signs and symptoms, we did not include codes for tuberculosis of the central nervous system, intestines, peritoneum, mesenteric glands, bones, joints, genitourinary system or other organs. We required cases to be enrolled for at least 1 year prior to their initial tuberculosis diagnosis; this first diagnosis was labelled as the index diagnosis. Because diagnosis codes alone lack specificity for identifying active tuberculosis,30 we restricted our analysis to patients with evidence of treatment for active tuberculosis near the index diagnosis using outpatient medication claims.31 Specifically, we identified treatment with the following set of medications: isoniazid and rifampicin/rifampin, pyrazinamide or ethambutol. We considered patients whose treatment began within 1 year of the index diagnosis. We performed a sensitivity analysis using cases where treatment occurred within 2 months of diagnosis. If treatment began prior to the initial tuberculosis diagnosis, we used the treatment start date as the index diagnosis date.
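The case definition above can be illustrated with a short sketch. This is not the authors' implementation: the code prefixes are taken from the text, while the function names, the `version` argument and the reading of "within 1 year" as an absolute gap of at most 365 days are assumptions made only for this example.

```python
from datetime import date
from typing import Optional

# Code families named in the text (primary, pulmonary, respiratory and miliary tuberculosis).
ICD9_PREFIXES = ("010", "011", "012", "018")
ICD10_PREFIXES = ("A15", "A19")


def is_pulmonary_tb_code(code: str, version: int) -> bool:
    """Return True if an ICD-9-CM (version=9) or ICD-10-CM (version=10) code is in the case definition."""
    normalized = code.replace(".", "").strip().upper()
    prefixes = ICD9_PREFIXES if version == 9 else ICD10_PREFIXES
    return normalized.startswith(prefixes)


def index_date(first_diagnosis: date, treatment_start: date,
               max_gap_days: int = 365) -> Optional[date]:
    """Return the index date, or None if treatment is too far from the diagnosis.

    Assumption: "within 1 year" is read as an absolute gap of at most 365 days.
    If treatment began before the first diagnosis, the treatment start date is used.
    """
    if abs((treatment_start - first_diagnosis).days) > max_gap_days:
        return None
    return min(first_diagnosis, treatment_start)
```

Codes for extrapulmonary sites (for example, ICD-9-CM 013.X) fall outside the listed prefixes and are therefore rejected by `is_pulmonary_tb_code`.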

Statistical analysis

We conducted two primary statistical analyses to address the following objectives: (1) to estimate the incidence and duration of diagnostic delays associated with tuberculosis, and (2) to estimate the risk factors for experiencing a diagnostic delay. We started by identifying potential diagnostic delays by looking for symptomatically similar diagnoses (SSDs) that occurred during healthcare visits prior to the index tuberculosis diagnosis. We defined SSDs to be diagnoses that include, or share, similar symptoms to active pulmonary tuberculosis. SSDs may include diagnoses in one of four categories:

  1. General symptoms of active infection, such as cough, fever, weight loss or haemoptysis.

  2. Symptomatically similar infections that share similar symptoms to tuberculosis, such as pneumonia, influenza or bronchitis.

  3. Symptomatically similar cardio-sino-pulmonary diseases or syndromes, such as chronic obstructive pulmonary disease (COPD), asthma or lung cancer.

  4. Testing, imaging or physical exam-based diagnoses, such as anaemia or swollen lymph nodes.

Online supplemental table 1 describes the individual diagnoses and ICD-9/10 codes used to identify the four types of SSD conditions. This list was developed based on a review of prior literature of diagnostic delays for tuberculosis.6 We identified SSDs during visits in the time prior to the index diagnosis where diagnostic opportunities may plausibly occur, between 3 and τ days prior; we denoted the period [3, τ] as the diagnostic-opportunity window. The value τ is the upper bound of the diagnostic-opportunity window, reflecting the longest plausible diagnostic delay; it is estimated using the change-point analysis described below. We disregarded visits within 3 days of the index diagnosis to account for lags in diagnostic testing. Figure 1 depicts the process used to identify potential diagnostic opportunities. This type of ‘look-back’ approach has been referred to as Symptom-Disease Pair Analysis of Diagnostic Error (SPADE),32 which has been used to identify diagnostic delays associated with numerous diseases.6 33–36

Figure 1

Diagram for identifying symptomatically similar diagnosis (SSD) visits: SSD visits include symptoms, symptomatically similar diagnoses and testing or exam-based diagnoses that suggest an active tuberculosis infection may be present in the patient. Potential diagnostic opportunities are defined as SSD-related visits that occur during the diagnostic-opportunity window (ie, the window prior to index diagnosis where delays are biologically plausible).

Estimating incidence of diagnostic delays

Visits occurring prior to an index diagnosis of tuberculosis that contain an SSD may represent a missed diagnostic opportunity but may also represent a coincidental visit (eg, unrelated respiratory infection). To account for visits representing coincidental diseases, and not a missed opportunity, we compared the expected and observed patterns of SSD visits prior to the index diagnosis. First, we estimated the expected number of SSD visits by analysing the trend in the incidence of SSD visits in the time prior to the diagnostic-opportunity window, where missed opportunities are unlikely to occur (ie, from τ to 365 days prior to tuberculosis diagnosis). We then computed the expected number of visits in the diagnostic-opportunity window (ie, from 3 to τ days prior to tuberculosis diagnosis) by extrapolating the prior trend to the diagnostic-opportunity window. Second, we compared the observed pattern of SSD visits during the diagnostic-opportunity window to the expected number based on the extrapolated trend. Finally, the number of potential diagnostic opportunities was estimated by the excess number of SSD visits: the difference between the observed and expected number. This approach has been used in prior work to estimate the number of diagnostic opportunities associated with acute myocardial infarction, stroke and other cardiovascular events.33

To identify the point prior to the index diagnosis where diagnostic opportunities first begin to occur (ie, the start of the diagnostic-opportunity window), we used a change-point analysis to detect the point where the observed number of SSD visits begins to deviate from the expected trend. We fit a piecewise regression model with a linear trend prior to the change-point τ and a cubic trend after the change-point, to account for the non-linear pattern in visit counts in the period just prior to diagnosis (see figure 2 for a depiction of this trend). We used the Akaike information criterion (AIC) to select the optimal change-point.
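The observed-versus-expected comparison can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's model: it takes τ as given rather than selecting it by AIC, fits only a straight line to the baseline period, and omits the cubic term used after the change-point. The function names and the `daily_counts` layout (days before index mapped to total SSD visits) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def excess_visits(daily_counts, tau, min_lag=3, lookback=365):
    """Observed minus expected SSD visits for each day in the window [min_lag, tau].

    The expected count is extrapolated from a linear trend fitted to the
    baseline period (more than tau and up to `lookback` days before the index).
    """
    baseline_days = [d for d in range(tau + 1, lookback + 1) if d in daily_counts]
    intercept, slope = fit_line(baseline_days, [daily_counts[d] for d in baseline_days])
    excess = {}
    for day in range(min_lag, tau + 1):
        expected = intercept + slope * day
        excess[day] = daily_counts.get(day, 0) - expected
    return excess
```

Summing the values returned by `excess_visits` gives an estimate of the number of visits that represent potential diagnostic opportunities under these simplifying assumptions.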

Figure 2

Trend in symptomatically similar diagnosis (SSD)-related healthcare visits prior to index tuberculosis diagnosis. (A) Left: Depicts the number of SSD-related visits each day prior to the index tuberculosis diagnosis summed across all patients and healthcare settings. Before the index tuberculosis diagnosis, there is a large spike in SSD-related healthcare visits. Online supplemental figures 1 and 2 provide similar counts of visits prior to the index diagnosis broken down by healthcare setting and type of SSD, respectively. Similar results are obtained for each healthcare setting and type of SSD. (B) Right: Depicts the same counts but adds trend lines for observed and expected visits. The red line depicts the trend in expected SSD-related visits, which was estimated using data prior to the change-point. The blue line depicts the trend in the observed number of visits after the change-point. The area between the blue and red lines depicts the number of SSD-related visits that represent likely diagnostic opportunities.

To estimate the number of individuals that experienced a potential diagnostic delay, number of recurrent missed opportunities per patient and the typical duration of delays, we used a bootstrapping approach similar to that of Waxman et al.33 Specifically, we randomly drew (with replacement) a sample of patients and re-estimated the observed and expected patterns of care. Next, at each period prior to the index tuberculosis diagnosis, we randomly labelled a portion of visits for the resampled patients as ‘diagnostic delays’ based on the computed excess number of SSD visits at that time period. Finally, we computed the number of patients that experienced a diagnostic delay, the number of recurrent missed opportunities per patient and the durations of the diagnostic delays. We repeated this procedure 25 000 times to compute 95% bootstrap-based CIs for the change-point τ, number of potential diagnostic opportunities, number of patients that experienced a diagnostic delay, number of recurrent missed opportunities per patient and the durations of the diagnostic delays.
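One iteration of the labelling step might look like the sketch below. It is a hedged illustration rather than the authors' code: the visit layout (`(patient_id, days_before_index)` tuples), the function names and the use of the earliest labelled miss as the delay duration are assumptions for this example, and the excess counts are taken as integers supplied by the caller. The full procedure would wrap this in a loop that first resamples patients with replacement and re-estimates the excess, which is not shown.

```python
import random
from collections import defaultdict


def simulate_misses(visits, n_miss_by_day, rng):
    """Randomly label visits as missed opportunities, day by day.

    visits: list of (patient_id, days_before_index) for SSD visits in the window.
    n_miss_by_day: hypothetical integer excess (number of misses to draw) per day.
    Returns {patient_id: [days_before_index of each labelled miss]}.
    """
    by_day = defaultdict(list)
    for patient_id, days_before in visits:
        by_day[days_before].append(patient_id)

    misses = defaultdict(list)
    for day, patients in by_day.items():
        # Never label more visits than actually occurred on that day.
        k = min(len(patients), max(0, n_miss_by_day.get(day, 0)))
        for patient_id in rng.sample(patients, k):
            misses[patient_id].append(day)
    return dict(misses)


def summarise(misses):
    """Summaries among patients with at least one labelled miss."""
    n_patients = len(misses)
    if n_patients == 0:
        return {"patients_with_miss": 0, "misses_per_patient": 0.0, "mean_delay_days": 0.0}
    total_misses = sum(len(days) for days in misses.values())
    # Delay duration: time between diagnosis and the earliest labelled miss.
    durations = [max(days) for days in misses.values()]
    return {
        "patients_with_miss": n_patients,
        "misses_per_patient": total_misses / n_patients,
        "mean_delay_days": sum(durations) / n_patients,
    }
```

Repeating the simulation many times and taking percentiles of each summary would give bootstrap-style intervals of the kind reported in the text.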

Sensitivity analysis

Because diagnostic codes from administrative records may not capture all signs and symptoms present during a clinic visit (eg, in clinic notes), SSD-related ICD-9/10 codes may undercount the true number of visits representing a diagnostic opportunity. As a sensitivity analysis, we repeated our estimates of the incidence of diagnostic delays by including all visits that occurred within the diagnostic opportunity window (regardless of the presence of an SSD code). Specifically, we repeated the change-point and bootstrapping analysis described above using all visits prior to the index tuberculosis diagnosis.

Estimating risk factors for missed diagnostic opportunities

We analysed the potential risk factors for diagnostic delays by estimating the likelihood of a patient experiencing a missed opportunity on a given day prior to diagnosis. We treated diagnostic opportunities as a binary outcome, where a patient who has tuberculosis can experience either a missed opportunity (ie, SSD-related visit in the diagnostic-opportunity window [3, τ]) or a correct diagnosis (ie, the index diagnosis). Because multiple visits occurring on a single day likely represent a linked episode of care, for each day during the diagnostic-opportunity window or the index diagnosis date, we aggregated all visits containing an SSD or index diagnosis. We created indicators on each day for the specific type of healthcare facility (eg, inpatient, outpatient, ED). Days with an SSD-related visit during the diagnostic-opportunity window were assigned an outcome of 1 (ie, missed opportunity) and days representing the index tuberculosis diagnosis were assigned an outcome of 0 (ie, correct diagnosis). We then used logistic regression to estimate the likelihood of a visit representing a missed opportunity, while controlling for other risk factors for delay.
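The day-level outcome coding described above can be sketched as follows. The tuple layout `(days_before_index, setting, kind)`, the `kind` labels and the function name are hypothetical and chosen only for illustration; they are not taken from the study's code.

```python
def label_patient_days(visits, tau, min_lag=3):
    """Aggregate one patient's visits into day-level records for the regression.

    visits: list of (days_before_index, setting, kind), where kind is
    "index" (the index tuberculosis diagnosis), "ssd" or "other".
    Returns {days_before_index: {"outcome": 0 or 1, "settings": set of settings}}.
    """
    days = {}
    for days_before, setting, kind in visits:
        if kind == "index":
            outcome = 0  # correct diagnosis
        elif kind == "ssd" and min_lag <= days_before <= tau:
            outcome = 1  # SSD visit inside the diagnostic-opportunity window
        else:
            continue  # outside the window, or not an SSD visit
        record = days.setdefault(days_before, {"outcome": outcome, "settings": set()})
        record["settings"].add(setting)
    return days
```

Each returned record corresponds to one row of the regression data set, with the `settings` set supplying the facility-type indicators for that day.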

We considered a number of patient-specific and context-specific risk factors for diagnostic delay. Patient demographics include age, sex and region (ie, urban vs rural). Environment-specific and setting-specific factors include the year and month of the SSD visit or the index diagnosis, whether visits during a given day involved inpatient, outpatient, or ED settings, or combinations of visits to multiple settings, and a term for tuberculosis incidence at patient location. Because many symptoms associated with pulmonary tuberculosis are similar to influenza-like illness (ILI), we created an indicator for peak influenza season based on the national level of outpatient ILI as reported by the CDC.37 ILI-based indicator values are provided in online supplemental table 2. Finally, we considered a number of clinical factors: indicators for asthma and COPD prior to the diagnostic-opportunity window were included as markers for pre-existing pulmonary conditions. In addition, indicators for a chest X-ray or a chest CT scan prior to the diagnostic-opportunity window were included because imaging may also indicate pre-existing pulmonary conditions. We also included an indicator for receipt of a fluoroquinolone prior to the delay window. We performed variable selection using backward elimination, evaluating model performance at each stage of the procedure using the AIC. SEs were used to compute Wald-type 95% CIs for the logistic regression analysis.
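For reference, a Wald-type interval for an OR is obtained by exponentiating the coefficient plus or minus a normal quantile times its SE. A minimal sketch follows; the function name is an assumption for this example, and the inputs are a log-odds coefficient and SE such as those produced by any logistic regression routine.

```python
import math
from statistics import NormalDist


def wald_odds_ratio_ci(beta, se, level=0.95):
    """Return (odds ratio, lower limit, upper limit) from a log-odds coefficient and its SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

Because the interval is symmetric on the log scale, the product of the lower and upper limits equals the square of the OR, which is a convenient check on reported values.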

Patient and public involvement

No patients were involved.


Results

From 2001 through 2017, a total of 5681 individuals had a tuberculosis diagnosis and an outpatient prescription drug claim consistent with treatment for active tuberculosis. The final study sample included 3371 enrollees who had been enrolled for at least 1 year prior to the index tuberculosis diagnosis. Figure 3 provides a flow diagram of inclusion criteria. Table 1 presents baseline characteristics (age, sex, enrolment information and region) for the final study cohort.

Figure 3

Flow diagram of patient inclusion and exclusion criteria. Counts of patients excluded and reasons for exclusion used to identify the final 3371 index cases of tuberculosis.

Table 1

Baseline study population characteristics

Figure 2A depicts the pattern of SSD visits that occurred in the 1-year period prior to the index tuberculosis diagnosis. Online supplemental figure 1 depicts similar patterns for all visits and SSD visits broken down by type of healthcare setting and online supplemental figure 2 depicts trends for different categories of individual SSD diagnoses. Across nearly all settings and SSD visits, the pattern of SSD visits appears fairly stable, with a very gradual increase from 1 year up to around 100 days prior to the index diagnosis. Starting around 100 days prior to the index diagnosis, there is a dramatic spike in SSD visits.

Of the 3371 case patients we identified, 3306 (98.1%) patients had at least one healthcare visit in the year prior to their index tuberculosis diagnosis. Of these patients, 1134 (34.3%) had at least one inpatient visit, 1301 (39.4%) had at least one ED visit and 3297 (99.7%) had at least one outpatient visit. Focusing on visits with SSDs, we found 3084 (91.5%) patients had at least one SSD visit in the year prior to their index tuberculosis diagnosis. Over a third of all visits (37.2%) that occurred in the year prior to the index tuberculosis diagnosis involved one of the SSD conditions. The most common category of SSDs prior to index tuberculosis diagnoses was alternative cardio-sino-pulmonary-based diagnoses (2322 (68.9%) patients among 15 332 (17.6%) visits), followed by symptom-based diagnoses (2382 (70.7%) patients among 9086 (10.5%) visits), testing imaging or physical exam-based diagnoses (2123 (63.0%) patients among 8373 (9.6%) visits), and alternative infectious disease-based diagnoses (2129 (63.2%) patients among 7921 (9.1%) visits).

Since not all SSD visits represent diagnostic opportunities, we used a bootstrapping/simulation approach to estimate the number of likely diagnostic opportunities based on the observed and expected number of SSD visits prior to the index tuberculosis diagnosis. Our change-point analysis detected a significant increase in the number of SSD visits occurring 127 days (95% CI 117 to 138 days) prior to the index diagnosis; this represents the start of the diagnostic-opportunity window (ie, maximum duration of delay). Figure 2B summarises the observed and expected trend lines estimated from our change-point analysis. Across all patients, 2903 (86.1%) patients had at least one SSD during this diagnostic-opportunity window.

There was a total of 19 818 SSD visits that occurred during the diagnostic-opportunity window. Of these visits, based on our simulation analysis, we estimated that 10 118 (51.1%) represented a missed opportunity. We also estimated that approximately 528 (5.22%) missed opportunities occurred in inpatient settings, 9001 (88.96%) in outpatient settings and 589 (5.82%) in ED settings. Table 2 presents the estimated number of missed opportunities that each patient experienced. We estimate that 2602 (CI 2549 to 2652) or 77.2% (CI 75.6% to 78.7%) of patients experienced at least one missed opportunity prior to diagnosis. Of the patients who experienced at least one missed opportunity, we estimated that, on average, they experienced 3.89 (CI 3.65 to 4.14) visits representing missed opportunities, occurring in 3.46 (CI 3.24 to 3.69) outpatient visits, 0.20 (CI 0.19 to 0.22) inpatient visits and 0.23 (CI 0.21 to 0.24) ED visits.
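The percentages in this paragraph follow directly from the reported counts. The short check below recomputes them; the variable names are illustrative only, and all numbers are copied from the text.

```python
def pct(part, whole, digits):
    """Percentage of `part` in `whole`, rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


ssd_visits_in_window = 19818
missed_by_setting = {"inpatient": 528, "outpatient": 9001, "ed": 589}

# Setting-specific counts should add up to the total number of missed opportunities.
total_missed = sum(missed_by_setting.values())

share_of_ssd_visits = pct(total_missed, ssd_visits_in_window, 1)
share_by_setting = {k: pct(v, total_missed, 2) for k, v in missed_by_setting.items()}
share_of_patients = pct(2602, 3371, 1)

# Per-patient means by setting should add up to the overall mean of 3.89.
mean_misses_total = 3.46 + 0.20 + 0.23
```

The recomputed values agree with those reported: the setting-specific counts sum to 10 118, and the per-setting means sum to 3.89 missed opportunities per patient.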

Table 2

Estimated number of missed opportunities and duration of diagnostic delay based on simulation model

Table 2 also presents a breakdown of the estimated duration of diagnostic delays among patients who experienced at least one missed opportunity. The mean and median duration of delays were 31.66 (CI 28.51 to 35.11) days and 28.00 (CI 25.00 to 31.00) days, respectively. On average, patients who experienced at least one missed opportunity had a delay between first SSD and diagnosis of 41.00 days (CI 37.54 to 44.77), with 62.1% (CI 58.4% to 65.5%) of these delays lasting 30 or more days.

As a sensitivity analysis, we re-estimated the incidence and duration of diagnostic delays using all visits during the diagnostic opportunity window. In this case, the estimated diagnostic-opportunity window began 136 days prior to diagnosis. Across all patients, 3223 (95.6%) patients had a visit for any reason during this window. There was a total of 44 924 visits that occurred during the diagnostic-opportunity window. We estimated that 14 371 (32.0%) of these visits represented a missed opportunity and 2976 (CI 2923 to 3027) patients had at least one missed opportunity. On average, patients experienced 4.83 (CI 4.42 to 5.34) missed opportunities and had a delay duration of 45.71 days (CI 40.23 to 52.27).

Table 3 presents results of the logistic regression model estimating the likelihood of experiencing a potential missed opportunity during a visit on a given day. The likelihood of a miss was greater among individuals aged ≥65 years, with an OR of 1.262 (CI 1.156 to 1.377). Patients with a history of asthma (OR 1.331 (CI 1.138 to 1.557)) or COPD (OR 1.372 (CI 1.230 to 1.531)) were more likely to be delayed. Patients who had received chest imaging in the year prior to diagnosis but before the diagnostic-opportunity window were more likely to experience a miss (OR of 1.149 (CI 1.081 to 1.296) for chest CT and 1.231 (CI 1.121 to 1.353) for chest X-ray). Patients who received a fluoroquinolone in the year prior to diagnosis but before the diagnostic-opportunity window were more likely to experience a miss (OR 1.578 (CI 1.435 to 1.734)).

Table 3

Regression results for likelihood of experiencing a missed opportunity

Misses were more likely to occur during weekend visits (1.495 (CI 1.272 to 1.758)) and less likely to occur among patients in metropolitan locations (0.874 (CI 0.771 to 0.990)). Missed opportunities were more likely to occur in outpatient settings during periods of high influenza activity (1.259 (CI 1.052 to 1.507)). Missed opportunities were much less likely to occur in inpatient settings. Compared with outpatient settings alone, misses were less likely to occur on days involving only an inpatient visit (0.123 (CI 0.106 to 0.142)), both an inpatient and outpatient visit (0.124 (CI 0.105 to 0.145)), both an inpatient and ED visit (0.142 (CI 0.110 to 0.184)), or all three setting types (0.128 (CI 0.089 to 0.185)). Visits to the ED appeared to increase the odds of a miss. Compared with outpatient settings alone, misses were more likely on days when patients visited ED settings only (2.340 (CI 1.540 to 3.555)).


Discussion

Our results show that the majority of patients diagnosed with pulmonary tuberculosis have multiple interactions with the US healthcare system prior to receiving a diagnosis consistent with active tuberculosis. Many patients present on multiple occasions, each representing possible missed opportunities to diagnose tuberculosis. Approximately 127 days prior to diagnosis, we observed an increase in visits for either symptoms associated with tuberculosis or diseases that share symptoms with tuberculosis. At least 90% of patients have at least one visit with either a code recording a symptom of tuberculosis or a disease that shares similar symptoms. Common diagnoses included pneumonia, respiratory infections and other pulmonary conditions. Diagnoses based on symptoms most frequently listed included fever, cough, haemoptysis and weight loss. A considerable proportion of patients experienced multiple visits representing missed opportunities to diagnose tuberculosis: 23.8% of patients had more than five possible missed opportunities.

We identified a number of risk factors for diagnostic delays. First, we found that delays are more common for patients who visited the ED, without an inpatient visit on the same day. Diagnostic errors may occur commonly in the ED setting: an estimated 12% of patients who revisit the ED do so because of an original misdiagnosis.38 In the ED, physicians are often treating patients they see for the first time and may be unaware of medical histories. In addition, many patients have vague symptoms and a range of severity.39 Also, ED physicians frequently care for multiple different patients concurrently. In one study, ED physicians were caring for a median of five patients at one time, and they were interrupted an average of 30.9 times during a 180-minute study period.40 Finally, when diagnostic errors do occur, ED physicians may not be able to learn from missed diagnostic opportunities because follow-up care occurs in other healthcare settings.

Additional risk factors that we identified included female sex and older age. Other studies have identified females as at higher risk for delays,8 41 42 and there is a need to investigate the cultural, biological or epidemiological factors responsible for this finding. Also, similar to the findings of others, we found that older adults are at increased risk for diagnostic delays.8 11 41 Older patients may be at greater risk because of more comorbidities or because they are less likely to exhibit some of the classic signs and symptoms of tuberculosis, perhaps due to the immunosenescence associated with ageing. In addition to female sex and older age, several investigations also highlight the risk of fluoroquinolone use for increasing diagnostic delays.43–45 Because fluoroquinolones have some antituberculosis activity, their inappropriate use prior to the diagnosis of tuberculosis (eg, to empirically treat a misdiagnosed bacterial pneumonia) may transiently improve symptoms.

In addition to established risk factors, our results highlight two novel risk factors for delay. First, we found that patients with a history of pulmonary diseases, specifically asthma or COPD, were more likely to experience a delayed tuberculosis diagnosis. Other groups have found that other comorbidities, especially pulmonary diseases, were associated with delays46; however, we also found that pulmonary imaging (prior to the risk window) was associated with delays. Prior history of pulmonary disorders may be a risk factor because it can create a cognitive bias among clinicians. For patients with a history of asthma or COPD presenting with respiratory symptoms, tuberculosis may be less likely to be considered as part of a differential diagnosis. While patients with a history of pulmonary imaging prior to the diagnostic window, presumably because of some long-standing pulmonary complaint, are more likely to experience a delay, delays are less common if patients received imaging during the diagnostic window because pulmonary imaging would help confirm a tuberculosis diagnosis. Our second novel finding is also related to cognitive bias. We found that if a patient presents during the influenza season, that is, during periods of high ILI activity, they are more likely to experience a delayed diagnosis of tuberculosis. This finding may reflect the fact that ILI symptoms and tuberculosis symptoms often overlap (eg, fever, cough), and clinicians may be more likely to suspect influenza during a period of increased activity.

Our study has a number of limitations. First, we use diagnostic codes to identify tuberculosis cases. While such codes have limited specificity for identifying active tuberculosis,30 we used medications to validate our case definition, an approach previously used for identifying tuberculosis.31 Second, we rely on claims data to determine the reason for visits prior to the tuberculosis diagnosis. Not all symptoms present during a visit are recorded in the insurance claim (eg, a patient visit for hypertension may also involve an unrecorded symptom of cough). Indeed, in our sensitivity analysis, the number and duration of diagnostic delays increased slightly when including all visits during the diagnostic-opportunity window, regardless of the presence of SSD-related diagnosis codes. In addition, some patients may have experienced diagnostic delays exceeding our detected opportunity window; such delays would not be detected by our change-point algorithm because the volume of such visits is low. Thus, our results may underestimate the true number of visits that represent missed opportunities or the duration of longer individual delays. Third, our data do not contain race or ethnicity. Tuberculosis is much more common among immigrants and family members of immigrants. In other studies of low-incidence countries, delays were more common among non-immigrant populations.8 12 46 47 Fourth, our dataset is restricted to a privately insured population, with employer-sponsored health insurance and/or supplemental Medicare coverage. Thus, our findings may not be generalisable to an uninsured population or individuals with Medicaid coverage. However, vulnerable populations in inner cities or patients experiencing homelessness may be less likely to experience a delay.7 Finally, our study excluded extrapulmonary tuberculosis cases, and future work should focus on such cases given that they are at even greater risk for diagnostic delays.8 12 46

Despite these limitations, our results highlight the number of missed opportunities to diagnose tuberculosis. Risk factors for diagnostic delays include older age, female sex and living in a lower-incidence area. In addition, we identified new risk factors, including existing pulmonary conditions, previous pulmonary imaging and circulating influenza. These novel risk factors are directly related to cognitive biases that will need to be overcome to improve the timely diagnosis of tuberculosis.




  • Twitter @AlanArakkal

  • Contributors ACM: designed the study, developed the methodological approach, drafted and revised the final manuscript, and helped to obtain funding for the research. ATA: helped to conduct statistical analysis, helped draft the methods and results section, reviewed and revised the final manuscript. SK: helped to conduct statistical analysis, helped draft the methods and results section, reviewed and revised the final manuscript. JC: helped in developing the methodological approach, provided guidance on the statistical analysis, reviewed and revised the final manuscript. AKG: provided clinical expertise and feedback, helped review and revise the final manuscript, and helped to obtain funding for the research. DBH: provided clinical expertise and feedback, helped review and revise the final manuscript. PMP: helped conceive the study objective, provided clinical guidance, drafted and revised the final manuscript, and helped to obtain funding for the research.

  • Funding This work was supported by the Agency for Healthcare Research and Quality grant number 5R01HS027375, and PMP has received funding from the National Center For Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR002537.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data may be obtained from a third party and are not publicly available. The IBM Marketscan Research Databases can be obtained from IBM Watson Health. The code used for the simulation and statistical analysis is available on GitHub at

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
