Original research
Association between primary care physician diagnostic knowledge and death, hospitalisation and emergency department visits following an outpatient visit at risk for diagnostic error: a retrospective cohort study using Medicare claims
  1. Bradley M Gray1,
  2. Jonathan L Vandergrift1,
  3. Rozalina G McCoy2,
  4. Rebecca S Lipner1,
  5. Bruce E Landon3
  1. 1 Assessment and Research, American Board of Internal Medicine, Philadelphia, Pennsylvania, USA
  2. 2 Division of Endocrinology, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
  3. 3 Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts, USA
  1. Correspondence to Dr Bradley M Gray; bgray{at}


Objective Diagnostic error is a key healthcare concern and can result in substantial morbidity and mortality. Yet no study has investigated the relationship between adverse outcomes resulting from diagnostic errors and one potentially large contributor to these errors: deficiencies in diagnostic knowledge. Our objective was to measure the associations between diagnostic knowledge and adverse outcomes after visits to primary care physicians that were at risk for diagnostic errors.

Setting/participants 1410 US general internists who recently took their American Board of Internal Medicine Maintenance of Certification (ABIM-IM-MOC) exam treating 42 407 Medicare beneficiaries who experienced 48 632 ‘index’ outpatient visits for new problems at risk for diagnostic error because the presenting problem (eg, dizziness) was related to prespecified diagnostic error sensitive conditions (eg, stroke).

Outcome measures 90-day risk of all-cause death, and, for outcome conditions related to the index visit diagnosis, emergency department (ED) visits and hospitalisations.

Design Using a retrospective cohort study design, we related physician performance on ABIM-IM-MOC diagnostic exam questions to patient outcomes during the 90-day period following an index visit at risk for diagnostic error after controlling for practice characteristics, patient sociodemographic and baseline clinical characteristics.

Results Rates of 90-day adverse outcomes per 1000 index visits were 7 for death, 11 for hospitalisations and 14 for ED visits. Being seen by a physician in the top versus bottom third of diagnostic knowledge during an index visit for a new problem at risk for diagnostic error was associated with 2.9 fewer all-cause deaths (95% CI −5.0 to −0.7, p=0.008), 4.1 fewer hospitalisations (95% CI −6.9 to −1.2, p=0.006) and 4.9 fewer ED visits (95% CI −8.1 to −1.6, p=0.003) per 1000 visits.

Conclusion Higher diagnostic knowledge was associated with lower risk of adverse outcomes after visits for problems at heightened risk for diagnostic error.

  • internal medicine
  • medical education & training
  • general medicine (see internal medicine)

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • Unique diagnostic knowledge measure linking diagnostic knowledge with adverse outcomes.

  • Scalable adverse outcome measures and extensive sensitivity analyses.

  • Our assessment of diagnostic error is indirect (as indicated by adverse outcomes).

  • Results are subject to selection bias if the mix of index visits or the severity of the patients or practice support differed for physicians with different levels of diagnostic knowledge.

  • Results are only generalisable to physicians who elected to attempt the American Board of Internal Medicine’s certification exam and were about 10 years past initial certification, and to patients older than 65.


Diagnostic error has been identified as a key healthcare delivery concern and contributes to significant potentially preventable morbidity and mortality.1–3 Ambulatory care, and especially primary care, is a practice setting with a particularly high risk for diagnostic error4 5 because of the wide variety of presentations encountered and the concomitant difficulty of distinguishing harmful conditions from routine self-limited problems, compounded by the well-known time constraints faced by practitioners in that setting. It has been estimated that at least 5% of ambulatory visits are associated with diagnostic error, half of which may result in considerable patient harm. Diagnostic error is a common cause of malpractice suits and most frequently occurs in the ambulatory care settings.6 7

Deficiencies in diagnostic knowledge are likely to be an important contributor to these diagnostic errors that could impact, for example, the breadth of diagnoses considered, appropriate ordering and interpretation of tests and/or synthesis of data more generally.8–11 Because of this, measuring physician diagnostic knowledge has become a major focus of organisations throughout the developed world that are tasked with licensing and certifying physicians with the underlying, although largely untested, hypothesis being that diagnostic knowledge will be a measurable and strong predictor of diagnostic error.12–15 Testing this hypothesis and quantifying this relationship are therefore a critical public policy concern both in terms of the importance of board certification and other programmes designed to enhance lifelong learning for physicians.

In the USA, the American Board of Internal Medicine (ABIM) is a leading organisation that certifies primary care physicians, most notably general internists. In fact, most general internists in the USA are certified by the ABIM and these physicians represent about 45% of all adult primary care physicians in the USA.16 Unlike medical licensure, board certification is not a legal requirement to practice medicine in the USA, though many hospitals require board certification as one criterion to obtain privileges and insurers often require board certification to be included in covered physician panels.17 18 To maintain their certification, general internists must pass an initial certifying exam and, periodically, pass a recertification exam thereafter (referred to as Maintenance of Certification (MOC) exams).19 20 Diagnostic knowledge is a major component of these exams representing about half of all exam questions for the Internal Medicine MOC (IM-MOC) exam.

One explanation for the lack of research on this topic is the difficulty in studying the relationship between general diagnostic knowledge and diagnostic error because of the inability to quantify diagnostic knowledge and identify diagnostic errors at a population level, especially in the outpatient setting.21 We address this gap in the literature by applying a unique measure of diagnostic knowledge, performance on diagnostic-related questions on ABIM’s IM-MOC exam, and relating this measure to deaths, hospitalisations and emergency department (ED) visits that occurred after outpatient visits for new problems at heightened risk for diagnostic error.


Physician and index visit sample

Our physician sample included general internists who were initially ABIM board certified in 2000 and took their IM-MOC exam between 2008 and 2011 (figure 1). We identified Medicare beneficiary outpatient Evaluation & Management visits with these physicians using their National Provider Identifier during the calendar year following their exam (2009–2012). These patients were age 65 or older and continuously enrolled in Medicare fee-for-service (Medicare insures most of the US population over 65) during the physician’s 1-year follow-up period and the year prior. To ensure that any presenting problems being evaluated were new (ie, not follow-up), we restricted these visits to those that were the first visit for a new problem (the ‘index visit’), defined as visits preceded by a 90-day clean period with no previous inpatient or outpatient visit. The 90-day clean period is consistent with the US government Centers for Medicare and Medicaid Services criteria used by its Bundled Payments for Care Improvement Programme for defining new episodes of care and with the patterns of visits we observed (see online supplemental appendix section 1 for related analysis).22 23
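The clean-period rule can be illustrated with a minimal sketch. It is not the authors' code: the function name, the input layout (plain lists of dates for one patient) and the treatment of an encounter exactly 90 days earlier are assumptions made for illustration, and the sketch presumes that the lookback window is fully observed, which the continuous-enrolment requirement is meant to guarantee.

```python
from datetime import date, timedelta

CLEAN_PERIOD_DAYS = 90  # clean period length stated in the text


def find_index_visits(candidate_visits, all_encounters, clean_days=CLEAN_PERIOD_DAYS):
    """Return candidate visit dates with no encounter in the preceding clean period.

    candidate_visits: dates of outpatient visits with the study physician.
    all_encounters: dates of every inpatient or outpatient encounter for the patient.
    Both inputs are hypothetical structures, not the Medicare claims layout.
    """
    index_visits = []
    for visit in sorted(candidate_visits):
        window_start = visit - timedelta(days=clean_days)
        # Assumption: an encounter exactly clean_days earlier still breaks the clean period.
        preceded = any(window_start <= enc < visit for enc in all_encounters)
        if not preceded:
            index_visits.append(visit)
    return index_visits


encounters = [date(2010, 1, 5), date(2010, 1, 20), date(2010, 6, 1)]
print(find_index_visits(encounters, encounters))
```

In this toy history the first and third visits qualify as index visits, while the 20 January visit does not because it follows another encounter by only 15 days.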

Figure 1

Sample selection. MOC, Maintenance of Certification.

We further restricted these index visits to those at heightened risk for diagnostic errors because the recorded diagnosis in the Medicare claims (the ‘index visit diagnosis’), which includes recording of symptoms (eg, loss of balance), could have been the initial presenting problem for one or more of 13 prespecified diagnostic error sensitive conditions such as congestive heart failure or bacteraemia/sepsis (see table 1). These 13 conditions (see online supplemental appendix section 2 for a list and applicable International Classification of Diseases, Ninth Revision codes (ICD-9 codes)) were an acute non-cancerous subset of 20 conditions previously noted by Schiff et al to be at high risk for serious diagnostic error.24 For instance, index visits with diagnosis codes for chest pain, dyspepsia, shortness of breath, hypoxaemia/hypoxia, respiratory distress, weakness/fatigue, oedema or ascites could all be the initial presentation of congestive heart failure, which is one of the 13 diagnostic error sensitive conditions.

Table 1

Frequency of index visits related to each diagnostic error sensitive condition

We used a three-step process to identify eligible index visit diagnoses. First, two physician authors (RGM and BL) identified all diagnoses that could be presenting problems for the 13 diagnostic error sensitive conditions: what problems/diagnoses might someone who ultimately presented with a diagnostic-error sensitive condition have presented with initially? Second, because the original list of identified index visit diagnoses was large (76), we reduced this list to 38 by applying a relative risk (RR) criterion. For a specific index visit diagnosis to meet this criterion, index visits with that diagnosis had to have a greater proportion of later ED visits or hospitalisations with the related outcome condition discharge diagnosis than index visits where the specific at risk diagnosis was not present. For example, dizziness was chosen as an eligible index visit diagnosis for stroke, one of the diagnostic error sensitive conditions, both because it was identified as a potential presenting symptom of a stroke by physician authors and because index visits with that diagnosis had a greater proportion of later hospitalisation or ED visits for stroke than visits without this diagnosis. Third, we also included index visits where the actual diagnosis was one of the 13 diagnostic error sensitive conditions because we wanted to include cases where diagnostic errors were and were not made. Therefore, we also included index visits with a diagnosis of congestive heart failure itself as being at risk for the underlying condition congestive heart failure.
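The RR criterion in the second step can be sketched as a simple comparison of proportions. This is an illustration under stated assumptions, not the authors' implementation: the record layout (dictionaries with `diagnoses` and `later_outcomes` sets), the function name and the toy visits are all hypothetical.

```python
def meets_rr_criterion(visits, diagnosis, outcome_condition):
    """Check whether visits with `diagnosis` have a higher proportion of later
    ED visits/hospitalisations for `outcome_condition` than visits without it.

    visits: list of dicts with two sets, 'diagnoses' and 'later_outcomes'
    (a hypothetical layout chosen for this sketch).
    """
    with_dx = [v for v in visits if diagnosis in v["diagnoses"]]
    without_dx = [v for v in visits if diagnosis not in v["diagnoses"]]
    if not with_dx or not without_dx:
        return False  # the comparison is undefined when either group is empty

    def outcome_rate(group):
        return sum(outcome_condition in v["later_outcomes"] for v in group) / len(group)

    # A relative risk above 1 is equivalent to the first rate exceeding the second.
    return outcome_rate(with_dx) > outcome_rate(without_dx)


# Toy data, invented purely to exercise the function.
toy_visits = [
    {"diagnoses": {"dizziness"}, "later_outcomes": {"stroke"}},
    {"diagnoses": {"dizziness"}, "later_outcomes": set()},
    {"diagnoses": {"cough"}, "later_outcomes": set()},
    {"diagnoses": {"rash"}, "later_outcomes": set()},
]
print(meets_rr_criterion(toy_visits, "dizziness", "stroke"))
```

Comparing the two proportions directly avoids dividing by zero when the comparison group has no outcome events at all.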

Outcome measures

We examined the risk of three serious adverse outcomes within 90 days of the index visit that we hypothesised would occur more frequently in cases of misdiagnosis: all-cause mortality, hospitalisations and ED visits. We did not count these events as adverse outcomes if they occurred on the same day as the index visit because this may reflect a positive action (the physician correctly diagnosed a patient with stroke and referred/admitted them to the hospital) or be unavoidable regardless of the accuracy of the index visit diagnosis (a patient who exhibited stroke symptoms died despite being admitted to the hospital immediately). Based on Medicare billing codes, hospitalisations were limited to non-elective hospitalisations initiated through the ED or trauma centre. The ED and hospitalisation outcomes were also limited to cases where the discharge diagnosis was for one of the 13 diagnostic error sensitive conditions following an index visit with the applicable diagnosis. We therefore presumed that these discharge diagnoses were a reasonable representation of the underlying condition of the patient at the time of the index visit. For example, we would count a hospitalisation with a discharge diagnosis of stroke as an adverse outcome if it occurred after an index visit for dizziness because dizziness was identified as being a potential presenting problem for stroke. However, we did not count hospitalisations with a discharge diagnosis for acute coronary syndrome following an index visit for dizziness because dizziness was not identified as a presenting problem for acute coronary syndrome. The rationale is that if there were no presenting problems during the index visit related to acute coronary syndrome, either because the underlying condition was not present or could not be detected at the time of the index visit, then the index visit physician could not have prevented the hospitalisation regardless of their diagnostic knowledge.
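The counting rule for ED visits and hospitalisations can be sketched as follows. The mapping below contains only pairings mentioned in the text and is a stand-in for the full list in the supplemental appendix; the function and variable names are hypothetical.

```python
from datetime import date, timedelta

# Illustrative subset only: presenting problems the text links to each condition.
PRESENTING_PROBLEMS = {
    "stroke": {"dizziness"},
    "congestive heart failure": {"chest pain", "shortness of breath", "oedema"},
}


def is_adverse_outcome(index_date, index_dx, event_date, discharge_condition, window_days=90):
    """Apply the rule described above to one ED visit or hospitalisation."""
    days_after = (event_date - index_date).days
    # Same-day events are excluded; events after the 90-day window are out of scope.
    if days_after < 1 or days_after > window_days:
        return False
    # An index diagnosis of the condition itself also qualifies (the third selection step).
    if index_dx == discharge_condition:
        return True
    return index_dx in PRESENTING_PROBLEMS.get(discharge_condition, set())


index_day = date(2010, 3, 1)
print(is_adverse_outcome(index_day, "dizziness", date(2010, 3, 20), "stroke"))
print(is_adverse_outcome(index_day, "dizziness", date(2010, 3, 20), "acute coronary syndrome"))
```

The first call counts as an adverse outcome and the second does not, mirroring the dizziness examples in the paragraph above.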

Measure of diagnostic knowledge

Our measure of diagnostic knowledge was calculated as the per cent of correct answers on the IM-MOC exam for questions previously coded as ‘diagnosis-related’ by ABIM’s IM-MOC exam committee. In our study, these questions comprised 53% of all IM-MOC exam questions, with the remaining 42% addressing treatment and 5% related to other topics such as epidemiology or pathophysiology. More generally, exam questions are designed to replicate real world clinical scenarios and/or patient encounters and without reliance on rote memorisation.25 26

The ABIM exam committee coded each question based on the primary function tested to assure that the exam covers care typically rendered by outpatient primary care physicians. Questions coded as diagnosis related typically test knowledge and skills related to diagnostic inference, differential diagnosis and diagnostic testing and therefore are measuring diagnostic knowledge and related decision-making. Psychometric analysis indicates that scores on diagnosis related exam questions were meaningfully correlated (ie, Cronbach’s alpha score of 0.84), and thereby represent an independent underlying construct that could be interpreted as diagnostic knowledge (see online supplemental appendix section 3 for more details).27 Similarly, this analysis indicated that questions coded as treatment related also represent an independent underlying construct (ie, Cronbach’s alpha score of 0.75). Although performance on diagnosis related questions was correlated with performance on treatment related questions (Pearson correlation=0.62), 59.5% of the variation in diagnosis exam performance for the physician study sample was not explained by performance on other parts of the exam.
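For readers unfamiliar with Cronbach's alpha, the standard formula can be sketched in a few lines. This uses the textbook definition with sample variances; whether the authors' psychometric analysis used exactly this variant is an assumption, and the response matrix is invented toy data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a matrix of item scores.

    item_scores: one row per examinee, one column per question
    (for example 1 = correct, 0 = incorrect).
    """
    k = len(item_scores[0])
    # Variance of each question's scores across examinees.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each examinee's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy responses from five examinees on three questions.
toy_responses = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1]]
print(round(cronbach_alpha(toy_responses), 3))
```

Alpha approaches 1 when questions move together across examinees, which is the sense in which a value of 0.84 supports treating the diagnosis questions as one construct.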

Statistical methods

Using Probit regression, we estimated the associations with each adverse outcome, with standard errors adjusted for correlations resulting from the nesting of visits within patients within physicians.28 29 To measure associations with diagnostic knowledge, we included categorical regression explanatory variables for top and middle third of per cent correct scores on diagnosis-related questions (bottom third was the reference category). Other exam level explanatory variables included tertile indicators for performance on treatment-related questions and performance on other question types. Since these variables measure knowledge unrelated to diagnosis, they account for correlations between factors such as unmeasured practice or patient characteristics that might be correlated with exam performance and our outcome measures (eg, high scoring physicians may be more likely to practice in an academic setting or other such settings that might be independently related to diagnostic error). Exam form indicators accounted for differences in exam difficulty across exam administrations.
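The tertile coding of the knowledge measure can be sketched as below. How the authors placed the cut points and handled tied scores is not stated, so the rank-based split here is an assumption, and the physician identifiers and scores are invented.

```python
def tertile_indicators(scores):
    """Map each physician to middle-third and top-third dummy variables.

    scores: dict of physician id -> per cent correct on diagnosis questions.
    The bottom third is the reference category, so both dummies are 0 for it.
    Assumption: ties are broken by sort order, which may differ from the study.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    indicators = {}
    for rank, physician in enumerate(ranked):
        tertile = (3 * rank) // n  # 0 = bottom, 1 = middle, 2 = top
        indicators[physician] = {
            "middle_third": int(tertile == 1),
            "top_third": int(tertile == 2),
        }
    return indicators


# Hypothetical scores for six physicians.
toy_scores = {"a": 60, "b": 65, "c": 72, "d": 75, "e": 83, "f": 88}
print(tertile_indicators(toy_scores))
```

Entering the two dummies in the regression makes each coefficient a contrast against the bottom third, which is how the top versus bottom comparisons in the results are framed.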

We also included physician, patient and visit level regression controls. Physician level controls included: practice size (indicators for solo practice and practices larger than 50 physicians), practice type (indicators for academic, group), demographic (gender) and training characteristics (medical school location interacted with country of birth). Patient level controls included: demographic characteristics (age and age squared, gender and race/ethnicity indicators) and a Medicaid eligibility indicator. Lagged patient risk adjusters included 27 indicators for chronic conditions and Medicare’s Hierarchical Condition Category (HCC) risk adjustment score. We imputed values for a small number of missing values for controls (see online supplemental appendix section 4). Patient index visit location level controls included: an indicator for residing in a rural ZIP code, ZIP code median household income, and indicators for 10 US Health and Human Services regions. Index visit level controls included: indicators of any outpatient visit, hospitalisation or ED visits within the prior year and number of days since the most recent of these events, and visit year indicators to control for secular changes in quality. We also included an indicator for whether or not the patient had a previous contact with the index visit physician during the year prior to the index visit to account for differences in physician–patient continuity (see online supplemental appendix section 5 for a full list of controls).

Sensitivity analysis

We performed numerous sensitivity analyses to test the robustness of our results (detailed in online supplemental appendix section 6). First, we expanded the index visit sample to include all index visits with the original 76 diagnoses identified by the physician authors regardless of whether they met the RR criterion. Second, we expanded and contracted the index visit clean period by 7 days. Third, we excluded hospitalisations or ED events occurring the day after the index visit, in addition to same day events, to consider the possibility that they might be triggered by a correct diagnosis and therefore should not have been considered adverse outcomes. Fourth, we considered the possibility that our results were biased due to omitted variables correlated with practice size. For example, it could be that physicians in large practices have greater access to specialists or other physicians for informal consultations than those in small practices and therefore outcomes for these physicians may be less sensitive to their knowledge. To examine this possibility, we estimated associations with knowledge and our two utilisation measures across a sample of physicians in either small (≤10 physicians, 54.5% (768/1410) of physicians) or large practices (>50 or in academic medical centres, 23.7% (334/1410) of physicians). We did not conduct these sensitivity analyses for death because there were too few deaths in the subgroups to allow us to reliably estimate the associations (eg, 39 deaths for physicians in large practices). Fifth, to consider the possibility that these outcomes were only avoided because the patient died, for the ED and hospitalisation outcome, we also included instances where the patient died. Sixth, as a falsification test we limited the index visits to those that were unrelated to the 13 diagnostic error sensitive conditions. Under this sensitivity analysis, we expected that the associations with diagnostic knowledge would decline.
The index visit physician’s diagnostic knowledge cannot impact a future adverse outcome if the underlying condition that caused that outcome was not present or detectable at the time of the index visit. Therefore, this reduction in association should be especially true for the hospitalisation and ED measures where adverse outcomes were limited to the 13 diagnostic error sensitive conditions and so were unrelated to the index visit diagnoses in this sensitivity analysis. Similarly, for the last sensitivity analysis, we applied elective hospitalisations as an outcome measure to consider the possibility that there could be a correlation between the overall propensity to hospitalise in an area and physician knowledge.

All analyses were performed using Stata V.15.

Patient and public involvement

Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.


Of 2492 general internists who initially certified in 2000 and who took an IM-MOC exam between 2009 and 2012, 1722 had outpatient visits with a fee-for-service Medicare beneficiary during the study period. Those without visits generally practised hospital medicine. Of the 1722, 1410 were included in the study because they had at least one outpatient index visit that met our study inclusion criteria during the year after they took their IM-MOC exam. In total, 48 632 index visits with 42 407 patients treated by 1410 physicians met study inclusion criteria (figure 1). Table 1 lists frequency of index visits and subsequent outcomes for each diagnostic error sensitive condition.

The mean per cent correct on diagnosis questions ranged from 84.3% among top third performers to 65.5% among bottom third performers (table 2). Patient and visit characteristics were similar across tertiles of physician diagnostic knowledge. For example, there were no statistically significant differences in the HCC risk adjuster across tertiles (p=0.19). However, there were differences in some physician and practice characteristics. When compared with physicians in the bottom tertile of diagnostic knowledge, physicians in the top tertile were significantly less likely to be in solo practice (12.8% vs 24.4%, p=0.009), and more likely to be in academic practice (9.7% vs 3.4%, p<0.001). However, the proportion graduating from a US medical school was similar across diagnostic knowledge tertiles (70.0% vs 63.3%, p=0.30).

Table 2

Physician and patient characteristics by diagnostic exam performance tertile

Associations between diagnostic knowledge and patient adverse outcomes

The overall rates of 90-day adverse outcomes per 1000 index visits were 6.5 for death, 11.1 for hospitalisations and 13.6 for ED visits (with the latter two directly associated with one of the diagnostic error sensitive conditions whose antecedent was present in the applicable index visit). Being seen by a physician scoring in the top versus bottom third of diagnostic knowledge on the MOC exam was associated with 2.9 fewer deaths per 1000 visits (95% CI −5.0 to −0.7, p=0.008), which reflects a 35.3% lower risk of death (95% CI −52.8 to −11.2, p=0.008, table 3). Our findings also suggest that this difference in exam performance was associated with 4.1 fewer applicable hospitalisations (95% CI −6.9 to −1.2, p=0.006), and 4.9 fewer applicable ED visits (95% CI −8.1 to −1.6, p=0.003) per 1000 visits (table 3). These reductions correspond to about a 30% lower risk for these utilisation measures (hospitalisations: −30.5%, 95% CI −46.1 to −10.4, p=0.003, ED: −29.8%, 95% CI −44.4 to −11.4).

Table 3

Associations with diagnostic knowledge and adverse events per 1000 index visits

We also found a significant knowledge tertile dose response relationship across all three regression adjusted RR measures (p-trends<0.008). For example, the regression-adjusted 90-day risk of death per 1000 patients whose index visit physician scored in the top third of diagnostic knowledge was 5.2 (95% CI 4.1 to 6.3), compared with 6.5 (95% CI 5.4 to 7.6) for the middle third, and 8.1 (95% CI 6.5 to 9.7) for the bottom third (p-trend=0.008).
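The link between the absolute and relative differences for death can be checked with simple arithmetic on the rounded adjusted rates quoted above. This is a back-of-the-envelope sketch, not the regression-based calculation the authors performed, and the function name is hypothetical.

```python
def rate_differences(top_rate, bottom_rate):
    """Return the absolute difference (per 1000 visits) and the relative
    difference (per cent of the bottom-third rate) between two adjusted rates."""
    absolute = top_rate - bottom_rate
    relative_pct = 100 * absolute / bottom_rate
    return absolute, relative_pct


# Adjusted 90-day deaths per 1000: top third 5.2, bottom third 8.1 (from the text).
absolute_diff, relative_diff = rate_differences(5.2, 8.1)
print(round(absolute_diff, 1), round(relative_diff, 1))
```

The rounded inputs reproduce the 2.9 fewer deaths per 1000 visits and give a relative reduction of roughly 36%, close to the reported 35.3%; the small gap is presumably due to rounding of the published rates.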

Sensitivity analyses

Our sensitivity analyses (online supplemental appendix section 6) confirmed that base case associations with diagnostic knowledge were robust to different index visit clean periods, and diagnosis code inclusion criteria and next day coding of outcome measures. Associations with diagnostic knowledge were also fairly robust to physician’s practice size for both the ED and hospitalisation measures when we limited the sample to either small or large or academic practices.

Suggesting that our results were not influenced by omitted variable bias, we found that associations with diagnostic knowledge and our outcome measures became small and statistically insignificant when we limited the sample to index visits with diagnoses unrelated to any of the 13 diagnostic error sensitive conditions, and so were at lower risk for diagnostic error (p>0.50 and associations were at most about a tenth of the base case per cent difference between top and bottom third of diagnostic knowledge). We also found no significant association between lack of diagnostic knowledge and elective hospitalisations (p=0.63).


We found that higher diagnostic knowledge among US outpatient internal medicine physicians was associated with significant reductions in subsequent adverse outcomes whose cause was at risk for diagnostic error. Indeed, for every 1000 index visits for a new problem at risk for diagnostic error, being seen by a physician in the top versus bottom third of diagnostic knowledge was associated with 2.9 fewer all-cause deaths and, for diagnostic error sensitive conditions, 4.1 fewer hospitalisations and 4.9 fewer ED visits within 90 days. These figures correspond to a reduction in risk for these adverse events by about a third. Although some prior studies have demonstrated the high morbidity and mortality of diagnostic error,1–3 this is the first study to demonstrate and quantify the direct association between patients’ serious adverse outcomes and the diagnostic knowledge of their first contact primary care physician. These findings support the notion that gaps in diagnostic knowledge between physicians may be an important contributor to the diagnostic error problem plaguing the healthcare system worldwide.

We measured the association between diagnostic knowledge and potential diagnostic error by using Medicare claims data to identify patients who presented for outpatient visits with problems at heightened risk for serious diagnostic errors and examining the occurrence of clinically relevant adverse outcomes soon thereafter. Although this approach lacks the precision of individual chart audits,7 it is both clinically plausible and scalable in that it can be used to monitor the care of large numbers of patients, making the method itself an important contribution to the literature on diagnostic error. Although we did not directly measure diagnostic errors through chart audits, several findings together indicate that the associations we report in this study were likely driven by diagnostic errors that occurred during these visits: we found associations between diagnostic knowledge and the diagnostic error sensitive outcome conditions we studied, we did not find associations with treatment knowledge, and we did not find associations when the underlying diagnostic error sensitive condition was likely not present during the outpatient index visit because no antecedent diagnoses were recorded. Furthermore, our approach builds on prior studies that used claims data to infer diagnostic error incidence for ED visits, in that we identified index visit diagnoses at risk for diagnostic error that were clinically plausible and verified empirically, and we assured that we were studying new problems by requiring that the patient had not had an ED, hospital or outpatient visit over the previous 3 months.30–32 We expanded on these studies by focusing on outpatient care and by examining a much more comprehensive set of presenting problems that may have been precursors to one of 13 diagnostic error prone conditions that we studied. This approach was necessary in order to study diagnostic error in the lower acuity setting of outpatient general internal medicine.

Our findings suggest an association between diagnostic knowledge and adverse outcomes. Yet, there are important limitations to consider. We did not directly determine whether a diagnostic error had occurred through such validated means as a chart review. Our findings cannot be interpreted as causal given the cross-sectional nature of our study, so we cannot rule out the possibility that observed associations were the result of omitted variable bias related to either physician or patient characteristics, and do not reflect a causal relationship between diagnostic knowledge and adverse outcomes. That said, there is no reason to believe that these characteristics would be correlated with diagnostic knowledge independent of treatment knowledge, which we were able to control for, as both these knowledge measures should be similarly correlated with unobserved factors such as the ability to consult colleagues. Furthermore, had associations with diagnostic knowledge been driven by omitted variable bias then we would have expected them to be similar when estimated across index visits with lower or higher risk for diagnostic error, and they were not. We also found that diagnosis exam performance was not associated with elective hospitalisations, which are, presumably, unrelated to underlying diagnostic knowledge but may be related to the overall propensity to hospitalise. That said, the fact that practice size was found to be correlated with diagnostic exam performance is concerning. For example, as described above, practice size could be correlated with access to specialists that in turn might be related to our outcome measures. However, sensitivity analyses indicate that associations with knowledge and our utilisation adverse outcome measures were fairly similar across physicians’ practice size/type (small, and large or academic).
An additional limitation is that we studied select conditions among older patients enrolled in the Medicare programme so we cannot extrapolate these findings to a younger population, other conditions we did not consider, or populations with no or different health insurance coverage. Our findings might also not be applicable to older physicians who certified before 2000 or younger physicians who certified after 2000 as well as physicians who choose not to attempt an exam. While a physician’s clinical knowledge might be related to their decision not to take the MOC exam, and therefore not to maintain their certification, other factors certainly play a role in this decision.

Another limitation of our study is that the IM-MOC exam was specifically designed to measure clinical knowledge in general; it was not designed to measure diagnostic knowledge specifically. That said, diagnostic knowledge is a major component of the exam and was found to meet the criteria for measuring this underlying construct. Also, diagnostic error may have stemmed from factors outside of inadequate diagnostic knowledge, which are not covered by the exam but could be correlated with our exam based diagnostic knowledge measure (eg, poor patient/physician communication skills and related system failures).33 34 That said, there is no reason to believe that these other contributors to diagnostic error would not also be correlated with the other aspects of the exam we do account for. Furthermore, based on an analysis of malpractice claims, Newman-Toker et al 6 reported that clinical judgement played an important role in 86% of diagnostic errors, while poor patient/physician communication and system failures played a role in far fewer diagnostic errors that resulted in malpractice suits (35% and 22%, respectively). Suggesting that improving communication will not reduce stroke related diagnostic error, Kerber and Newman-Toker35 reported that frontline providers rarely ask the right questions when patients present with dizziness. Communication ability is only valuable in terms of reducing diagnostic error if the physician knows what questions to ask and what the answers mean. Although we cannot say with certainty that our finding is driven by an underlying association between diagnostic knowledge and diagnostic errors, at a minimum, our finding suggests that patients treated by physicians who scored well on diagnostic exam questions may be at lower risk for the adverse outcomes we studied.
Finally, some might assert that a standardised exam without access to medical reference material might be more a reflection of a physician’s rote memory and ability to recall medical facts than a test of their clinical knowledge and judgement. Although this is a fundamental limitation of our study, it should be noted that the exam is designed to mimic decision making in real-life situations, including such things as patients’ laboratory results and reference material embedded in the exam. Past research also indicates that an ‘open book’ format that allows physicians access to reference material did not materially impact exam performance.36 It should also be noted that the necessary rapidity of decision-making by primary care physicians who have limited time per encounter might fairly be represented by an exam with time constraints.

In this exploratory analysis, we found evidence that diagnostic knowledge of primary care physicians seeing a patient for an index visit for a problem that is at heightened risk of diagnostic error is associated with adverse outcomes. Although the existence of a link between general diagnostic knowledge and diagnostic error may not be surprising, the magnitude of the associations we found suggests that interventions ignoring the role of physician knowledge may be inadequate to address the crisis of diagnostic error. Interventions targeted at improving diagnostic knowledge could include such things as a greater focus on diagnostic training during graduate medical education (ie, medical school, residency and fellowship). Knowledge-focused interventions could also include incentivising broad-based learning as well as targeted learning pursued through continuing medical education activities.30 During visits identified as being at risk for diagnostic errors, physicians could be given related information at the point of care, including suggestions for specialty consultation.

Our results are important for two additional reasons. First, these results provide evidence that board certification and maintenance of certification, which involves lifelong learning directed at maintaining medical knowledge, might, in fact, be a valid approach to assuring the delivery of high-quality care. Many in the USA report problems with the time and expense of MOC and often point to the lack of rigorous assessment of the relationship between aspects of MOC and outcomes of interest to patients. These findings suggest that processes such as MOC may translate into meaningful improvements in outcomes because they can provide incentives for meaningful learning. This learning also could be enhanced through exam feedback targeted at diagnostic knowledge. Second, the findings also suggest that interventions aimed at improving diagnostic skills, whether knowledge-based or through, for instance, delivery of relevant information at the point of care (ie, in response to system changes), might be worthwhile if the findings of this study are validated with additional research. Yet more research is needed to better understand the link between diagnostic knowledge and diagnostic errors that are identified through chart review or other methods of direct ascertainment and the extent to which such errors result in adverse clinical outcomes.

In conclusion, gaps in diagnostic knowledge among first contact primary care physicians are associated with serious diagnostic error sensitive outcomes. If this finding is confirmed in future studies, diagnostic knowledge should be a target for interventions to reduce diagnostic errors.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Contributors All authors substantially contributed to the conception and design of the work. BG, JLV and RSL contributed to the acquisition of the data. All authors substantially contributed to analysis or interpretation of data. All authors substantially contributed to drafting the work and revising it critically for important intellectual content. All authors gave final approval of the version published. All agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work were appropriately investigated and resolved.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests BG, JV and RL are paid employees of the American Board of Internal Medicine. BL is a paid consultant for the American Board of Internal Medicine.

  • Patient consent for publication Not required.

  • Ethics approval Advarra Institutional Review Board approved our study protocol (continuing review number CR00144650; protocol number Pro00026550).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data may be obtained from a third party and are not publicly available. Administrative data describing physician characteristics and exam performance can be obtained from the ABIM through a data sharing agreement that assures physician confidentiality and its use for legitimate research purposes. Access to deidentified Medicare claims data for this study was obtained through a special data use agreement with the Centers for Medicare & Medicaid Services, a process available to researchers in the US.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.