

Now you see me: a pragmatic cohort study comparing first and final radiological diagnoses in the emergency department
  1. Björn Mattsson1,
  2. David Ertman2,
  3. Aristomenis Konstantinos Exadaktylos2,
  4. Luca Martinolli3,
  5. Wolf E Hautz2
  1. 1 Departement des urgences, Hôpital neuchâtelois, Neuchâtel, Switzerland
  2. 2 Department of Emergency Medicine, Inselspital (University Hospital of Bern), Bern, Switzerland
  3. 3 Notfallstation, Privatklinik Linde, Biel, Switzerland
  1. Correspondence to Dr Wolf E Hautz; wolf.hautz{at}


Abstract

Objectives To (1) compare timely but preliminary and definitive but delayed radiological reports in a large urban level 1 trauma centre, (2) assess the clinical significance of their differences and (3) identify clinical predictors of such differences.

Design, setting and participants We performed a retrospective record review for all 2914 patients who presented to our university affiliated emergency department (ED) during a 6-week period. In those that underwent radiological imaging, we compared the patients’ discharge letter from the ED to the definitive radiological report. All identified discrepancies were assessed regarding their clinical significance by trained raters, independent and in duplicate. A binary logistic regression was performed to calculate the likelihood of discrepancies based on readily available clinical data.

Results 1522 patients had radiographic examinations performed. Rater agreement on the clinical significance of identified discrepancies was substantial (kappa=0.86). We found an overall discrepancy rate of 20.35%, of which about one-third (7.48% overall) were clinically relevant. A logistic regression identified patients’ age, the imaging modality and the anatomic region under investigation to be predictive of discrepancies.

Conclusions Discrepancies between first and final radiological diagnoses in the ED are frequent, and readily available clinical factors predict their likelihood. Emergency physicians should reconsider their discharge diagnosis, especially in older patients undergoing CT scans of more than one anatomic region.

  • diagnostic error
  • quality in health care

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Strengths and limitations of this study

  • Retrospective record review of a real-world patient sample.

  • Clinically valid comparison between immediate first and delayed final radiological diagnosis.

  • Single-centre study, situated in a large urban emergency room, where many diagnoses are first made.

  • Designed to identify readily available predictors of misdiagnosis such as age, imaging modality and anatomic region.

  • Unable to determine long-term consequences due to retrospective design.


Introduction

Annually, between 100 000 and 250 000 patients in the USA alone die from medical errors.1 2 Diagnostic errors are frequent and are the most consequential type of medical error,2–6 and misdiagnosis is thus one of the greatest concerns for patients in the emergency department (ED).7 It also has important economic and legal consequences.8 Errors in the assessment of radiographs are a potential source of such diagnostic errors. Especially in the ED, diagnostic errors might lead to iatrogenic harm to the patient.9

In most EDs, plain film radiographs (X-ray) are initially interpreted by the treating emergency physician (EP), and a definitive diagnosis by a radiologist is provided hours to days later. More complex examinations, including CT scans or MRI, are often interpreted immediately by a (junior) radiologist on duty and findings communicated to the EP,10 while a senior’s definitive approval of such reports might follow much later. Often, EPs additionally informally consult radiologists on duty on findings the EPs are uncertain about. Thus, two interpretations of radiographs typically exist in most EDs: an immediately available reading by EPs and potentially junior radiologists and a delayed but more reliable reading by senior radiologists. Treatment and discharge decisions in the ED are typically based on the former due to the time constraints in most EDs.

Previous studies have shown overall discrepancy rates in the interpretation of radiographic images between radiologists and EPs to range between 1.1%11 and 9.2%,12 although much higher discrepancy rates have been reported for specific types of examinations.13 However, missed radiological findings that would have resulted in an immediate change in the management of a patient have been reported to be exceedingly rare.14

Whereas differences in the interpretation of radiographic images between radiologists and EPs have been extensively researched, the discrepancies between preliminary results reported in the ED’s discharge letter and the definitive radiology report are less well examined. We thus aimed to compare the preliminary findings reported by the ED with the definitive radiological reports and to determine the clinical significance of any differences in order to estimate the resulting degree of consequential diagnostic errors. We further aimed to model the binary outcome discrepancy/no discrepancy based on clinical data readily available to the EP before discharge, to provide the EP with an a priori estimate of the probability of error.

Patients and methods

This study is a retrospective review of all radiological studies ordered between December 2012 and January 2013 in a large urban academic ED and level 1 trauma centre that saw approximately 38 000 patients in 2013. The ED is staffed by physicians certified in internal medicine, surgery, traumatology and emergency medicine.15 We retrieved records of all adult patients presenting with traumatic or non-traumatic injury, medical or neurological chief complaints during the study period. Of these patients, we included all those for whom radiological studies had been ordered. Patients consulting directly with specialist clinics (orthopaedics, neurosurgery, hand surgery, plastic surgery, nephrology and urology) for non-urgent reasons were excluded since the procedures for how and when radiological findings are reported to the requesting physicians differ strongly between requesting departments. All relevant data, including age, gender, time of day, diagnosis and clinical management as noted in the discharge documents, were retrieved from the ED patient management system (ECare ED V.; bvba, Turnhout, Belgium) and entered into a database (Microsoft Excel 14.0; Microsoft, Redmond, Washington, USA). Definitive radiological reports were retrieved from our digital radiological database (Sectra Workstation IDS 7; Sectra AB, Linköping, Sweden), and the imaging modality was categorised as either X-ray, CT, MRI, ultrasound (US) or scintigraphy (SCI). We further coded the body part examined as either head (including face and neck), chest, abdomen, skeletal system or other. The total number of imaging studies in each category was recorded.

We subsequently analysed the preliminary radiological reports as given in the ED discharge documents and compared them with the definitive radiological report, which was defined as the gold standard. Two independent reviewers analysed the data set and noted discrepancies between preliminary and definitive findings. Discrepancies were subsequently categorised by two independent EPs in duplicate as either ‘clinically significant’, that is, changing clinical management, or ‘clinically insignificant’. Rater agreement was calculated as Cohen’s kappa and disagreements were resolved by discussion. More than one abnormal finding may be present in any radiographic image, especially in patients with, for example, known comorbidities or after major trauma. Whenever a rater encountered a difference between the first and final radiological report, we counted this as a discrepancy. However, each image with a discrepancy was counted only once, regardless of the total number of discrepancies present on that image, leading to a conservative estimate of the total number of discrepancies. For each discrepancy identified per image, the clinical relevance was assessed separately but again counted only once if present.
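As an illustration of the agreement statistic, the following minimal sketch computes Cohen’s kappa for two raters from the observed and the chance-expected agreement. The ratings, the variable names and the function `cohens_kappa` are hypothetical; they are not study data and not the code used in the analysis.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of the raters' marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings of ten discrepancies (illustration only, not study data).
SIG, INSIG = "clinically significant", "clinically insignificant"
rater_1 = [SIG, SIG, SIG, INSIG, INSIG, INSIG, INSIG, INSIG, SIG, INSIG]
rater_2 = [SIG, SIG, INSIG, INSIG, INSIG, INSIG, INSIG, INSIG, SIG, INSIG]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters disagree on one of ten items, so observed agreement is 0.9, chance agreement is 0.54 and kappa is roughly 0.78.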

Statistical analysis was conducted in SPSS V.22 (IBM, Armonk, New York, USA) and included descriptive statistics (frequency, mean and SD) and a logistic regression model. With the regression, we aimed to predict the binary outcome discrepancy versus no discrepancy between the radiological studies based on the patient’s age, gender, the imaging modality, the anatomic location of the radiological study and the time of day. Metric predictor variables (age and time of day) were z-standardised prior to the analysis to ensure comparability within the model. We refined the model stepwise by removing all non-significant predictor variables and report Nagelkerke’s R2 as a measure of the model’s fit together with P values from Wald statistics and the respective regression coefficients. P values <0.05 were considered significant. We planned to assume data to be missing at random and thus to impute missing data by means of maximum likelihood estimation. We did not, however, encounter any missing data in the variables assessed in this study, which may result from the fact that we only retrieved very basic patient data such as gender or age.
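Two of the steps described above can be sketched in a few lines: the z-standardisation of a metric predictor and the calculation of Nagelkerke’s R2 from the log-likelihoods of the intercept-only and the fitted model. This is a minimal sketch under stated assumptions, not the SPSS procedure itself: the function names, the example ages and the use of the sample SD (n−1) for standardisation are assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def z_standardise(values):
    """Rescale a metric predictor to mean 0 and (sample) SD 1."""
    mu, sd = mean(values), stdev(values)  # assumption: sample SD (n-1 denominator)
    if sd == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(v - mu) / sd for v in values]


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's R2: the Cox-Snell R2 divided by its maximum attainable value.

    loglik_null  -- log-likelihood of the intercept-only model
    loglik_model -- log-likelihood of the fitted model
    n            -- number of observations
    """
    cox_snell = 1 - math.exp(2 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1 - math.exp(2 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical ages in years (illustration only, not study data).
hypothetical_ages = [20, 40, 60, 80]
ages_z = z_standardise(hypothetical_ages)
```

Because the Cox-Snell R2 cannot reach 1 for a binary outcome, dividing by its maximum rescales the measure to the range 0 to 1, which is why Nagelkerke’s variant is commonly reported for logistic models.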


Results

In the 6-week study period from 1 December 2012 to 15 January 2013, a total of 2914 patient visits were recorded in the ED. Of these, a total of 1522 patients, which corresponds to just over half (52.0%) of all patients, had at least one radiological study taken and were thus included in the study. On presentation, 608 of these patients had been triaged as surgical, 544 patients as medical and 360 patients as neurological emergencies. A majority of patients were men (n=868, 57.0%), and the median age was 53.74 years (minimum 16, maximum 98, SD 20.9). The majority of studies were ordered during daytime between 07:00 hours and 20:00 hours (n=1086) (table 1).

Table 1

Total number of patients, overall and clinically significant discrepancies

A total of 1875 radiological studies were performed, including 776 X-ray, 680 CT, 367 MRI, 49 US and 3 SCI. Due to their small number, SCIs were excluded from further analysis. The most common radiological studies ordered were CT of the head and neck (n=343), X-ray of the chest (n=319) and MRI of the head (n=329).

Rater agreement on whether or not discrepancies between discharge report and final radiological report were clinically significant was substantial (kappa=0.86). Overall, 381 discrepancies (20.35%) were found, of which 149 (7.48%) were judged to be clinically significant (table 2).

Table 2

Number of radiological studies, overall and clinically significant discrepancies classified according to type of radiological study
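The overall discrepancy rate can be reproduced from the counts given in the text. The sketch below assumes that the denominator is the number of studies remaining after exclusion of the three scintigraphies; the text does not state the denominator explicitly, so this is an assumption, and the variable names are illustrative.

```python
total_studies = 1875   # all radiological studies performed
excluded_sci = 3       # scintigraphies excluded from further analysis
discrepancies = 381    # studies with at least one discrepancy

# Assumed denominator: studies remaining after excluding scintigraphies.
analysed = total_studies - excluded_sci
overall_rate = 100 * discrepancies / analysed

# Share of all discrepancies rated clinically significant,
# taken as the ratio of the two reported percentages.
significant_share = 7.48 / 20.35

print(f"{analysed} studies analysed, overall discrepancy rate {overall_rate:.2f}%")
print(f"share rated clinically significant: {significant_share:.1%}")
```

With the scintigraphies included in the denominator, the rate would be slightly lower (381 of 1875 studies), which is why the assumed denominator matters for the second decimal place.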

An example of a discrepancy judged as not clinically relevant is the CT scan of the head of patient 27, who was found unconscious. The radiographic report of the ED documents ‘no pathologies in contrast enhanced CT scan of the head’, while the final report points to ‘no explanation for acute unconsciousness identifiable, signs of chronic sinusitis’. The patient was found to be intoxicated with mixed substances. A relevant discrepancy, for example, was identified in patient 51, who presented with an acute abdomen due to a perforated sigmoid diverticulitis. While the ED’s report mentions this diagnosis from the CT of the abdomen, it fails to mention the infiltrate in the lower sections of the left lung, which the final report identified.

Whether a discrepancy between the initial radiological assessment and the definitive report by the department of radiology occurred was predicted by several clinical variables. Logistic regression identified patients’ age, modality of imaging and anatomic region of the radiological study to be significant predictors (all P values <0.05), while time of day and patient gender had no significant predictive value. The model fits the data fairly well (Nagelkerke’s R2=0.112) and correctly predicts the outcome in 77.8% of the cases. Details of the model are given in table 3.

Table 3

Results of the refined logistic regression model to predict a discrepancy between emergency department’s discharge report and definitive radiological report based on clinical characteristics

Figures 1 and 2 show the change in probability of a discrepancy predicted by the regression model based on age, imaging modality and anatomic region of the study.

Figure 1

Probability of a discrepancy between first and final radiological diagnosis depending on body part over patient’s age.

Figure 2

Probability of a discrepancy between first and final radiological diagnosis depending on image modality over patient’s age.
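Curves such as those in figures 1 and 2 are obtained by passing the linear predictor of a logistic regression through the logistic function. The following sketch shows the calculation for a single patient. All coefficients below are invented for illustration and do not correspond to the fitted values in table 3; the function name and the coding of the categories are likewise assumptions.

```python
import math

# Hypothetical coefficients (illustration only, NOT the fitted model from table 3).
INTERCEPT = -2.0
COEF_AGE_Z = 0.3  # per SD of z-standardised age
COEF_MODALITY = {"X-ray": 0.0, "CT": 0.8, "MRI": 0.2, "US": 0.1}
COEF_REGION = {
    "head": 0.0,
    "chest": 0.5,
    "abdomen": 0.7,
    "skeletal system": 0.1,
    "other": 0.2,
}


def discrepancy_probability(age_z, modality, region):
    """Predicted probability of a discrepancy from a logistic model."""
    linear_predictor = (
        INTERCEPT
        + COEF_AGE_Z * age_z
        + COEF_MODALITY[modality]
        + COEF_REGION[region]
    )
    # Logistic function maps the linear predictor to a probability in (0, 1).
    return 1 / (1 + math.exp(-linear_predictor))


# Example: a patient one SD older than average undergoing an abdominal CT.
p = discrepancy_probability(1.0, "CT", "abdomen")
print(f"predicted probability of a discrepancy: {p:.3f}")
```

Evaluating the function over a range of ages for each modality or body region would yield one curve per category, which is the form of presentation used in the two figures.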


Discussion

Radiological images are an important part of medical diagnosis. In many EDs, patients’ radiographs are initially assessed by ED physicians as well as junior radiologists and treatment is determined based on their joint interpretation. A more definitive interpretation by senior radiologists is typically only available with a considerable delay.

Comparing the interpretation of radiographs in the discharge letter of ED patients to the final report from radiology, we found an overall discrepancy rate of 20.35%. Slightly more than one-third of these (7.48% overall) were deemed clinically relevant by two independent expert raters. The estimates of error from our study are well within the range of previous publications,11 12 14 16–18 which however mainly compared EPs’ reading of radiographs to senior radiologists. Such a comparison is only directly applicable to very small EDs as larger centres such as the one under investigation in our study typically have at least a junior radiologist on duty around the clock. Thus, the study reported here extends previous findings to the clinical reality in tertiary centres. A previous review of diagnostic error in medicine in general found the rate of critical discrepancies between a first and a second reading of images in visual specialties such as radiology, dermatology or pathology to range between 2% and 5%,19 just below the rate of discrepancies the raters deemed clinically relevant in our study.

Using a logistic regression to model the likelihood of a discrepancy between first and final radiological diagnosis, we found several readily available clinical factors to be predictive of an error. These factors are patient age, imaging modality and region of the body under investigation. The factors are both plausible from a clinical perspective and in line with the sparse previous findings on the issue. Age has been previously found to be associated with diagnostic error20 and adverse events in the ED,21 likely because radiographs become harder to interpret in the presence of age-related or chronic findings.

We further found imaging modality and region of the body under investigation to be predictive of a discrepancy. From a clinical as well as a mathematical perspective, it is plausible that both more than one modality and more than one body region under investigation increase the likelihood of a discrepancy. Furthermore, two well-known cognitive sources of error are premature closure, that is, the failure to consider alternative diagnoses,22 and satisfaction of search, that is, the termination of a diagnostic search after successful identification of one pathological finding.23 Both phenomena are less likely to occur with increasing expertise on a subject.24 Consequently, some authors have argued that the interpretation of any medical image should be exclusively left to experienced radiologists,25 while others argue that non-radiologists should simply be better trained,26 especially given the increasing availability of radiographic imaging.

One finding that is counterintuitive at first sight is the rather low discrepancy rate in MRIs of the head as well as in patients triaged as neurological emergencies. We assume these findings to be related because most MRIs of the head are ordered in patients with neurological chief complaints. One reason why the discrepancy rate in these patients is rather low may be the fact that neurologists are highly trained in interpreting cerebral MRI.27 Furthermore, the variety of possible interpretations is lower in cerebral MRIs than in a patient population with highly diverse body regions under investigation, commonly triaged as medical or surgical chief complaints. Also, the likelihood of a coincidental finding in an MRI of the head that is not related to the ED presentation, and thus not actively searched for, is likely smaller than in, for example, a CT scan of the abdomen, where there simply is more to see and therefore a higher probability of an abnormality.

Our study has several limitations. First, this is a retrospective study susceptible to both documentation bias and hindsight bias.28 Prospective studies of diagnostic error are imperative and currently ongoing.29 Second, our study design does not allow us to discern whether the discrepancies identified between the final radiological report and the findings documented in the ED discharge report are due to misinterpretations by the junior radiologist, the discharging EPs or failed communication between the radiologist and the EP. However, regardless of where the error originates, it is the differences pragmatically assessed in this study that arguably matter most to the patient. Future studies focusing on collaboration in healthcare are needed30 because failed teamwork has been repeatedly identified as an important source of diagnostic error.6 Third, due to the retrospective nature of this study, we are unable to determine if and how the identified discrepancies were acted on. Future prospective investigations should include a follow-up on diagnostic discrepancies. Fourth, the study is a single-centre cohort study. Results may vary between centres and levels of care.

Last, one obvious question is why the estimates of error with and without consequence vary by an order of magnitude from author to author. We would offer two potential explanations. First, the definition of what constitutes a diagnostic error in general31 and a clinically significant difference in radiological diagnosis specifically is highly variable between publications, potentially resulting in different estimates. Second, due to time constraints, EPs may tend to only report findings they deem significant, which may explain the comparatively large number of insignificant differences found in our study.

In conclusion, we found a comparatively large number of discrepancies between the radiological findings in patients’ discharge documentation and the final radiological report, and identified age, imaging modality and body parts under investigation to be predictive of such discrepancies. All three predictors are readily available in clinical practice and should prompt EPs to reconsider their discharge diagnosis, especially in older patients undergoing CT scans of more than one anatomic region.


Acknowledgments

The authors would like to thank Sabina Utiger for her support in data management.




  • Contributors Study design: BM, DE, AKE, LM, WEH. Data collection: BM, DE, LM. Expert raters: LM, AKE. Statistical analysis: WEH. Interpretation of results: BM, DE, AKE, LM, WEH. Drafting the manuscript: BM, DE, WEH. Substantial contribution to the manuscript for important intellectual content: BM, DE, AKE, LM, WEH.

  • Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests WEH received financial compensation for educational consultancies from the AO Foundation, Zurich, and a speaker’s honorarium from Mundipharma Medical, Basel, Switzerland.

  • Patient consent Detail has been removed from this case description/these case descriptions to ensure anonymity. The editors and reviewers have seen the detailed information available and are satisfied that the information backs up the case the authors are making.

  • Ethics approval Kantonale Ethikkommission Bern.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Original data are available from the corresponding author on request, provided the requesting person or party fulfils the data sharing requirements laid out by Kantonale Ethikkommission Bern.
