Bayesian clinical reasoning: does intuitive estimation of likelihood ratios on an ordinal scale outperform estimation of sensitivities and specificities?

Juan Moreira; Zeno Bisoffi; Alberto Narváez; Jef Van den Ende

doi:10.1111/j.1365-2753.2008.01003.x

J Eval Clin Pract. 2008 Oct;14(5):934-40. doi: 10.1111/j.1365-2753.2008.01003.x.

Authors

Juan Moreira¹, Zeno Bisoffi, Alberto Narváez, Jef Van den Ende

Affiliation

¹ Centro de Epidemiología Comunitaria y Medicina Tropical (CECOMET), Esmeraldas, Ecuador. jmoreira@itg.be

PMID: 19018928

Abstract

Rationale: Bedside use of Bayes' theorem for estimating probabilities of diseases is cumbersome. An alternative approach based on five categories of powers of tests from 'useless' to 'very strong' has been proposed. The performance of clinicians using it was assessed.

Methods: Fifty clinicians attending a course in tropical medicine estimated powers of tests and post-test probabilities using the classical vs. the categorical Bayesian approach. The estimation of post-test probability was assessed for real and dummy diseases in order to avoid the bias of previous knowledge. Accuracy of answers was measured as the difference from reference values obtained from an expert system (Kabisa).
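For orientation, the classical calculation that participants were asked to perform can be sketched as follows. This is a minimal illustration of the standard formulas, not the authors' procedure or the Kabisa implementation; the function names and the example sensitivity, specificity and pre-test probability are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from a test's sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply Bayes' theorem in odds form: post-test odds = pre-test odds * LR."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical test characteristics, chosen only for illustration.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.80)

# Hypothetical pre-test probability of 10%, followed by a positive result.
probability_after_positive = post_test_probability(0.10, lr_pos)
print(round(lr_pos, 2), round(lr_neg, 3), round(probability_after_positive, 3))
```

Working in odds keeps the update to a single multiplication, which is why likelihood ratios are often preferred to sensitivities and specificities for bedside reasoning.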

Results: Clinicians estimated positive likelihood ratios (LRs) a median of -1.07 log10 lower than Kabisa [interquartile range (IQR): -1.47; -0.80] when derived classically and -0.17 log10 (IQR: -0.42; +0.04) when estimated categorically (P < 0.001). For negative LRs the median was +0.39 log10 higher (IQR: +0.71; +0.08) when derived classically and -0.18 log10 lower (IQR: +0.03; -0.36) when estimated categorically (P < 0.001). Twenty participants (40%) reported being unable to calculate post-test probabilities using sensitivities and specificities. Regardless of the approach, post-test probabilities were overestimated for both real and dummy diseases [respectively +1.23 log10 (IQR: +0.67; +2.08) and +2.03 log10 (IQR: +0.49; +2.42)] (P = 0.277), but the range was wider for the latter (P = 0.001).
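The accuracy measure used for the LRs above can be read as a difference on the log10 scale. The sketch below assumes that the difference is log10(clinician estimate) minus log10(reference value), which the abstract does not state explicitly; the function names and sample values are hypothetical, and the quartile method is an arbitrary choice.

```python
import math
import statistics


def log10_difference(estimated_lr, reference_lr):
    """Signed error on the log10 scale; negative means an underestimate.

    Assumption: error = log10(estimate) - log10(reference).
    """
    return math.log10(estimated_lr) - math.log10(reference_lr)


def median_and_iqr(values):
    """Return (median, (first quartile, third quartile)) of the values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return median, (q1, q3)


# Hypothetical clinician estimates of a positive LR whose reference value is 10.
reference = 10.0
estimates = [1.0, 2.0, 5.0, 10.0, 20.0]
errors = [log10_difference(estimate, reference) for estimate in estimates]
print(median_and_iqr(errors))
```

On this scale a value of -1 means the estimate was ten times smaller than the reference, so a median of -1.07 corresponds to underestimating the LR by roughly an order of magnitude.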

Conclusions: Participants were more accurate in estimating powers with a categorical approach than with sensitivities and specificities. Post-test probabilities were overestimated with both approaches. Knowledge of the disease did not influence the estimation of post-test probabilities. A categorical approach might be an interesting instructional tool, but the effect of training with this approach needs assessment.

Publication types

Evaluation Study

MeSH terms

Adult
Africa / epidemiology
Analysis of Variance
Appendicitis / diagnosis
Appendicitis / epidemiology
Bayes Theorem*
Belgium
Clinical Competence / standards*
Diagnostic Techniques and Procedures / standards*
Epidemiology / education
Expert Systems
Female
Humans
Intuition*
Likelihood Functions*
Male
Pregnancy
Pregnancy, Ectopic / diagnosis
Pregnancy, Ectopic / epidemiology
Probability
Pulmonary Embolism / diagnosis
Pulmonary Embolism / epidemiology
Sensitivity and Specificity*
Software
Statistics, Nonparametric
Surveys and Questionnaires
Tropical Medicine / education
Tropical Medicine / methods
Tuberculosis, Pulmonary / diagnosis
Tuberculosis, Pulmonary / epidemiology