Reliability and accuracy of delirium assessments among investigators at multiple international centres
  1. Hannah R Maybrier1,
  2. Angela M Mickle1,
  3. Krisztina E Escallier1,
  4. Nan Lin2,3,
  5. Eva M Schmitt4,
  6. Ravi T Upadhyayula1,
  7. Troy S Wildes1,
  8. George A Mashour5,
  9. Kerry Palihnich4,
  10. Sharon K Inouye4,6,
  11. Michael Simon Avidan1
  12. on behalf of the PODCAST Research Group
    1. 1 Department of Anesthesiology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
    2. 2 Department of Mathematics, Washington University in Saint Louis, St. Louis, Missouri, USA
    3. 3 Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri, USA
    4. 4 Aging Brain Center, Institute for Aging Research, Hebrew SeniorLife, Boston, Massachusetts, USA
    5. 5 Department of Anesthesiology, University of Michigan, Ann Arbor, Michigan, USA
    6. 6 Department of Medicine, Beth Israel Deaconess Medical Center, Hebrew SeniorLife, Harvard Medical School, Boston, Massachusetts, USA
    1. Correspondence to Dr Michael Simon Avidan; avidanm{at}


    Introduction Delirium is a common, serious postoperative complication. For clinical studies to generate valid findings, delirium assessments must be standardised and administered accurately by independent researchers. The Confusion Assessment Method (CAM) is a widely used delirium assessment tool. The objective was to determine whether implementing a standardised CAM training protocol for researchers at multiple international sites yields reliable inter-rater assessment and accurate delirium diagnosis.

    Methods Patients consented to video recordings of CAM delirium assessments for research purposes. Raters underwent structured training in CAM administration. Training entailed didactic education, role-playing with intensive feedback, apprenticeship with experienced researchers and group discussions of complex cases. Raters independently viewed and scored nine video-recorded CAM interviews. Inter-rater reliability was determined using Fleiss kappa. Accuracy was judged by comparing raters’ scores with those of an expert delirium researcher.
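
The inter-rater statistic named in the Methods, Fleiss kappa, can be sketched in a few lines of Python. This is a minimal illustration of the standard formula, not the authors' analysis code; the function name `fleiss_kappa` and the input layout (one list of labels per video, one label per rater) are assumptions made for the example.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of rater labels.

    Assumes every subject was scored by the same number of raters
    (hypothetical input layout chosen for this sketch).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = sorted({label for row in ratings for label in row})

    # Observed agreement: proportion of agreeing rater pairs per subject, averaged.
    per_subject = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement: sum of squared overall category proportions.
    total = n_subjects * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

In the study's setting, `ratings` would hold nine rows (one per video) of 27 labels each; the toy inputs used to check the function are invented and are not study data.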

    Results Twenty-seven raters from eight international research centres completed the study and achieved almost perfect agreement for overall delirium diagnosis, kappa=0.88 (95% CI 0.85 to 0.92). Agreement of the four core CAM features ranged from fair to substantial. The sensitivity and specificity for identifying delirium were 72% (95% CI 60% to 81%) and 99% (95% CI 96% to 100%), considering an expert rater’s scores as the reference standard (delirious, n=3; non-delirious, n=6). Delirium severity ratings were tightly clustered, with most scores within 5% of the median.
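
Sensitivity and specificity against a reference standard, as reported in the Results, reduce to simple counts over rater-video pairs. The sketch below is illustrative only: the helper names are hypothetical, and the Wilson score interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(rater_labels, reference_labels):
    """Compare rater calls with the expert's calls (True = delirious).

    Both inputs are equal-length sequences of booleans, one entry per
    rater-video pair; at least one reference case of each class is required.
    """
    if len(rater_labels) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for rated, truth in zip(rater_labels, reference_labels):
        if truth and rated:
            tp += 1
        elif truth:
            fn += 1
        elif rated:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }
```

With 27 raters, three delirious and six non-delirious videos, the sensitivity denominator would be 81 rater-video pairs and the specificity denominator 162; treating those pairs as independent is a simplification of this sketch.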

    Conclusion Our results demonstrate that, with appropriate training and ongoing scoring discussions, researchers at multiple sites can reliably detect delirium in postsurgical patients. These results support the premise that methodologically rigorous multi-centre studies can yield standardised and accurate determinations of delirium.

    • adult anaesthesia
    • surgery
    • delirium
    • confusion
    • anesthesia

    This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


    • SKI and MSA contributed equally.

    • Contributors HRM contributed by writing and editing the manuscript, managing the electronic database, coordinating delirium assessment training and conducting patient interviews. AMM contributed by editing the manuscript, managing the electronic database and editing patient interviews. KEE contributed by conceptualising study design. NL contributed by performing statistical analyses. EMS, KP and SKI contributed by advising delirium assessment training. EMS and SKI also contributed by editing the manuscript. KP also served as the expert rater. RTU contributed by editing the manuscript and conducting patient interviews. TSW and GAM contributed by conceptualising the study design and editing the manuscript. MSA contributed by conceptualising the study design, composing and editing the manuscript and overseeing delirium assessment training.

    • Funding This study was funded by the National Institutes of Health (NIDUS Grant: NIA R24AG054259, and grant T32GM103730) and the NIH/NCI Cancer Center Support Grant (P30 CA008748).

    • Competing interests None declared.

    • Patient consent Not required.

    • Ethics approval Washington University Institutional Review Board.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Data sharing statement The data set used for this study can be made available upon request.

    • Collaborators The PODCAST Research Group includes: Apakama GP, Aquino K, Arya VK, Avidan MS, Ben Abdallah A, Chen Y, Dicks R, Downey RJ, Emmert DA, Escallier K, Fardous HA, Fritz BA, Funk DJ, Galati J, Gipson KE, Girardi L, Graetz TJ, Grocott H, Gruber AT, Hicks M, Hudetz JA, Inouye SK, Ivascu NS, Jacobsohn E, Jayant A, Kashani HH, Kavosh MS, Kunkler BS, Lee YH, Lenze E, Mashour GA, Maybrier HR, McKinney AS, McKinnon SL, Mickle AM, Monterola M, Muench MR, Murphy MR, Noh GJ, Pagel PS, Pryor KO, Redko M, Richards T, Rogers EM, Schmitt E, Sivanesan L, Steinkamp ML, Tellor B, Thomas S, Torres BA, Upadhyayula R, Veselis RA, Vlisides PE, Waszynski C, Wildes TS, Veltri C, Yulico H.
