Is it possible to automatically assess pretreatment digital rectal examination documentation using natural language processing? A single-centre retrospective study
  1. Selen Bozkurt (1,2)
  2. Kathleen M Kan (3)
  3. Michelle K Ferrari (4)
  4. Daniel L Rubin (1,5)
  5. Douglas W Blayney (6)
  6. Tina Hernandez-Boussard (1,2,7)
  7. James D Brooks (4)
  1. Biomedical Data Science, Stanford University, Stanford, CA, USA
  2. Medicine (Biomedical Informatics), Stanford University, Stanford, CA, USA
  3. Urology, Stanford Lucile Salter Packard Children’s Hospital, Stanford, CA, USA
  4. Urology, Stanford University, Stanford, CA, USA
  5. Radiology, Stanford University, Stanford, CA, USA
  6. Medicine (Oncology), Stanford University, Stanford, CA, USA
  7. Surgery, Stanford University, Stanford, CA, USA
  1. Correspondence to Dr James D Brooks; jdbrooks{at}


Objectives To develop and test a method for automatic assessment of a quality metric, provider-documented pretreatment digital rectal examination (DRE), using the outputs of a natural language processing (NLP) framework.

Setting An electronic health records (EHR)-based prostate cancer data warehouse was used to identify patients and associated clinical notes from 1 January 2005 to 31 December 2017. Using a previously developed NLP pipeline, we classified DRE assessment as documented (currently or historically performed), deferred (or suggested as a future examination) or refused.
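The three-way labelling described above can be illustrated with a minimal keyword-based sketch. This is not the previously developed pipeline used in the study, whose internals are not given here; the regular expressions, the label order and the function name `classify_dre` are all assumptions made for illustration.

```python
import re

# Hypothetical keyword patterns (assumptions, not the study's actual rules).
# Checked in order: a refusal or deferral cue overrides the default label.
CUE_PATTERNS = [
    ("refused", re.compile(r"\b(refus\w*|declin\w*)\b", re.I)),
    ("deferred", re.compile(r"\b(defer\w*|at next visit)\b", re.I)),
]

# Hypothetical pattern for spotting a DRE mention in a sentence.
DRE_MENTION = re.compile(r"\b(DRE|digital rectal exam\w*|rectal exam\w*)\b", re.I)


def classify_dre(sentence):
    """Return 'documented', 'deferred', 'refused', or None if no DRE is mentioned."""
    if not DRE_MENTION.search(sentence):
        return None
    for label, pattern in CUE_PATTERNS:
        if pattern.search(sentence):
            return label
    # A DRE mention with no refusal or deferral cue is treated as documented.
    return "documented"


examples = [
    "DRE: prostate smooth, no nodules",
    "Patient refused DRE today",
    "DRE deferred to urology",
    "PSA 4.2 ng/mL",
]
labels = [classify_dre(s) for s in examples]
print(labels)
```

A real clinical pipeline would also need negation handling, section detection and abbreviation disambiguation, none of which this sketch attempts.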

Primary and secondary outcome measures We assessed performance on the quality metric, defined as DRE documentation within 6 months before treatment, and identified patient and clinical factors associated with metric performance.
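As a rough sketch of how such a metric could be evaluated per patient, the check below tests whether any documented DRE date falls in the window before treatment start. Approximating 6 months as 183 days, counting a DRE on the treatment date itself, and the name `meets_metric` are assumptions for illustration, not details taken from the study.

```python
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 183 days.
WINDOW = timedelta(days=183)


def meets_metric(dre_dates, treatment_start):
    """True if any documented DRE falls within WINDOW on or before treatment start."""
    return any(
        timedelta(0) <= treatment_start - dre_date <= WINDOW
        for dre_date in dre_dates
    )


# Hypothetical patient: one old DRE and one recent DRE before treatment.
print(meets_metric([date(2015, 1, 10), date(2016, 1, 10)], date(2016, 5, 1)))
```

DREs documented after treatment start yield a negative difference and are therefore excluded by the lower bound.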

Results The cohort included 7215 patients with prostate cancer and 426 227 unique clinical notes associated with pretreatment encounters. A DRE was documented for 5958 (82.6%) patients, and 1257 (17.4%) patients had no DRE documented in the EHR. For 3742 (51.9%) patients, a DRE was documented within 6 months prior to treatment, meeting the quality metric. Patients with private insurance had a higher rate of DRE documentation within 6 months prior to starting treatment than those with Medicaid-based or Medicare-based payors (77.3% vs 69.5%, p=0.001). Patients undergoing chemotherapy, radiation therapy or surgery as the first line of treatment were more likely to have a documented DRE within 6 months prior to treatment.
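The cohort-level percentages can be reproduced from the reported counts with simple arithmetic. The sketch below uses only the numbers stated in the Results; the variable and function names are illustrative.

```python
# Counts as reported in the Results paragraph.
COHORT = 7215
counts = {
    "documented": 5958,
    "not_documented": 1257,
    "within_6_months": 3742,
}


def pct(n, total=COHORT):
    """Percentage of the cohort, rounded to one decimal place."""
    return round(100 * n / total, 1)


percentages = {name: pct(n) for name, n in counts.items()}
print(percentages)
```

The documented and not-documented counts sum to the full cohort of 7215, so the first two percentages are complementary.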

Conclusion EHRs contain valuable unstructured information, and with NLP it is feasible to accurately and efficiently assess quality metrics within the current clinician documentation workflow.

  • digital rectal examination
  • electronic health records
  • quality metrics
  • prostate cancer
  • natural language processing

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Contributors JDB and THB conceived and directed the project. JDB, DWB, DLR and KMK chose the variables of interest and decided on the inclusion and exclusion criteria for participation in the study. THB and SB collected the data. MKF and KMK performed the manual chart reviews that served as a validation set. SB developed the NLP pipeline. SB and THB analysed and evaluated the data, had full access to all of the data in the study and take responsibility for both the integrity of the data and the accuracy of the data analysis. SB and KMK wrote the paper, and all authors reviewed and approved the manuscript.

  • Funding Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA183962. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

  • Competing interests None declared.

  • Ethics approval The Stanford University Institutional Review Board (IRB) approved human subjects’ involvement for this research project; the requirement for written informed consent was waived by the IRB.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The authors agree to share extracted data stripped of all patient identifiers. Primary access to the electronic health record from which the data in this article were extracted will not be allowed, in accordance with US federal statutes.

  • Patient consent for publication Not required.
