Original research
Validation of an algorithm to evaluate the appropriateness of outpatient antibiotic prescribing using big data of Chinese diagnosis text
  1. Houyu Zhao (1),
  2. Jiaming Bian (2),
  3. Li Wei (3),
  4. Liuyi Li (4),
  5. Yingqiu Ying (5),
  6. Zeyu Zhang (6),
  7. Xiaoying Yao (1),
  8. Lin Zhuo (7),
  9. Bin Cao (6),
  10. Mei Zhang (2),
  11. Siyan Zhan (1)

  1. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  2. Department of Pharmacology, Chinese PLA General Hospital, Beijing, China
  3. School of Pharmacy, University College London, London, UK
  4. Department of Healthcare-associated Infection Management and Disease Prevention and Control, Peking University First Hospital, Beijing, China
  5. Department of Pharmacy, Peking University Third Hospital, Beijing, China
  6. Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital, Beijing, China
  7. Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China

Correspondence to Dr Siyan Zhan; siyan-zhan{at}


Objective We aimed to evaluate the validity of an algorithm to classify diagnoses according to the appropriateness of outpatient antibiotic use in the context of Chinese free text.

Setting and participants A random sample of 10 000 outpatient visits was selected between January and April 2018 from a national database for monitoring rational use of drugs, which included data from 194 secondary and tertiary hospitals in China.

Research design Using a tiered method and a regular expression (RE)-based algorithm, diagnoses for outpatient visits were classified as tier 1 if associated with at least one condition that ‘always’ justified antibiotic use; as tier 2 if associated with at least one condition that only ‘sometimes’ justified antibiotic use but no conditions that ‘always’ justified antibiotic use; or as tier 3 if associated only with conditions that ‘never’ justified antibiotic use.
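The tiered rule described above can be sketched in a few lines of Python. This is only an illustration of the precedence logic (tier 1 overrides tier 2, which overrides tier 3), not the study's algorithm: the pattern lists and the function name `classify_visit` are hypothetical, and the Chinese terms (bacterial pneumonia, urinary tract infection, acute pharyngitis, sinusitis) are assumed examples rather than entries from the authors' diagnosis lists.

```python
import re

# Hypothetical patterns for illustration only; the study's real expressions
# are not reproduced in this abstract.
TIER1_PATTERNS = ["细菌性肺炎", "尿路感染"]  # assumed 'always' examples
TIER2_PATTERNS = ["急性咽炎", "鼻窦炎"]      # assumed 'sometimes' examples

TIER1_RE = re.compile("|".join(TIER1_PATTERNS))
TIER2_RE = re.compile("|".join(TIER2_PATTERNS))


def classify_visit(diagnosis_text: str) -> int:
    """Return the tier (1, 2 or 3) for the free-text diagnoses of one visit."""
    # Tier 1: at least one 'always' condition appears anywhere in the text.
    if TIER1_RE.search(diagnosis_text):
        return 1
    # Tier 2: no 'always' condition, but at least one 'sometimes' condition.
    if TIER2_RE.search(diagnosis_text):
        return 2
    # Tier 3: nothing matched, so only 'never' conditions are assumed present.
    return 3


if __name__ == "__main__":
    for text in ["高血压;细菌性肺炎", "急性咽炎", "高血压"]:
        print(text, classify_visit(text))
```

Checking the tiers in descending order of priority means a visit with several diagnoses is always assigned the highest-priority tier that any one of them supports.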

Measures Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the classification algorithm were calculated, using classification by chart review as the reference standard.
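As a rough sketch of how such measures could be computed for a three-tier classifier, the Python below treats each tier as a one-vs-rest binary problem against the chart-review labels. The one-vs-rest framing, the function name `one_vs_rest_metrics` and the toy labels are assumptions made for illustration; they are not taken from the study.

```python
import math


def one_vs_rest_metrics(reference, predicted, tier):
    """Sensitivity, specificity, PPV and NPV for one tier versus all others."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if pred == tier and ref == tier:
            tp += 1  # algorithm and chart review both assign this tier
        elif pred == tier:
            fp += 1  # algorithm assigns this tier, chart review does not
        elif ref == tier:
            fn += 1  # chart review assigns this tier, algorithm does not
        else:
            tn += 1  # neither assigns this tier

    def ratio(numerator, denominator):
        # Undefined when the denominator is zero; reported as NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Made-up labels for six visits, not study data.
    chart_review = [1, 1, 2, 3, 3, 3]
    algorithm = [1, 2, 2, 3, 3, 3]
    print(one_vs_rest_metrics(chart_review, algorithm, tier=1))
```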

Results The sensitivities of the algorithm for classifying tier 1, tier 2 and tier 3 diagnoses were 98.2% (95% CI 96.4% to 99.3%), 98.4% (95% CI 97.6% to 99.1%) and 100.0% (95% CI 100.0% to 100.0%), respectively. The specificities were 100.0% (95% CI 100.0% to 100.0%), 100.0% (95% CI 99.9% to 100.0%) and 98.6% (95% CI 97.9% to 99.1%), respectively. The PPVs for classifying tier 1, tier 2 and tier 3 diagnoses were 100.0% (95% CI 99.1% to 100.0%), 99.7% (95% CI 99.2% to 99.9%) and 99.7% (95% CI 99.6% to 99.8%), respectively. The NPVs were 99.9% (95% CI 99.8% to 100.0%), 99.8% (95% CI 99.7% to 99.9%) and 100.0% (95% CI 99.8% to 100.0%), respectively.

Conclusions The RE-based classification algorithm in the context of Chinese free text had sufficiently high validity for further evaluating the appropriateness of outpatient antibiotic prescribing.

  • antibiotics
  • prescriptions
  • validation
  • electronic health records
  • drug utilisation

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

  • Contributors HZ and SZ developed the research question. HZ conducted the data analyses and drafted the manuscript. JB and MZ collected, cleaned and managed the data. SZ, JB and MZ are responsible for the integrity of the data. SZ, JB, LW and MZ reviewed and edited the manuscript. HZ, XY, YY and ZZ established the primary list of the standard tiers of diagnoses. JB, LL and BC reviewed the list of the standard tiers of diagnoses. HZ, XY and LZ constructed the regular expressions for mapping the diagnosis text. HZ, XY and MZ reviewed and evaluated the raw prescriptions for appropriateness of antibiotic prescribing. All authors read and approved the final manuscript.

  • Funding This study was supported by the National Natural Science Foundation of China (Grant numbers 81973146, 81473067 and 91646107).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Ethical Review Board of Peking University Health Science Centre (approval number: IRB00001052-18013-Exempt).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data of prescriptions are available upon reasonable request. All other data relevant to the study are included in the article or uploaded as supplementary information.
