Validation of asthma recording in the Clinical Practice Research Datalink (CPRD)
Francis Nissen (1), Daniel R Morales (2), Hana Mullerova (3), Liam Smeeth (1), Ian J Douglas (1), Jennifer K Quint (4)

  1. Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
  2. Division of Population Health Sciences, University of Dundee, Dundee, UK
  3. RWE & Epidemiology, GSK R&D, Uxbridge, UK
  4. National Heart and Lung Institute, Imperial College, London, UK

Correspondence to Dr Francis Nissen; francis.nissen{at}


Objectives The optimal method of identifying people with asthma from electronic health records in primary care is not known. The aim of this study is to determine the positive predictive value (PPV) of different algorithms using clinical codes and prescription data to identify people with asthma in the United Kingdom Clinical Practice Research Datalink (CPRD).

Methods 684 participants registered with a general practitioner (GP) practice contributing to CPRD between 1 December 2013 and 30 November 2015 were selected according to one of eight predefined potential asthma identification algorithms. A questionnaire was sent to the GPs to confirm asthma status and provide additional information to support an asthma diagnosis. Two study physicians independently reviewed and adjudicated the questionnaires and additional information to form a gold standard for asthma diagnosis. The PPV was calculated for each algorithm.
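As an illustration of the final step, a PPV and its confidence interval can be computed from the number of algorithm-selected patients and the number of those confirmed by the gold standard. The sketch below is a minimal example, not the authors' analysis code: it assumes a normal-approximation (Wald) interval, which the abstract does not specify, and the function name `ppv_with_ci` and the counts in the usage line are hypothetical.

```python
import math


def ppv_with_ci(confirmed, selected, z=1.96):
    """Return (ppv, lower, upper) for a positive predictive value.

    confirmed: patients selected by the algorithm AND confirmed as having
               asthma by the gold standard (true positives).
    selected:  all patients selected by the algorithm with a usable
               gold-standard result (true positives + false positives).
    z:         normal quantile; 1.96 gives an approximate 95% interval.

    Assumption: a Wald (normal-approximation) interval, clipped to [0, 1].
    """
    if selected <= 0:
        raise ValueError("selected must be a positive count")
    if not 0 <= confirmed <= selected:
        raise ValueError("confirmed must lie between 0 and selected")

    ppv = confirmed / selected
    half_width = z * math.sqrt(ppv * (1 - ppv) / selected)
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


# Hypothetical counts for illustration only (not taken from the study).
ppv, lower, upper = ppv_with_ci(confirmed=45, selected=50)
print(f"PPV {ppv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With small per-algorithm samples, an interval such as Wilson's may behave better near 0% or 100%; the Wald form is used here only because it is the simplest to read.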

Results 684 questionnaires were sent, of which 494 (72%) were returned and 475 (69%) were complete and analysed. All five algorithms that included a specific Read code indicating asthma, or a non-specific Read code accompanied by additional conditions, performed well. The PPV for asthma diagnosis using only a specific asthma code was 86.4% (95% CI 77.4% to 95.4%). Extra information on asthma medication prescription (PPV 83.3%), evidence of reversibility testing (PPV 86.0%) or a combination of all three selection criteria (PPV 86.4%) did not result in a higher PPV. The algorithm using non-specific asthma codes, information on reversibility testing and respiratory medication use scored highest (PPV 90.7%, 95% CI 82.8% to 98.7%), but identified a much smaller population. Algorithms based on asthma symptom codes had low PPVs (43.1% to 57.8%).
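The response figures reported above can be checked with simple arithmetic. The short sketch below uses only the counts given in the Results; the helper name `percent` is introduced here for illustration.

```python
# Counts as reported in the Results paragraph.
sent = 684       # questionnaires sent to GPs
returned = 494   # questionnaires returned
analysed = 475   # returned questionnaires that were complete and analysed


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


print(f"Returned: {percent(returned, sent)}% of those sent")
print(f"Analysed: {percent(analysed, sent)}% of those sent")
print(f"Returned but not analysed: {returned - analysed}")
```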

Conclusions People with asthma can be accurately identified from UK primary care records using specific Read codes. The inclusion of spirometry or asthma medications in the algorithm did not clearly improve accuracy.

Ethics and dissemination The protocol for this research was approved by the Independent Scientific Advisory Committee (ISAC) for MHRA Database Research (protocol number 15_257) and the approved protocol was made available to the journal and reviewers during peer review. Generic ethical approval for observational research using the CPRD with approval from ISAC has been granted by a Health Research Authority Research Ethics Committee (East Midlands - Derby, REC reference number 05/MRE04/87).

The results will be submitted for publication and will be disseminated through research conferences and peer-reviewed journals.

  • asthma
  • validation
  • electronic health records
  • positive predictive value
  • epidemiology

This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See:


  • Contributors JKQ, IJD, LS and HM were responsible for developing the research question and have advised on the data collection and search strategies. FN summarised and analysed the questionnaires and drafted the manuscript. JKQ and DM reviewed the questionnaires and constructed the gold standard for asthma validation. JKQ is responsible for study management and coordination. All authors have read, commented on and approved the final manuscript.

  • Funding This work was supported by GlaxoSmithKline (GSK), through a PhD scholarship for FN with grant number EPNCZF5310. The publishing of this study was supported by the Wellcome Trust: grant number 098504/Z/12/Z.

  • Competing interests FN is funded by a GSK scholarship during his PhD programme. IJD is funded by an unrestricted grant from, has consulted for and holds stock in GlaxoSmithKline. HM is an employee of GSK R&D and owns shares of GSK Plc. JKQ reports grants from MRC, BLF and Wellcome Trust, and has received research funds from GSK, AZ and Quintiles IMS, in addition to personal fees from AZ, Chiesi and BI.

  • Patient consent This is a study using electronic health records; individual-level permission was given at the time of data collection. It does not need to be repeated for each study. All information is anonymised by CPRD.

  • Ethics approval Ethics approval was obtained from ISAC (the Independent Scientific Advisory Committee overseeing CPRD), protocol 15_257.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Study data will be available on request to FN once the research team has completed preplanned analyses.
