Quality of recording of diabetes in the UK: how does the GP's method of coding clinical data affect incidence estimates? Cross-sectional study using the CPRD database
  1. A Rosemary Tate (1),
  2. Sheena Dungey (1, 2),
  3. Simon Glew (3),
  4. Natalia Beloff (1),
  5. Rachael Williams (2),
  6. Tim Williams (2)

  1. Department of Informatics, University of Sussex, Brighton, UK
  2. CPRD, MHRA, London, UK
  3. Division of Primary Care and Public Health, Brighton and Sussex Medical School, Brighton, UK

Correspondence to Dr A Rosemary Tate; rosemary{at}


Objective To assess the effect of coding quality on estimates of the incidence of diabetes in the UK between 1995 and 2014.

Design A cross-sectional analysis examining diabetes coding from 1995 to 2014 and how the choice of codes (diagnosis codes vs codes which suggest diagnosis) and quality of coding affect estimated incidence.

Setting Routine primary care data from 684 practices contributing to the UK Clinical Practice Research Datalink (data contributed from Vision (INPS) practices).

Main outcome measure Incidence rates of diabetes and how they are affected by (1) GP coding and (2) excluding ‘poor’ quality practices, defined as those with at least 10% of incident patients inaccurately coded between 2004 and 2014.

Results Incidence rates and accuracy of coding varied widely between practices, and the trends differed according to the category of code selected. If only diagnosis codes were used, the incidence of type 2 diabetes increased sharply until 2004 (when the UK Quality and Outcomes Framework was introduced), then flattened off until 2009, after which it decreased. If non-diagnosis codes were included, the numbers continued to increase until 2012. Although coding quality improved over time, 15% of the 666 practices that contributed data between 2004 and 2014 were labelled ‘poor’ quality. When these practices were dropped from the analyses, the downward trend in the incidence of type 2 diabetes after 2009 became less marked and incidence rates were higher.

Conclusions In contrast to some previous reports, diabetes incidence (based on diagnostic codes) appears not to have increased since 2004 in the UK. Choice of codes can make a significant difference to incidence estimates, as can quality of recording. Codes and data quality should be checked when assessing incidence rates using GP data.
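The practice-level exclusion rule described under Main outcome measure can be illustrated with a minimal sketch. It assumes each practice has already been summarised as a count of incident patients, a count of those judged inaccurately coded, and the person-years at risk; the abstract does not give these details. The class, the function names, the per-1000-person-years scaling and the example figures are all hypothetical and are not taken from the study. Only the 10% threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold taken from the abstract: a practice is 'poor' quality if at
# least 10% of its incident patients are inaccurately coded.
POOR_THRESHOLD = 0.10


@dataclass
class Practice:
    # All fields are hypothetical summary inputs, not CPRD variable names.
    practice_id: str
    incident_patients: int    # new diabetes cases in the study window
    inaccurately_coded: int   # incident cases judged inaccurately coded
    person_years: float       # person-time at risk in the study window


def is_poor_quality(p: Practice, threshold: float = POOR_THRESHOLD) -> bool:
    """Flag a practice whose share of inaccurately coded incident cases
    meets or exceeds the threshold."""
    if p.incident_patients == 0:
        # Assumption: a practice with no incident cases is not flagged.
        return False
    return p.inaccurately_coded / p.incident_patients >= threshold


def incidence_per_1000(practices: Iterable[Practice]) -> float:
    """Pooled incidence rate per 1000 person-years (assumed scaling)."""
    practices = list(practices)
    cases = sum(p.incident_patients for p in practices)
    person_years = sum(p.person_years for p in practices)
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return 1000.0 * cases / person_years


def exclude_poor(practices: Iterable[Practice]) -> List[Practice]:
    """Keep only the practices that are not flagged as 'poor' quality."""
    return [p for p in practices if not is_poor_quality(p)]


# Invented figures, for illustration only.
example = [
    Practice("A", incident_patients=40, inaccurately_coded=2, person_years=10000.0),
    Practice("B", incident_patients=20, inaccurately_coded=5, person_years=10000.0),
    Practice("C", incident_patients=50, inaccurately_coded=4, person_years=10000.0),
]
rate_all = incidence_per_1000(example)
rate_good = incidence_per_1000(exclude_poor(example))
```

Comparing `rate_all` with `rate_good` shows how sensitive a pooled estimate can be to the exclusion rule, which is the kind of check the conclusions recommend when using GP data.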

  • Data quality
  • Misclassification

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


  • Contributors ART conceived and designed the research project with input from TW, RW and NB. ART and SD carried out the analysis. SG provided clinical input and advised on the codelists and categorisation of codes. ART and RW provided statistical expertise. ART, SG and SD drafted parts of the article, and all authors revised it critically for important intellectual content.

  • Funding The authors acknowledge the financial support of the Technology Strategy Board ‘Harnessing large and diverse sources of data’ (project number 100926). ART and NB received financial support, in the form of a fellowship, from the Medicines and Healthcare products Regulatory Agency.

  • Competing interests None declared.

  • Ethics approval We used a fully anonymised data set from the General Practice Research Database. Participant consent was not needed and was not obtained, because the data were fully anonymised and no participant's identity details were revealed. The study was approved by the Independent Scientific Advisory Committee (ISAC) of the Medicines and Healthcare products Regulatory Agency (MHRA) (protocol 15_010R entitled ‘Diabetes incidence in the UK from 1995 to 2014: how does quality of GP recording affect the estimates?’).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Owing to ethical restrictions, data are available from the Clinical Practice Research Datalink (the CPRD knowledge centre) or from the coauthors who are affiliated with the CPRD. Anyone who would like to use CPRD data will first need to submit an application to the Independent Scientific Advisory Committee (ISAC) of the Medicines and Healthcare products Regulatory Agency (MHRA).
