Representativeness and optimal use of body mass index (BMI) in the UK Clinical Practice Research Datalink (CPRD)
  1. Krishnan Bhaskaran,
  2. Harriet J Forbes,
  3. Ian Douglas,
  4. David A Leon,
  5. Liam Smeeth
  Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
  Correspondence to Dr Krishnan Bhaskaran; krishnan.bhaskaran{at}lshtm.ac.uk

Abstract

Objectives To assess the completeness and representativeness of body mass index (BMI) data in the Clinical Practice Research Datalink (CPRD), and determine an optimal strategy for their use.

Design Descriptive study.

Setting Electronic healthcare records from primary care.

Participants A random sample of one million patients aged ≥16 years from the UK CPRD primary care database.

Primary and secondary outcome measures BMI completeness in CPRD was evaluated by age, sex and calendar period. CPRD-based summary BMI statistics for each calendar year (2003–2010) were age-standardised and sex-standardised and compared with equivalent statistics from the Health Survey for England (HSE).
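As an illustration of the age and sex standardisation mentioned above, the following is a minimal sketch of direct standardisation of a mean. It assumes, hypothetically, that each patient is assigned to an age-sex stratum and that a reference population supplies one weight per stratum; the abstract does not state which reference population or which standardisation method was used, so the function name, strata and weights here are illustrative only.

```python
from collections import defaultdict


def standardised_mean(records, reference_weights):
    """Directly standardised mean of a measurement.

    records: iterable of (stratum, value) pairs, e.g. (("F", "16-44"), 24.0).
    reference_weights: dict mapping stratum -> weight in the reference
        population (hypothetical values; they need not sum to 1).
    Strata with no observations are skipped and the remaining weights
    are renormalised.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in records:
        sums[stratum] += value
        counts[stratum] += 1

    weighted_total = 0.0
    weight_used = 0.0
    for stratum, weight in reference_weights.items():
        n = counts.get(stratum, 0)
        if n == 0:
            continue  # no data in this stratum
        weighted_total += weight * (sums[stratum] / n)
        weight_used += weight

    if weight_used == 0:
        raise ValueError("no stratum has both data and a reference weight")
    return weighted_total / weight_used


# Illustrative data only (not taken from CPRD or HSE).
example_records = [
    (("F", "16-44"), 24.0),
    (("F", "16-44"), 26.0),
    (("M", "45+"), 28.0),
]
example_weights = {("F", "16-44"): 0.6, ("M", "45+"): 0.4}
example_result = standardised_mean(example_records, example_weights)
```

Applying the same reference weights to both data sources is what makes the resulting means comparable despite different age and sex distributions.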

Results BMI completeness increased over calendar time from 37% in 1990–1994 to 77% in 2005–2011, was higher among females and increased with age. When BMI at specific time points was assigned based on the most recent record, calendar-year-specific mean BMI statistics underestimated equivalent HSE statistics by 0.75–1.1 kg/m². Restriction to those with a recent (≤3 years) BMI resulted in mean BMI estimates closer to HSE (≤0.28 kg/m² underestimation), but excluded up to 47% of patients. An alternative strategy of imputing up-to-date BMI based on modelled changes in BMI over time since the last available record also led to mean BMI estimates that were close to HSE (≤0.37 kg/m² underestimation).
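The first two strategies described above (assigning the most recent BMI record, optionally restricted to records from the previous 3 years) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `latest_bmi`, the `(date, bmi)` tuple layout and the use of 365.25 days per year are assumptions made for the example.

```python
from datetime import date


def latest_bmi(measurements, index_date, max_age_years=None):
    """Return the most recent BMI recorded on or before index_date.

    measurements: list of (date, bmi) tuples for one patient.
    max_age_years: if given, return None when the most recent record is
        older than this many years (the "recent BMI" restriction).
    Returns None when no eligible record exists.
    """
    eligible = [(d, bmi) for d, bmi in measurements if d <= index_date]
    if not eligible:
        return None
    recorded_on, bmi = max(eligible, key=lambda m: m[0])
    if max_age_years is not None:
        age_days = (index_date - recorded_on).days
        if age_days > max_age_years * 365.25:
            return None  # record too old; patient is excluded
    return bmi


# Illustrative records only (not real patient data).
example_measurements = [
    (date(2001, 5, 1), 24.1),
    (date(2004, 3, 15), 25.3),
    (date(2009, 1, 1), 27.0),  # after the index date, so ignored
]
example_index = date(2008, 7, 1)
unrestricted = latest_bmi(example_measurements, example_index)
restricted = latest_bmi(example_measurements, example_index, max_age_years=3)
```

In this example the unrestricted strategy carries forward a record that is more than four years old, while the 3-year restriction drops the patient entirely, which mirrors the trade-off between stale values and excluded patients reported in the results.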

Conclusions Completeness of BMI in CPRD increased over time and varied by age and sex. At a given point in time, a large proportion of the most recent BMIs are unlikely to reflect current BMI; consequent BMI misclassification might be reduced by employing model-based imputation of current BMI.

  • Epidemiology
  • Primary Care
  • Statistics & Research Methods

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/

