Predicting asthma attacks in primary care: protocol for developing a machine learning-based prediction model
  1. Holly Tibble1,2,
  2. Athanasios Tsanas1,2,
  3. Elsie Horne1,2,
  4. Robert Horne2,3,
  5. Mehrdad Mizani1,2,
  6. Colin R Simpson2,4,
  7. Aziz Sheikh1,2
  1. 1Usher Institute of Population Health Sciences and Informatics, Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
  2. 2Asthma UK Centre for Applied Research, Edinburgh, UK
  3. 3University College London, London, UK
  4. 4School of Health, Victoria University of Wellington, Wellington, New Zealand
  1. Correspondence to Holly Tibble; holly.tibble{at}ed.ac.uk

Abstract

Introduction Asthma is a long-term condition with rapid onset worsening of symptoms (‘attacks’) which can be unpredictable and may prove fatal. Models predicting asthma attacks require high sensitivity to minimise mortality risk, and high specificity to avoid unnecessary prescribing of preventative medications that carry an associated risk of adverse events. We aim to create a risk score to predict asthma attacks in primary care using a statistical learning approach trained on routinely collected electronic health record data.
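The two performance criteria named above can be made concrete with a short sketch. It assumes binary labels (1 = attack, 0 = no attack) and a hypothetical helper name, `sensitivity_specificity`, which is not taken from the protocol. Sensitivity is the proportion of true attacks that the model flags; specificity is the proportion of non-attacks that it correctly leaves unflagged.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding for this sketch: 1 = asthma attack, 0 = no attack.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-attacks correctly left alone
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

A model tuned only for sensitivity will tend to raise more false alarms, which lowers specificity, so the two are usually reported together.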

Methods and analysis We will employ machine-learning classifiers (naïve Bayes, support vector machines, and random forests) to create an asthma attack risk prediction model, using the Asthma Learning Health System (ALHS) study patient registry comprising 500 000 individuals across 75 Scottish general practices, with linked longitudinal primary care prescribing records, primary care Read codes, accident and emergency records, hospital admissions and deaths. Models will be compared on a partition of the dataset reserved for validation, and the final model will be tested in both an unseen partition of the derivation dataset and an external dataset from the Seasonal Influenza Vaccination Effectiveness II (SIVE II) study.
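The partitioning described above (a training partition, a validation partition for comparing candidate models, and an unseen test partition) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 60/20/20 split, the seed, and the function names `partition_ids` and `pick_best` are all assumptions made for the example. Splitting on patient identifiers rather than individual records is also an assumption, chosen so that no patient appears in more than one partition.

```python
import random


def partition_ids(patient_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle patient IDs and split them into train / validation / test lists.

    The fractions are illustrative assumptions, not values from the protocol.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = round(len(ids) * fractions[0])
    n_val = round(len(ids) * fractions[1])
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder is held out as the unseen partition
    return train, val, test


def pick_best(validation_scores):
    """Return the name of the candidate model with the highest validation score."""
    return max(validation_scores, key=validation_scores.get)
```

In this arrangement the test partition is touched only once, after `pick_best` has chosen among the candidates using validation scores, so the final performance estimate is not biased by the model selection step.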

Ethics and dissemination Permissions for the ALHS project were obtained from the South East Scotland Research Ethics Committee 02 [16/SS/0130] and the Public Benefit and Privacy Panel for Health and Social Care (1516–0489). Permissions for the SIVE II project were obtained from the Privacy Advisory Committee (National Services NHS Scotland) [68/14] and the National Research Ethics Committee West Midlands–Edgbaston [15/WM/0035]. The subsequent research paper will be submitted for publication to a peer-reviewed journal, and the code used for all data cleaning, compilation and analysis will be made openly available on GitHub (https://github.com/hollytibble).

  • asthma
  • primary care
  • asthma attacks
  • machine learning
  • prediction

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

Footnotes

  • Contributors HT and AT conceived and planned the analysis. HT and RH specified the medication adherence measures. HT, EH, CRS, MM, and AS constructed the covariate (and associated Read code) lists for the model. HT wrote the first draft, with contributions from all authors. All authors (HT, AT, EH, RH, MM, CRS, and AS) approved the final version and jointly take responsibility for the decision to submit this manuscript to be considered for publication.

  • Funding HT is supported by College of Medicine and Veterinary Medicine PhD (eHERC/Farr Institute) Studentships from The University of Edinburgh. EH is supported by a Medical Research Council PhD Studentship (eHERC/Farr). MAM’s Newton International Fellowship is awarded by the Academy of Medical Sciences and Newton Fund. This work is carried out with the support of the Asthma UK Centre for Applied Research [AUK-AC-2012-01] and Health Data Research UK, an initiative funded by UK Research and Innovation Councils, National Institute for Health Research (England) and the UK devolved administrations, and leading medical research charities. The ALHS dataset was created with funding from the Natural Environment Research Council [NE/P011012/1]. The SIVE II dataset was created with funding from the National Institute for Health Research (NIHR) Health Technology Assessment programme [13/34/14]. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Health Technology Assessment programme, NIHR, NHS, or the Department of Health.

  • Competing interests None declared.

  • Ethics approval Permissions for the ALHS project were obtained from the South East Scotland Research Ethics Committee 02 [16/SS/0130] and the Public Benefit and Privacy Panel for Health and Social Care (1516–0489). Permissions for the SIVE II project were obtained from the Privacy Advisory Committee (National Services NHS Scotland) [68/14] and the National Research Ethics Committee West Midlands–Edgbaston [15/WM/0035].

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Patient consent for publication Not required.
