Childhood respiratory illness presentation and service utilisation in primary care: a six-year cohort study in Wellington, New Zealand, using natural language processing (NLP) software
  1. Anthony Dowell¹,
  2. Ben Darlow¹,
  3. Jayden Macrae²,
  4. Maria Stubbe¹,
  5. Nikki Turner³,
  6. Lynn McBain¹
  1. Department of Primary Health Care and General Practice, University of Otago, Wellington, New Zealand
  2. Datacraft Analytics, Wellington, New Zealand
  3. Department of General Practice and Primary Health Care, University of Auckland, Wellington, New Zealand
Correspondence to Professor Anthony Dowell; tony.dowell{at}


Objectives To identify childhood respiratory tract-related illness presentation rates and service utilisation in primary care by interrogating free text and coded data from electronic medical records.

Design Retrospective cohort study. Data interrogation used a natural language processing software inference algorithm.

Setting 36 primary care practices in New Zealand. Data analysed from January 2008 to December 2013.

Participants The records of 77 582 enrolled children were reviewed over a 6-year period to estimate the presentation of childhood respiratory illness and service utilisation. This cohort represents 268 919 person-years of data and over 650 000 unique consultations.
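The cohort figures above imply an average follow-up per child and an overall consultation rate. The short Python sketch below works these out from the quoted numbers only. The variable names are illustrative, and because the abstract reports "over 650 000" consultations, that value is treated as a floor, so the resulting rate is a lower bound rather than a reported result.

```python
# Figures quoted in the Participants paragraph of the abstract.
children = 77_582
person_years = 268_919
# "over 650 000" consultations: used here as a floor (assumption), not an exact count.
consultations_floor = 650_000

# Average time each child contributed to the cohort.
mean_follow_up_years = person_years / children

# Consultations (all causes) per child per year of follow-up, lower bound.
consultations_per_person_year = consultations_floor / person_years

print(f"Mean follow-up per child: {mean_follow_up_years:.2f} years")
print(f"Consultations per person-year: at least {consultations_per_person_year:.2f}")
```

On these figures, each child contributed roughly three and a half years of follow-up and attended a little over two consultations per year for any reason.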

Main outcome measure Childhood respiratory illness presentation rate to primary care practice, with description of seasonal and yearly variation.

Results Respiratory conditions constituted 46% of all child-general practitioner consultations with a stable year-on-year pattern of seasonal peaks. Upper respiratory tract infection was the most common respiratory category accounting for 21.0% of all childhood consultations, followed by otitis media (12.2%), wheeze-related illness (9.7%), throat infection (7.4%) and lower respiratory tract infection (4.4%). Almost 70% of children presented to their general practitioner with at least one respiratory condition in their first year of life; this reduced to approximately 25% for children aged 10–17.
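To give a sense of scale, the percentages above can be converted into approximate consultation counts. The Python sketch below is a back-of-envelope calculation, not a result from the study: it assumes a denominator of exactly 650 000 consultations (the abstract says "over 650 000", so the true counts would be somewhat higher) and reads "46%" as 46.0%. The dictionary and function names are hypothetical.

```python
# Assumed denominator: the abstract reports "over 650 000" consultations,
# so counts derived from this floor are approximate lower bounds.
TOTAL_CONSULTATIONS_FLOOR = 650_000

# Shares of all child consultations, stored in tenths of a percent so the
# arithmetic stays in integers. "46%" is read as 46.0% (assumption).
SHARES_TENTHS_OF_PERCENT = {
    "all respiratory": 460,
    "upper respiratory tract infection": 210,
    "otitis media": 122,
    "wheeze-related illness": 97,
    "throat infection": 74,
    "lower respiratory tract infection": 44,
}


def approx_count(share_tenths: int, total: int) -> int:
    """Return the approximate number of consultations for a given share."""
    return share_tenths * total // 1000


approx_counts = {
    name: approx_count(share, TOTAL_CONSULTATIONS_FLOOR)
    for name, share in SHARES_TENTHS_OF_PERCENT.items()
}

for name, count in approx_counts.items():
    print(f"{name}: about {count:,} consultations")
```

Under these assumptions, respiratory conditions account for roughly 300 000 consultations over the six years, with upper respiratory tract infection alone contributing more than 130 000.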

Conclusion This is the first study to assess the primary care incidence and service utilisation of childhood respiratory illness in a large primary care cohort by interrogating electronic medical record free text. The study identified the very high primary care workload related to childhood respiratory illness, especially during the first 2 years of life. These data can enable more effective planning of health service delivery. The findings and methodology have relevance to many countries, and the use of primary care ‘big data’ in this way can be applied to other health conditions.

  • primary care
  • general practice
  • childhood respiratory illness
  • natural language software programming
  • big data

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

  • Contributors AD, JM, MS, LM and NT conceived the study. All authors contributed to the development of the overall study methodology. AD, LM and NT provided clinical input into the algorithm design. JM designed and built the natural language processing tools. JM programmed and trained the algorithm. AD and LM classified the consultation records in the gold standard sets. BD and AD were the principal writers of the manuscript. All authors reviewed and revised the manuscript and approved its final version. All authors had full access to all of the data (including statistical reports and tables) in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.

  • Funding This work was supported by a New Zealand Lotteries Health Research Grant. The funding body had no role in the collection or analysis of data or the preparation of this manuscript.

  • Competing interests All authors have completed the Unified Competing Interest form at (available upon request from the corresponding author) and declare support from New Zealand Lotteries Health Research for the submitted work. LM is a director of Compass Health Wellington Trust that might have an interest in the submitted work. No other relationships or activities could appear to have influenced the submitted work.

  • Patient consent No identifiable patient data were used in the output of the study. Consent for the collection and use of non-identifiable data is included in the enrolment forms that general practices collect when patients enrol with their GP.

  • Ethics approval This study was approved by the University of Otago Ethics Committee (H13/044).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available.
