Accessing primary care Big Data: the development of a software algorithm to explore the rich content of consultation records

J MacRae; B Darlow; L McBain; O Jones; M Stubbe; N Turner; A Dowell

doi:10.1136/bmjopen-2015-008160

Article Text

PDF

XML

Health informatics

Research

Accessing primary care Big Data: the development of a software algorithm to explore the rich content of consultation records

J MacRae1,
B Darlow2,
L McBain2,
O Jones3,
M Stubbe2,
N Turner4,
A Dowell2

¹Patients First, Wellington, New Zealand
²Department of Primary Health Care and General Practice, University of Otago, Wellington, New Zealand
³Compass Health Wellington Trust, Wellington, New Zealand
⁴Department of General Practice and Primary Care, University of Auckland, Auckland, New Zealand

Correspondence to Professor A Dowell; tony.dowell{at}otago.ac.nz

Abstract

Objective To develop a natural language processing software inference algorithm to classify the content of primary care consultations using electronic health record Big Data and subsequently test the algorithm's ability to estimate the prevalence and burden of childhood respiratory illness in primary care.

Design Algorithm development and validation study. To classify consultations, the algorithm is designed to interrogate clinical narrative entered as free text, diagnostic (Read) codes created and medications prescribed on the day of the consultation.

Setting Thirty-six consenting primary care practices from a mixed urban and semirural region of New Zealand. Three independent sets of 1200 child consultation records were randomly extracted from a data set of all general practitioner consultations in participating practices between 1 January 2008–31 December 2013 for children under 18 years of age (n=754 242). Each consultation record within these sets was independently classified by two expert clinicians as respiratory or non-respiratory, and subclassified according to respiratory diagnostic categories to create three ‘gold standard’ sets of classified records. These three gold standard record sets were used to train, test and validate the algorithm.

Outcome measures Sensitivity, specificity, positive predictive value and F-measure were calculated to illustrate the algorithm's ability to replicate judgements of expert clinicians within the 1200 record gold standard validation set.

Results The algorithm was able to identify respiratory consultations in the 1200 record validation set with a sensitivity of 0.72 (95% CI 0.67 to 0.78) and a specificity of 0.95 (95% CI 0.93 to 0.98). The positive predictive value of algorithm respiratory classification was 0.93 (95% CI 0.89 to 0.97). The positive predictive value of the algorithm classifying consultations as being related to specific respiratory diagnostic categories ranged from 0.68 (95% CI 0.40 to 1.00; other respiratory conditions) to 0.91 (95% CI 0.79 to 1.00; throat infections).

Conclusions A software inference algorithm that uses primary care Big Data can accurately classify the content of clinical consultations. This algorithm will enable accurate estimation of the prevalence of childhood respiratory illness in primary care and resultant service utilisation. The methodology can also be applied to other areas of clinical care.

PRIMARY CARE

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

https://doi.org/10.1136/bmjopen-2015-008160

Statistics from Altmetric.com

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

PRIMARY CARE

View Full Text

Log in using your username and password

Main menu

Log in using your username and password

You are here

Abstract

Statistics from Altmetric.com

Request Permissions

Read the full text or download the PDF:

Log in using your username and password