Original research
Early identification of persistent somatic symptoms in primary care: data-driven and theory-driven predictive modelling based on electronic medical records of Dutch general practices
  1. Willeke M Kitselaar (1,2)
  2. Frederike L Büchner (1)
  3. Rosalie van der Vaart (2)
  4. Stephen P Sutch (1,3)
  5. Frank C Bennis (4,5)
  6. Andrea WM Evers (2)
  7. Mattijs E Numans (1)

Affiliations

  1. Health Campus The Hague/Department of Public Health and Primary Care, Leiden University Medical Center, The Hague, The Netherlands
  2. Health, Medical and Neuropsychology unit, Department of Psychology, Leiden University, Leiden, Netherlands
  3. HSR, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
  4. Computer Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
  5. Netherlands Institute for Health Services Research, Utrecht, Netherlands

Correspondence to Willeke M Kitselaar; w.m.kitselaar{at}vu.nl

Abstract

Objective The present study aimed to identify patients with persistent somatic symptoms (PSS) early in primary care by exploring approaches based on routine care data.

Design/setting A cohort study for predictive modelling was conducted using routine primary care data from 76 general practices in the Netherlands.

Participants A total of 94 440 adult patients were included, based on at least 7 years of general practice enrolment, more than one symptom/disease registration and more than 10 consultations.

Methods Cases were selected based on the first PSS registration in 2017–2018. Candidate predictors were selected 2–5 years prior to PSS and categorised into data-driven approaches: symptoms/diseases, medications, referrals, sequential patterns and changing lab results; and theory-driven approaches: constructed factors based on literature and terminology in free text. Of these, 12 candidate predictor categories were formed and used to develop prediction models by cross-validated least absolute shrinkage and selection operator regression on 80% of the dataset. Derived models were internally validated on the remaining 20% of the dataset.
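
The modelling strategy described above (cross-validated LASSO on an 80% development set, internal validation on the remaining 20%) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the binary PSS outcome is assumed to be modelled with an L1-penalised logistic regression fitted by proximal gradient descent, the penalty is assumed to be chosen by mean cross-validated AUC, and all names (`fit_lasso_logistic`, `cv_select_lambda`, the penalty grid `lambdas`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink towards zero, set small values to zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=150):
    # Proximal gradient descent on mean log loss + lam * ||w||_1 (intercept unpenalised).
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errs = [pi - yi for pi, yi in zip(predict(X, w, b), y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(p)]
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def auc(y_true, scores):
    # Probability that a random case outscores a random non-case (ties count 0.5).
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))

def cv_select_lambda(X, y, lambdas, k=4):
    # k-fold cross-validation; the selection criterion (mean AUC) is an assumption.
    folds = [list(range(i, len(X), k)) for i in range(k)]
    best_lam, best_auc = None, -1.0
    for lam in lambdas:
        fold_aucs = []
        for fold in folds:
            held = set(fold)
            X_tr = [x for i, x in enumerate(X) if i not in held]
            y_tr = [t for i, t in enumerate(y) if i not in held]
            w, b = fit_lasso_logistic(X_tr, y_tr, lam)
            scores = predict([X[i] for i in fold], w, b)
            fold_aucs.append(auc([y[i] for i in fold], scores))
        mean_auc = sum(fold_aucs) / k
        if mean_auc > best_auc:
            best_lam, best_auc = lam, mean_auc
    return best_lam

# Synthetic stand-in for binary candidate predictors; only the first two carry signal.
random.seed(0)
n, p = 300, 5
X = [[float(random.random() < 0.5) for _ in range(p)] for _ in range(n)]
true_w, true_b = [2.0, 1.5, 0.0, 0.0, 0.0], -2.0
y = [int(random.random() < sigmoid(true_b + sum(a * c for a, c in zip(true_w, xi))))
     for xi in X]

# 80% development set for model derivation, 20% hold-out for internal validation.
split = int(0.8 * n)
X_dev, y_dev, X_val, y_val = X[:split], y[:split], X[split:], y[split:]

lambdas = [0.2, 0.05, 0.01]  # hypothetical penalty grid
best_lam = cv_select_lambda(X_dev, y_dev, lambdas)
w, b = fit_lasso_logistic(X_dev, y_dev, best_lam)
val_auc = auc(y_val, predict(X_val, w, b))
print(f"selected lambda={best_lam}, hold-out AUC={val_auc:.2f}")
```

The L1 penalty can shrink the coefficients of uninformative predictors to exactly zero, which is why LASSO doubles as a predictor selection step when many candidate predictors are available.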

Results All models had comparable predictive values (area under the receiver operating characteristic curve=0.70 to 0.72). Predictors were related to genital complaints, specific symptoms (eg, digestive, fatigue and mood), healthcare utilisation and number of complaints. The most fruitful predictor categories were literature-based factors and medications. Predictors often had overlapping constructs, such as digestive symptoms (symptom/disease codes) and anti-constipation drugs (medication codes), indicating that registration is inconsistent between general practitioners (GPs).

Conclusions The findings indicate low to moderate diagnostic accuracy for early identification of PSS based on routine primary care data. Nonetheless, simple clinical decision rules based on structured symptom/disease or medication codes could possibly be an efficient way to support GPs in identifying patients at risk of PSS. A full data-based prediction currently appears to be hampered by inconsistent and missing registrations. Future research on predictive modelling of PSS using routine care data should focus on data enrichment or free-text mining to overcome inconsistent registrations and improve predictive accuracy.

  • PRIMARY CARE
  • STATISTICS & RESEARCH METHODS
  • MENTAL HEALTH

Data availability statement

Data are available upon reasonable request. No additional raw data are available. Processed data can be made available to researchers upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Twitter @stevesutch

  • Contributors The study was primarily designed by WMK, RvdV, AWE and MEN. WMK conducted the study under the guidance of all other authors. WMK pre-processed the data. WMK analysed the data under the guidance of FCB, SPS and FLB. WMK drafted the manuscript. FLB, RvdV, FCB and SPS reviewed and provided critical comments on all early-stage drafts of the manuscript. AWE and MEN reviewed and provided critical comments on drafts of the full manuscript. All authors approved the submitted version. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted. MEN is the guarantor of this study.

  • Funding WMK’s PhD project was internally funded by Leiden University (a profile area) and Leiden University Medical Center. No external funding supported this study.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.