Enhancing risk stratification for use in integrated care: a cluster analysis of high-risk patients in a retrospective cohort study
Sabine I Vuik (1), Erik Mayer (2), Ara Darzi (1, 2)

  1. Institute of Global Health Innovation, Imperial College, St Mary's Hospital, London, UK
  2. Department of Surgery, Imperial College, St Mary's Hospital, London, UK

Correspondence to Sabine I Vuik; s.vuik{at}


Objective To show how segmentation can enhance risk stratification tools for integrated care, by providing insight into different care usage patterns within the high-risk population.

Design A retrospective cohort study. A risk score was calculated for each person using a logistic regression, which was then used to select the top 5% high-risk individuals. This population was segmented based on the usage of different care settings using a k-means cluster analysis. Data from 2008 to 2011 were used to create the risk score and segments, while 2012 data were used to understand the predictive abilities of the models.
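The first two steps of this design (scoring each person with a logistic model, then keeping the top 5%) can be sketched as follows. This is a minimal illustration only: the predictors, coefficients, intercept and patient records are hypothetical and are not taken from the study, whose fitted model is not given in this abstract.

```python
import math

def risk_score(features, coefficients, intercept):
    """Turn a linear predictor into a probability with the logistic function."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

def top_fraction(scores, fraction=0.05):
    """Return the ids of the highest-scoring `fraction` of patients."""
    n_keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])

# Hypothetical coefficients and patients, for illustration only.
COEFS = [0.03, 0.8]   # assumed predictors: age in years, prior emergency admissions
INTERCEPT = -5.0
patients = {f"p{i}": [20 + i, i % 4] for i in range(40)}

scores = {pid: risk_score(x, COEFS, INTERCEPT) for pid, x in patients.items()}
high_risk = top_fraction(scores, fraction=0.05)
```

In practice the coefficients would be estimated from the 2008 to 2011 data rather than fixed by hand, and the cut-off would be applied to the full cohort.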

Setting and participants Data were collected from administrative data sets covering primary and secondary care for a random sample of 300 000 English patients.

Main measures The high-risk population was segmented based on their usage of 4 different care settings: emergency acute care, elective acute care, outpatient care and GP care.
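A usage-based segmentation of this kind can be sketched with a plain k-means routine over the 4 care settings. The version below uses only the Python standard library to stay self-contained; the usage counts are synthetic, and the standardisation step, the random initialisation and the fixed seed are assumptions of this sketch rather than details reported by the authors.

```python
import random

SETTINGS = ["emergency", "elective", "outpatient", "gp"]

def standardise(rows):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k=4, n_iter=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centre updates."""
    rng = random.Random(seed)
    centres = rng.sample(rows, k)          # random initial centres
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assign every patient to the nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(r, centres[j]))
                  for r in rows]
        # Recompute each centre as the mean of its members.
        new_centres = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                new_centres.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres

# Synthetic contact counts per patient, ordered as in SETTINGS (illustrative only).
data_rng = random.Random(1)
prototypes = [[6, 0, 2, 8], [0, 3, 6, 10], [1, 1, 12, 15], [0, 0, 1, 30]]
usage = [[max(0, p + data_rng.randint(-1, 1)) for p in proto]
         for proto in prototypes for _ in range(10)]

scaled = standardise(usage)
labels, centres = kmeans(scaled, k=4)
```

Because k-means depends on its starting centres, a real analysis would normally run several initialisations and compare solutions before settling on the segments.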

Results While the risk strata predicted care usage at a high level, usage varied significantly within the high-risk population. Four distinct groups of high-risk patients were identified. These 4 segments had distinct usage patterns across care settings, reflecting different levels and types of care needs. The 2008–2011 usage patterns of the 4 segments were consistent with their 2012 patterns.
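One simple way to check whether segment patterns persist is to compute each segment's mean usage per care setting in the baseline period and again in the follow-up year, then compare the two profiles. The sketch below assumes this approach; the numbers and segment labels are made up for illustration and are not results from the study.

```python
def segment_profiles(usage, labels):
    """Mean usage per care setting for each segment label."""
    totals, counts = {}, {}
    for row, label in zip(usage, labels):
        acc = totals.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [t / counts[lab] for t in totals[lab]] for lab in totals}

# Hypothetical data: columns are emergency, elective, outpatient, GP contacts.
segment_of = [0, 0, 1]                       # segment assigned from baseline usage
baseline = [[2, 0, 4, 6], [4, 0, 2, 8], [0, 1, 10, 12]]
follow_up = [[3, 0, 3, 5], [3, 0, 3, 7], [0, 1, 9, 11]]

baseline_profile = segment_profiles(baseline, segment_of)
follow_up_profile = segment_profiles(follow_up, segment_of)
```

Comparing `baseline_profile` with `follow_up_profile` segment by segment shows whether the setting that dominates each segment's usage stays the same from one period to the next.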

Discussion Cluster analyses revealed that the high-risk population is not homogeneous, as there exist 4 groups of patients with different needs across the care continuum. Since the patterns were predictive of future care use, they can be used to develop integrated care programmes tailored to these different groups.

Conclusions Usage-based segmentation augments risk stratification by identifying patient groups with different care needs, around which integrated care programmes can be designed.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


  • Twitter Follow Sabine Vuik @sabinevuik

  • Database This study is based on data from the Clinical Practice Research Datalink obtained under licence from the UK Medicines and Healthcare Products Regulatory Agency. However, the interpretation and conclusions contained in the study are those of the authors alone.

  • Contributors SIV designed the study, created the database, analysed the data and drafted and revised the paper. She is the guarantor. EM contributed to the design of the study, analysed the results and revised the draft paper. AD contributed to the design of the study and revised the draft paper. All have approved the final version for publication.

  • Funding This study was partially funded by the Sowerby eHealth Forum, sponsored by the Peter Sowerby Foundation.

  • Disclaimer The funder had no role in the study design or analysis, or in the drafting and submission of this paper. The researchers worked independently of the funder.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The technical appendix is available in the online supplementary files, and the statistical code is available from the corresponding author.