Quantification and visualisation methods of data-driven chronic care delivery pathways: protocol for a systematic review and content analysis
  1. Luiza Siqueira do Prado1,
  2. Samuel Allemann1,2,
  3. Marie Viprey1,3,
  4. Anne-Marie Schott1,3,
  5. Dan Dediu4,
  6. Alexandra L Dima1
  1. 1Health Services and Performance Research EA 7425, Université Claude Bernard Lyon 1, Lyon, France
  2. 2Pharmaceutical Care Research Group, University of Basel, Basel, Switzerland
  3. 3Pôle Santé Publique, Hospices Civils de Lyon, Lyon, France
  4. 4Laboratoire Dynamique du Langage UMR 5596, Université Lumière Lyon 2, Lyon, France
  1. Correspondence to Luiza Siqueira do Prado; luiza.siqueira-do-prado{at}


Introduction Chronic conditions require long periods of care and often involve repeated interactions with multiple healthcare providers. Faced with increasing illness burden and costs, healthcare systems are currently working towards integrated care to streamline these interactions and improve efficiency. To support this, one promising resource is the information on routine care delivery stored in various electronic healthcare databases (EHD). In chronic conditions, care delivery pathways (CDPs) can be constructed by linking multiple data sources and extracting time-stamped healthcare utilisation events and other medical data related to individual or groups of patients over specific time periods; CDPs may provide insights into current practice and ways of improving it. Several methods have been proposed in recent years to quantify and visualise CDPs. We present the protocol for a systematic review aiming to describe the content and development of CDP methods, to derive common recommendations for CDP construction.

Methods and analysis This protocol followed the Preferred Reporting Items for Systematic review and Meta-Analysis Protocols. A literature search will be performed in PubMed (MEDLINE), Scopus, IEEE, CINAHL and EMBASE, without date restrictions, to review published papers reporting data-driven chronic CDPs quantification and visualisation methods. We will describe them using several characteristics relevant for EHD use in long-term care, grouped into three domains: (1) clinical (what clinical information does the method use and how was it considered relevant?), (2) data science (what are the method’s development and implementation characteristics?) and (3) behavioural (which behaviours and interactions does the method aim to promote among users and how?). Data extraction will be performed via deductive content analysis using previously defined characteristics and accompanied by an inductive analysis to identify and code additional relevant features. Results will be presented in descriptive format and used to compare current CDPs and generate recommendations for future CDP development initiatives.

Ethics and dissemination Database searches will be initiated in May 2019. The review is expected to be completed by February 2020. Ethical approval is not required for this review. Results will be disseminated in peer-reviewed journals and conference presentations.

PROSPERO registration number CRD42019140494.

  • clinical decision support systems
  • medical informatics application
  • data visualisation
  • clinical pathway
  • delivery of health care, integrated
  • electronic healthcare databases

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • While most reviews of health technology tools focus on clinical objectives and technical characteristics, we will also consider behaviours of and interactions between users to describe the selected methods.

  • We will perform both deductive and inductive content analysis to fully describe the methods.

  • We will focus on methods described in peer-reviewed papers and exclude conference proceedings and other types of reports, to obtain detailed validated descriptions; this may limit our access to more recent studies due to the fast-paced development in the field.

  • Lack of completeness in methods descriptions may limit our ability to assess all characteristics, such as the stages of development, the involvement of stakeholders or experts prior to data acquisition and analysis.

  • As this is a relatively new field of health technology, there are no guidelines for reporting and no consensus on quality criteria for the studies we will evaluate; our work will also contribute to the development of such recommendations.


Effective delivery of integrated care is a priority for healthcare systems worldwide and has been the focus of considerable efforts in recent years, particularly in response to the increasing demands of chronic care.1 2 Long-term conditions may require lifetime care, which may consist of multiple interactions with a variety of healthcare providers at variable time intervals.3 4 When service delivery is fragmented, the overall effectiveness of these interactions in terms of long-term quality of life and health-related outcomes is reduced, and risk of harm is increased.5 6 Centralising patient information produced by different providers in electronic healthcare databases (EHD) has the potential to help implement new ways of service delivery to improve outcomes.7 Several attempts have been made to link multiple data sources to generate comprehensive descriptions of patients’ healthcare journeys in long-term conditions. These descriptions are produced by constructing longitudinal trajectories from various time-stamped healthcare utilisation events and related medical data.8–17 For example, Zhang et al have produced longitudinal trajectories using electronic health records and cost pathways14 16 17 of people living with chronic kidney disease to inform patient engagement and to detect common pathways. Bettencourt-Silva et al have reported on the development of a patient-centric database from multiple Hospital Information Systems18 and on building data-driven pathways from routine hospital data on people living with prostate cancer to explore their potential use in biomedical research.15 However, generating these informative trajectories from disparate and often incompatible data sources proves challenging.18 19 As various initiatives have been developed independently, with distinct methodologies and objectives, it is essential to examine the proposed solutions systematically in order to derive principles of action that stimulate convergence of methods.

In the context of chronic conditions, the way patient trajectories are established may be subject to multiple influences, and analysing routine care data can provide insights into how they have been drawn over time and their potential sources of variation.14 20 In the literature, trajectories within healthcare systems have been described using many terms, which makes it challenging to build consensus on terminology and practical meaning.21 22 We will use the term data-driven ‘care delivery pathway’ (CDP) to group the several terms we will find in the selected studies to designate retrospective trajectories obtained from EHD. To describe the methods proposed for synthetically displaying objective measures or assessments of health status or healthcare utilisation (ie, quantifying) and graphically showing the temporal elements of chronic CDP (ie, visualising), we will assess how they addressed the following three domains.

The selection of relevant clinical and health-related events

This domain will examine how the methods define health status and evaluate disease progression or stabilisation, and how they show transitions between health status and acute manifestations.13 Usually, the trajectory timeline begins at diagnosis and involves more than one provider.13–15 Treatment decisions are generally based on health status (indicated by biomarkers, clinical examination, self-declared levels of quality of life, etc), care units and settings, treatment availability (medication, procedures, etc) and patient-provider preferences.20

The technological development itself and issues related to data quality and exchange

This domain aims to describe how the method is built, which data sources and analyses are used, and the necessary infrastructure surrounding its implementation. Digitalisation of health-related data is a global trend23 24 and highly detailed data are being collected daily in diverse settings and healthcare services. Such methods may apply a range of techniques from basic algorithms to advanced statistical and machine learning models,25 which can provide useful insights into care delivery processes. Technological developments in this field also need to meet strict criteria of data security, accuracy of models and predictions, openness of development and validation processes, among others.15 26

The behaviours of actors and the interactions between them, with the aim of effectively improving care delivery

Integrated care depends on multiple actions and decisions made collaboratively by patients, healthcare providers, administrative staff and other actors concerning patients’ course of treatment.27 To inform these decisions, technological solutions must have access to clinical exams and provide key actors with relevant information, such as the patients’ past interactions with other providers, the medical procedures performed and the medications prescribed.28 To have a positive impact on care delivery, visualisations and quantitative indicators of the patient’s prior care need to be adapted to the user’s needs at specific points in the trajectory, such as after acute events or hospitalisations. This domain will examine what behaviours and interactions the methods promote (who their target individuals are, what actions need to be performed, in what context, when and by whom),29 30 and what strategies are proposed to encourage this performance.

Aims and objectives

We aim to identify and describe the methods that have been proposed to quantify and/or visualise data-driven CDPs of people living with chronic conditions. Given the complexity of their context of use, more than only reviewing technical methods, we aim to investigate how these tools have considered the three domains described above.

To this end, we propose the following research questions:

  1. What clinical information does the method use and how was it considered relevant?

  2. What are the method’s development and implementation characteristics?

  3. Which behaviours and interactions does the method aim to promote among users and how?


The Cochrane Handbook31 and the Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P)32 were used to write this protocol, and the systematic review will follow PRISMA.33 The PRISMA-P checklist is presented in online supplementary file 1. The review will be performed by one primary reviewer (LSP) and three secondary reviewers (ALD, MV and SA) and will follow six steps: literature search, records screening and pre-selection (title and abstract), full-text screening and final selection, extraction of data, quality assessment, and analysis and synthesis of data.

The studies expected to be analysed in this work will likely be descriptive and not follow standard methodology (ie, experimental or observational, method validation), yet considering the manuscripts as a qualitative corpus allows for coding the narratives according to the conceptual structure we propose.7 34 Content analysis has been used in many studies in health sciences35 and an inductive content analysis applied in a systematic review of clinical information modelling processes34 has developed descriptive categories in a context similar to the one we propose here. As we consider them relevant to the studies we will review, they will be included in our coding framework, as detailed below.


A literature search will be performed in the following electronic databases: PubMed (MEDLINE), Scopus, IEEE, CINAHL and EMBASE. The search will be adapted to each database, and the resulting search strategies are provided as online supplementary file 2. The terms searched will be related to three main categories, connected by the AND operator: ‘data-driven’ (MeSH terms like ‘Electronic health record’, ‘data mining’, etc), ‘clinical pathways’ (MeSH terms like ‘clinical pathway’, ‘disease management’, etc) and ‘chronic conditions’ (MeSH term ‘chronic diseases’). Searches will be performed with MeSH terms or with keywords in Title/Abstract in PubMed; MeSH terms will be adapted for the databases that do not permit their usage or use different indexed terms. Bibliographies and citation tracking of relevant literature will be hand searched to identify additional relevant studies. A first selection will be performed using abstracts and titles, followed by full-text examination of entries selected.
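The Boolean structure of the search described above can be sketched as follows. This is a minimal illustration only: the function name `build_query` is hypothetical, the terms are the few examples quoted in the text rather than the full strategies of supplementary file 2, and the output omits database-specific field tags and MeSH syntax.

```python
def build_query(categories):
    """Join terms with OR inside each category, then join categories with AND."""
    groups = []
    for terms in categories.values():
        quoted = " OR ".join(f'"{term}"' for term in terms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)


# Illustrative subset of terms taken from the text; not the complete strategy.
categories = {
    "data-driven": ["electronic health record", "data mining"],
    "clinical pathways": ["clinical pathway", "disease management"],
    "chronic conditions": ["chronic diseases"],
}

query = build_query(categories)
print(query)
```

In practice each database would need its own adaptation of such a string, as the paragraph above notes.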

Types of publications/studies and eligibility criteria

We will consider CDPs to be a series of time-stamped events describing the sequence of care of users with a diagnosed chronic condition (conditions requiring medical attention for a period longer than 12 months).36 These events can be the diagnosis itself, routine, non-scheduled or emergency consultations with a general practitioner and/or specialist, therapeutic education sessions and other health-related interventions. These can result in prescriptions of medications, medical procedures and tests, which may also appear in the trajectory. Data-driven CDPs analysed here will need to be composed of at least two time-stamped events recorded in EHD from people with the diagnosis of a chronic condition, with no duration restrictions (eg, CDPs may cover periods from days or a few months to several years).
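As a rough illustration of this working definition, a CDP can be represented as a chronologically ordered list of time-stamped events. The sketch below is an assumption made for illustration: the `Event` class, the function names and the example events are hypothetical and are not taken from any of the reviewed methods.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    when: date   # time stamp of the healthcare event
    kind: str    # e.g. "diagnosis", "consultation", "prescription"


def is_candidate_cdp(events, has_chronic_diagnosis):
    """Apply the stated minimum: at least two time-stamped events from a
    person with a diagnosed chronic condition, with no duration restriction."""
    return has_chronic_diagnosis and len(events) >= 2


def as_pathway(events):
    """Order the events chronologically to form the trajectory."""
    return sorted(events, key=lambda event: event.when)


# Hypothetical example data.
events = [
    Event(date(2018, 6, 1), "prescription"),
    Event(date(2018, 3, 12), "diagnosis"),
    Event(date(2018, 9, 20), "consultation"),
]

pathway = as_pathway(events)
print([event.kind for event in pathway])
print(is_candidate_cdp(events, has_chronic_diagnosis=True))
```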

We will consider peer-reviewed publications (1) reporting methods for visualisation or quantification of data-driven chronic CDP (including protocols and reports of study results), (2) using data from people living with chronic conditions retrieved from EHD and (3) published in English. No restrictions on publication date, study design, population characteristics, type of healthcare facility and level of care will be applied.

We will exclude studies that aim only to assess healthcare utilisation over a specific period as part of a single research study, for example as an outcome to evaluate health-related interventions, to describe populations or disease prevalence, or as a proxy measure of disease aggravation risk. We will also exclude studies that do not mention population or data characteristics or do not state they analyse data from people living with chronic conditions, unavailable full texts, papers not written in English, conference abstracts or abstract-only papers, systematic or narrative reviews, meta-analyses and grey literature.


We will use Covidence, an online systematic review management software, for records screening. After removal of duplicates, titles and abstracts in the remaining records will be screened independently by two reviewers for full-text appraisal. If reviewers disagree, consensus will be reached through discussion and arbitration with one of the secondary reviewers not involved in the selection of the record. Studies selected in the first step will go through full-text screening using the same process to establish eligibility. Inter-rater reliability (Cohen’s Kappa) between primary and secondary reviewers will be computed and reported; values greater than 0.80 will be considered adequate.
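For reference, Cohen’s Kappa compares the observed agreement between two raters with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two lists of screening decisions; the function name and the example decisions are hypothetical, and only the 0.80 threshold comes from the protocol.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten records.
primary = ["include", "include", "exclude", "exclude", "include",
           "exclude", "exclude", "exclude", "include", "exclude"]
secondary = ["include", "include", "exclude", "exclude", "exclude",
             "exclude", "exclude", "exclude", "include", "exclude"]

kappa = cohens_kappa(primary, secondary)
print(round(kappa, 2), "adequate" if kappa > 0.80 else "below threshold")
```

In this made-up example the raters agree on nine of ten records, yet kappa is about 0.78 and falls just below the 0.80 threshold, which shows how the chance correction makes the criterion stricter than raw agreement.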

Data management

We will report the number of included and excluded articles as well as the number of full-text papers obtained and assessed. Reasons for exclusion of screened full-text studies will also be stated in the final review. The data will be managed using Covidence and Microsoft Excel spreadsheets.

Data extraction and analysis

We will use both deductive and inductive content analysis35 to appraise the selected studies. The analysis will be deductive when relying on pre-defined frameworks, namely the categories previously described by Moreno-Conde et al34 to describe the technical characteristics of the proposed solutions and the Action, Actor, Context, Target, Time (AACTT) framework29 30 to describe the behavioural domain, and inductive when additional relevant characteristics need to be described.

Data from included studies will be extracted using a customised electronic data extraction form. Information on study characteristics (authors, title, type of study, year and country of study, objective and research questions) and population characteristics (number of patients, age, gender, condition) will be extracted directly from the included studies.

Deductive–inductive content analysis

We will perform a deductive content analysis following existing theories, as described below, and an inductive analysis for observed relevant characteristics not yet covered by existing literature. If more than one selected record describes the development, validation and/or implementation of the same method, we will extract basic paper characteristics, as described above, but the content analysis will be performed per method.

  1. For the clinical domain, we will extract information on clinical or cost outcomes the method might target (if reported and which ones) and on how the outcomes were considered relevant (eg, involving experts, final users or other stakeholders).

  2. For method development and data processing, we will analyse the methods and compare them with the categories proposed by Moreno-Conde et al.34 The categories detailed in that study are briefly described below.

  • Scope definition leading to selection of the domain and selecting relevant experts: identifying the domain and expected uses of the method through the creation of a group of experts.

  • Analysis of the information covered in the specific domain: creation of definitions, identification of clinical scenarios, workflows, users, guidelines, literature, etc, so the method meets the requirements of clinical practice or other intended usages.

  • Design of the tool: detailing the set of attributes associated with the method, domain terminologies, ensuring compatibility across domains.

  • Definition of implementable tool specifications: description of implementable technical specification.

  • Validation: use of techniques to validate the method, such as peer-review validation or creation of prototype screens.

  • Publishing and maintenance: availability in public repositories.

  • Governance: description of the organisation responsible for developing and maintaining the tool.

Other information extracted from studies regarding this domain will be healthcare utilisation characteristics (type of event, for example, consultation, test, procedure) and data characteristics (sources of data, data preparation, data analysis).

  3. To describe the behaviours and interactions the method might promote or facilitate, we will apply the AACTT framework.29 30 Other information extracted from studies will be output characteristics such as intended final users, purpose and use scenarios. We will also code the presence of strategies planned or performed to achieve these behavioural change objectives, such as training, organisational changes, evaluation of the performance of the method in routine care, if implemented, and other initiatives studies might present.
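One possible shape for a per-method coding record covering the three domains listed above is sketched below. All class and field names are hypothetical and chosen for illustration; the stage labels are shortened forms of the seven categories attributed to Moreno-Conde et al in the text, and this is not the extraction form itself.

```python
from dataclasses import dataclass, field

# Shortened labels for the seven development categories described above.
STAGES = [
    "scope definition",
    "analysis of information",
    "design",
    "implementable specifications",
    "validation",
    "publishing and maintenance",
    "governance",
]


@dataclass
class AACTT:
    """Action, Actor, Context, Target, Time; empty string means not reported."""
    action: str = ""
    actor: str = ""
    context: str = ""
    target: str = ""
    time: str = ""


@dataclass
class MethodRecord:
    method_id: str
    clinical_outcomes: list = field(default_factory=list)    # clinical domain
    development_stages: dict = field(default_factory=dict)   # stage -> reported?
    behaviour: AACTT = field(default_factory=AACTT)          # behavioural domain
    inductive_codes: list = field(default_factory=list)      # codes added inductively


def unreported_stages(record):
    """List the development stages a record does not report."""
    return [s for s in STAGES if not record.development_stages.get(s, False)]


# Hypothetical coded method.
record = MethodRecord(
    method_id="method_A",
    development_stages={"validation": True, "governance": False},
    behaviour=AACTT(action="review pathway", actor="clinician"),
)
print(unreported_stages(record))
```

Keeping one record per method, rather than per paper, matches the stated intention to perform the content analysis per method when several records describe the same one.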

The primary reviewer and one secondary reviewer will pilot data extraction independently for a subset of 10% of selected records to compare and discuss the data extraction process. If necessary, we will repeat the pilot extraction process (outlined above) until agreement is reached. Inter-rater reliability (Cohen’s Kappa) will be computed, and values greater than 0.80 will be considered adequate. Disagreements will be resolved with the help of a third reviewer, and piloting may consist of several interactions between reviewers to compare and reach consensus regarding relevant information to be extracted from full-text analysis. After this first step, a codebook will be developed, and data extraction of the remaining records will be performed by the primary reviewer.

Quality and bias assessment

As most quality assessment tools are developed for commonly used study designs and there is no consensus regarding tools for generic use, we propose to evaluate quality from a different perspective. We will evaluate whether the main stakeholders (patients and/or family, healthcare professionals, administrative personnel) were involved at any stage of the development of the method. Research shows the importance of involving patients, the public and other stakeholders in health-related research to obtain experiential knowledge, set research priorities and focus on practical questions.37–40 Also, it has been shown that funding by for-profit organisations can positively bias the interpretation of trial results,41 and research in data usage can be funded by companies interested in selling their own methods. To assess potential bias, we will evaluate declared conflicts of interest and sources of funding. Quality assessment will be discussed in the review, but no study will be excluded from the analysis based on quality criteria.

Data synthesis

The technical methods will be synthesised using the content analysis described above, and the studies will be categorised and described using the three domains, depending on study type and reporting. We will present the results in tables along with method and study identification and summarise them via descriptive statistics. We will compare the different characteristics within the three domains to identify common, infrequent or missing features of these tools and extract recommendations for future initiatives.
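The comparison of common, infrequent and missing features could be summarised along the following lines. This is a sketch under stated assumptions: the function names, the example codings and the 0.5 cut-off separating common from infrequent features are all hypothetical, since the protocol does not define a threshold.

```python
from collections import Counter


def feature_frequencies(coded_methods, all_features):
    """Share of methods reporting each feature (each method counted once)."""
    if not coded_methods:
        raise ValueError("At least one coded method is required")
    counts = Counter(
        feature
        for features in coded_methods.values()
        for feature in set(features)
    )
    n = len(coded_methods)
    return {feature: counts[feature] / n for feature in all_features}


def classify(frequencies, common_at=0.5):
    """Group features as common, infrequent or missing.
    The common_at cut-off is an illustrative assumption."""
    groups = {"common": [], "infrequent": [], "missing": []}
    for feature, share in frequencies.items():
        if share == 0:
            groups["missing"].append(feature)
        elif share >= common_at:
            groups["common"].append(feature)
        else:
            groups["infrequent"].append(feature)
    return groups


# Hypothetical codings for four methods.
coded = {
    "method_A": ["validation", "governance"],
    "method_B": ["validation"],
    "method_C": ["validation", "scope definition"],
    "method_D": ["validation"],
}
features = ["validation", "governance", "scope definition",
            "publishing and maintenance"]

summary = classify(feature_frequencies(coded, features))
print(summary)
```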

Patient and public involvement

A representative of a patients’ association read and approved this protocol. This systematic review is part of a larger project that will be developed closely with patients and healthcare providers.

Ethics and dissemination

The search strategy was developed in collaboration with health sciences librarian services in early 2019. Database searches will be initiated in May 2019. The review is expected to be completed by February 2020. Ethical approval is not required. Results will be disseminated in peer-reviewed journals and/or conference presentations. Data used in this review will be made available through online supplementary materials and open trusted repositories.



  • Contributors LSP, ALD and SA designed the protocol and planned data extraction and quality assessment. LSP put together the search strategy and SA helped adapt it to the different databases. LSP and ALD conceived the content analysis stages and conceptual framework. LSP wrote the first version of the manuscript, ALD extensively reviewed it, and SA, DD, A-MS and MV revised it critically for important intellectual content. All authors have approved the publication of this protocol and contributed to the final manuscript.

  • Funding DD was supported by an IDEXLYON (16-IDEX-0005) Fellowship grant (2018-2021), LSP was supported by a PhD funding within the same grant, ALD by a Marie Curie Individual Fellowship from the European Commission (MCRA-IF n°706028) during the preparation of this review protocol and SA by the Swiss Science Foundation (P2BSP3_178648).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.