Comprehensive overview of computer-based health information tailoring: a systematic scoping review
  1. Azadeh Kamel Ghalibaf1,
  2. Elham Nazari1,
  3. Mahdi Gholian-Aval2,
  4. Mahmood Tara3
  1. 1 Department of Medical Informatics, Student Research Committee, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Razavi Khorasan, The Islamic Republic of Iran
  2. 2 Department of Health Education and Health Promotion, School of Health, Mashhad University of Medical Sciences, Mashhad, Razavi Khorasan, The Islamic Republic of Iran
  3. 3 Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Razavi Khorasan, The Islamic Republic of Iran
  1. Correspondence to Dr Mahmood Tara; TaraM{at}mums.ac.ir

Abstract

Objectives To explore the scope of the published literature on computer-tailoring, considering both the development and the evaluation aspects, with the aim of identifying and categorising main approaches and detecting research gaps, tendencies and trends.

Setting Original research from any country and healthcare setting.

Participants Patients or health consumers with any health condition regardless of their specific characteristics.

Method A systematic scoping review was undertaken based on the York five-stage framework outlined by Arksey and O’Malley. Five leading databases (PubMed, Scopus, ScienceDirect, EBSCO and IEEE) were searched for articles published between 1990 and 2017. The tailoring concept was investigated in three aspects: system design, information delivery and evaluation. Both quantitative (ie, frequencies) and qualitative (ie, theme analysis) methods were used to synthesise the data.

Results After reviewing 1320 studies, 360 articles were identified for inclusion. Two main streams were identified in the tailoring literature: public health research (64%) and computer science research (17%). The most common facets used for tailoring were sociodemographic (73%), target behaviour status (59%) and psycho-behavioural determinants (56%), respectively. The analysis showed that only 13% of the studies described the tailoring algorithm they used, from which two approaches emerged: information retrieval (12%) and natural language generation (1%). The systematic mapping of the delivery channel indicated that more than half of the articles (57%) used the web to deliver the tailored information; printout (19%) and email (10%) came next. Analysis of the evaluation approaches showed that just over half of the articles (53%) used an outcome-based approach, 44% used process evaluation and 3% assessed cost-effectiveness.

Conclusions This scoping review can help researchers identify the methodological approaches of computer tailoring. Improvements in reporting and conduct are imperative. Further research on tailoring methodology is warranted and, in particular, there is a need for a guideline to standardise reporting.

  • health informatics
  • information technology
  • public health

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • A considerable number of studies have been reviewed systematically, using theoretical frameworks to determine the data extraction variables.

  • This review highlights research gaps, tendencies and trends to provide a comprehensive overview of the computer-tailoring field with suggestions for future studies.

  • Every decision-based process was conducted independently by two reviewers, with a calibration exercise to ensure reliability.

  • The subjectivity of the data categorisation process can be considered as a general limitation of the study.

Introduction

Traditionally, health education materials have been produced generically, with the aim of providing as much information as possible regardless of the specific characteristics of prospective consumers.1 As a result, health consumers had to spend considerable time and effort finding the information relevant to them. Attempts to lower this burden eventually led to the emergence of tailoring.

Tailoring has been defined as the process of adapting information to the specific characteristics of an individual.2 The theoretical basis of tailoring originates in a psychological theory named the elaboration likelihood model.3 It suggests that when individuals perceive information to be personally relevant, they process it more thoroughly.4 Numerous studies have compared the effectiveness of tailored information with that of its conventional one-size-fits-all counterparts. The results showed an overall positive effect of tailored materials on different behavioural and clinical outcomes.5–7

The recent advancements in technology made the automatic production of tailored messages possible.8 Computer-based information tailoring has been discussed in the literature since the early 1990s. It uses computerised algorithms to adapt health information to the unique characteristics of its users.9 A number of conceptual models have been developed to describe the components of a generic computer-based tailoring system.2 9 According to Strecher,10 there are three basic requirements for achieving computer-based tailored health communication: (1) individual-level data (eg, demographic characteristics and health beliefs), (2) a tailoring engine (eg, algorithm, logic or rule-based system) that matches the individual-level data to the most relevant messages and (3) a delivery medium (eg, print, web-based or multimedia) for presentation of the educational messages.

Park and colleagues2 suggested that computer-tailoring of health information requires an interdisciplinary team of healthcare professionals and information technologists.2 One important challenge in interdisciplinary research is the differences in perspectives and understanding of the concepts which can lead to diversity and heterogeneity in the field.11

Several studies have reported considerable heterogeneity in tailoring definition and techniques across the literature.6 12–14 There is a range of similar terms and concepts that all reflect the intention to increase the personal relevance of information, sometimes used interchangeably. For example, Doupi and other computer scientists15 use the terms tailored and personalised interchangeably, whereas psychologists and healthcare professionals are more precise in differentiating tailoring from other terms.16 Tailoring systems have also been developed for a wide variety of purposes, such as supporting the patient’s role in decision-making,17 enabling the management of chronic conditions18 and offering health promotion advice.19 Audiences of such systems range from people at risk of developing chronic conditions,20 21 to patients who already had chronic conditions and required long-term continuous treatment18 22 23; or those who underwent more short-term intensive treatments, such as for cancer.24 25 This diversity, together with the rapid growth of the tailoring literature and the lack of consensus among researchers, has made the field of computer-tailoring complex and challenging for newcomers.

Literature reviews summarise and organise research evidence and provide an overview of the topic.26 Categorising the available approaches along with their characteristics can help researchers understand the diversity and make more informed decisions in planning their studies, interventions and solution developments. A number of systematic reviews in the field of computer-tailoring have studied the effectiveness of tailoring.5 6 27–30 They mainly focused on specific questions and were limited to a narrow scope of studies. According to Hawkins et al, the question of whether tailoring works or not has been sufficiently answered, and it is time to address the variety of goals and strategies inside the tailoring black box, which requires a different sort of research altogether.9 Rychetnik et al 31 suggested that when there is significant heterogeneity of studies, it is more appropriate to describe the variation in findings than to attempt to combine findings into one overall estimate of effect.31

A scoping review offers a feasible means for a comprehensive synthesis of the literature to map out the evidence and identify the knowledge gaps within the primary studies.32 In this study, we have conducted a scoping review to explore the depth and the breadth of evidence in computer-tailoring considering both the development and the evaluation aspects.

The aims of this study were (a) to provide a holistic view of the available literature on health information computer-tailoring, covering the whole continuum of tailoring (eg, from those that are based on the shared characteristics of a group of people to the ones that are focused on specific needs of a particular user), (b) to identify and categorise the main approaches in developing tailoring systems and (c) to investigate and discover patterns and trends among various approaches. We rigorously followed the scoping review methodological framework introduced by Arksey and O’Malley33 and conducted a broad systematic search in multiple scientific databases. The methodology of this study has been peer-reviewed and published as our protocol paper prior to this study.34

Method

The study’s overall plan was outlined by a framework-based approach. Our methodology in conducting the scoping review was based on the York five-stage framework proposed by Arksey and O’Malley.33 Kreuter’s tailoring model was used as a basis to identify the aspects of the tailoring process.1 Three main aspects were identified: system design, information delivery and evaluation. To determine the components of these aspects, several conceptual frameworks were explored.35–37 More details can be found in our research protocol.34 In this section, we provide an overview of the steps.

Stage 1: identifying the research question

We classified the research questions into four groups according to the aforementioned tailoring aspects: (1) questions related to the general characteristics of the study, (2) questions concerning the approaches used in the development of tailoring system, (3) questions related to the information communication issues and (4) questions that address the evaluation approaches across the literature. In accordance with the study goal, we considered a broad definition for the concept of computer-tailoring to cover any application that automatically provides tailored information to the health consumers.

Stage 2: identifying studies

Literature from the relevant disciplines, published in English from 1 January 1990 to 24 March 2017, was gathered from five leading databases: PubMed, Scopus, ScienceDirect, EBSCO and IEEE. The search was conducted by two of the authors independently and confirmed by all other team members. The search strategy consisted of the following concepts (along with their associated keywords): Individualization (Tailor*, Individualiz*, Personaliz*, Adapt*), Information (Content, Message, Advice, Recommendation, Feedback, Education, Behaviour*), Information and Communications Technology (ICT) platform (Computer*, Mobile apps, short message service (SMS), Web, Internet), Health field (Health*, Medic*, Disease, Patient, Consumer). In addition, we hand-searched the literature for additional resources using reference lists, the publications of scholars whose names often came up in searches for computer-tailoring, and the 2000–2017 editions of the Journal of Medical Internet Research and Patient Education and Counselling Journal. We used two different metrics to select authors for manual search. First, the three authors who had the largest number of articles among the included studies were considered for further hand-search. This metric helped us identify the researchers whose research approach was most likely to be consistent with the selection criteria of our study. However, using this approach, we still might have missed the contributors of the most influential or highly cited literature (such as review studies or guidelines) that was not eligible for inclusion in our study. To avoid this limitation, we also hand-searched the first authors of the three most cited papers in the field of computer tailoring using the ‘Mazer publish or perish’ tool.38 The Mazer tool allowed us to sort Google Scholar search results by number of citations. The publications of the selected authors were searched through their Google Scholar profiles on 10 August 2018.
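The way the four concepts and their keywords combine into a search string can be sketched as follows. This is only an illustration: the text does not give the exact query syntax used for each database, so the conventional pattern (keywords joined with OR inside a concept, concepts joined with AND) is an assumption, as are the function name build_query and the treatment of "short message service" and "SMS" as two separate keywords.

```python
# Concepts and keywords as listed in the search strategy above.
# Treating "short message service" and "SMS" as separate keywords is an assumption.
CONCEPTS = {
    "individualisation": ["Tailor*", "Individualiz*", "Personaliz*", "Adapt*"],
    "information": ["Content", "Message", "Advice", "Recommendation",
                    "Feedback", "Education", "Behaviour*"],
    "ict_platform": ["Computer*", "Mobile apps", "short message service",
                     "SMS", "Web", "Internet"],
    "health_field": ["Health*", "Medic*", "Disease", "Patient", "Consumer"],
}


def build_query(concepts):
    """Join keywords with OR inside each concept and concepts with AND (assumed pattern)."""
    groups = []
    for keywords in concepts.values():
        # Quote multi-word keywords so they are searched as phrases.
        terms = ['"' + k + '"' if " " in k else k for k in keywords]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


query = build_query(CONCEPTS)
print(query)
```

Each database has its own field tags and wildcard rules, so a string like this would normally need adapting per database.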

Stage 3: selecting studies

Two reviewers independently screened papers using a two-step approach: first, the titles and abstracts were examined, and then the full texts of the articles were scanned. The inclusion criteria were as follows: tailoring should be done on the content of the information (not the process, services, tools, user interfaces, etc), the process of generating tailored information should be computer-based and the patients or health consumers should be the target recipients of the information.

During the article selection process, two issues arose. The first was that, although our primary aim was to focus only on systems that were specifically developed for the purpose of information tailoring, we found some studies that had applied tailoring as part of a multi-component intervention (eg, patient decision aid). Such cases were also included if sufficient information was provided.

The second issue concerned series papers, the term we use for sets of papers that reported different phases of a particular research project separately. Each series describes a unique tailoring system that was developed once and evaluated several times using different evaluation approaches. To avoid duplication, each series was recorded as one row in the data extraction form, with multiple columns to store its different approaches.

Stage 4: charting the data

A data extraction form was developed using Microsoft Excel 2010, consisting of 20 variables organised in four parts. Data charting involved two important tasks: variable determination and data categorisation.

The variables were determined according to the research questions and the underlying frameworks (the list of variables is provided in the results). During data extraction, some variables were found to have multiple values (eg, delivery channel, which can be printout, SMS, web, etc), which required several columns for storage. Deciding on the optimum number of columns was a challenge, as different papers reported varying numbers of values for a single variable. One option was to allow the maximum number of possible columns for each variable, which would lead to a large table of 360×82 (where 82 is the maximum number of data categories), with a high rate (eg, more than 90%) of non-specified values in some columns. This would complicate the management of data extraction and analysis. To avoid this problem, sparse columns that had values in less than 10% of their cells were excluded. This will be referred to as the rule of 10% hereafter.
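The rule of 10% can be illustrated with a small sketch. The helper name drop_sparse_columns, the row layout (one dictionary per article, with None for a non-specified value) and the example values are hypothetical; only the threshold comes from the text.

```python
def drop_sparse_columns(rows, threshold=0.10):
    """Keep only columns that hold a value in at least `threshold` of the rows.

    `rows` is a list of dicts sharing the same keys; None marks a
    non-specified value. The layout is an assumption for illustration.
    """
    if not rows:
        return []
    n = len(rows)
    keep = [
        column for column in rows[0]
        if sum(row.get(column) is not None for row in rows) / n >= threshold
    ]
    return [{column: row.get(column) for column in keep} for row in rows]


# Hypothetical extraction table: 20 articles, up to three delivery channels each.
rows = [
    {
        "channel_1": "web",
        "channel_2": "email" if i < 5 else None,   # filled in 25% of rows
        "channel_3": "kiosk" if i == 0 else None,  # filled in 5% of rows
    }
    for i in range(20)
]

reduced = drop_sparse_columns(rows)
print(sorted(reduced[0]))
```

In this example the third channel column falls below the threshold and is removed, which mirrors how the number of entries per multi-valued variable was capped.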

The data categorisation was performed by two independent reviewers, simultaneously with data tabulation, with the aim of managing the range of values each variable can take. Categories were iteratively refined by the reviewers through creating new categories and leaving the old ones to be deleted or merged. When the data extraction was completed, the obtained set of categories was handed over to five experts from the fields of medical informatics, health promotion and biomedical statistics, for validity check. Once the final set of categories was confirmed, the reliability was assessed using the Kappa inter-rater agreement.
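As a reminder of what the Kappa statistic measures, the sketch below computes Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The text does not state which Kappa variant was used, so Cohen's two-rater form is an assumption, and the ratings are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who categorised the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical categorisation of the delivery channel for six articles.
reviewer_1 = ["web", "web", "print", "email", "web", "print"]
reviewer_2 = ["web", "web", "print", "email", "print", "print"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```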

Stage 5: collating, summarising and reporting results

Both qualitative and quantitative methods were used to synthesise the extracted data. Theme analysis was used as a qualitative technique to summarise the data into categories. For the quantitative analysis, we performed univariate and multivariate frequency distribution analysis. The frequency distribution provided a summary count of the category occurrences within a particular variable. We represented the obtained results in the forms of line, bar and pie charts. Multivariate frequency analysis was used to aggregate the distribution of two or more variables to find the interrelationships between them. We used the IBM SPSS 24 crosstab function and the Microsoft Excel pivot table tool to build and visualise the cross tabulation.
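The two quantitative steps can be sketched in a few lines. The study itself used SPSS and Excel; the Python below is a stand-in for illustration, and the records, variable names and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical extracted records, one per article.
records = [
    {"domain": "lifestyle promotion", "channel": "web"},
    {"domain": "lifestyle promotion", "channel": "web"},
    {"domain": "disease management", "channel": "web"},
    {"domain": "disease management", "channel": "print"},
    {"domain": "screening", "channel": "email"},
]


def frequency(records, variable):
    """Univariate distribution: category -> (count, percentage of all records)."""
    counts = Counter(r[variable] for r in records)
    return {cat: (n, round(100 * n / len(records), 1)) for cat, n in counts.items()}


def crosstab(records, row_var, col_var):
    """Bivariate distribution: (row category, column category) -> count."""
    return Counter((r[row_var], r[col_var]) for r in records)


channel_freq = frequency(records, "channel")
domain_by_channel = crosstab(records, "domain", "channel")
print(channel_freq)
print(domain_by_channel)
```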

Patient and public involvement

There was no direct involvement of patients or the public in this scoping review of the literature.

Results

The literature search resulted in 1320 citations (the selection process is shown in figure 1). After full-text screening of 410 potentially relevant papers, 360 papers were included (the full citation list is available in online supplementary appendix 1). The three authors with the greatest number of included studies were Hein De Vries with 13% (n=45) of included papers, Johannes Brug with 6% (n=22) and Corneel Vandelanotte with 6% (n=20). The three authors with the most highly cited papers were Seth M Noar, the author of ref 6, with 1462 citations; Matthew Kreuter, the author of ref 1, with 763 citations; and Victor J Strecher, the author of ref 39, with 759 citations.

Figure 1

The flow of information through the different phases of the review.

Data categorisation

A Kappa value of 0.924 was obtained for inter-rater agreement across all variables, indicating that the data categorisation was conducted reliably. The resulting categories are presented below.

General characteristics

The general characteristics of the studies were investigated through seven variables, of which four related to the articles’ metadata and the other three to the studies’ specific characteristics. The journal name, the journal’s impact factor, the article’s publication year and the corresponding author’s discipline were studied as the paper’s metadata. The data obtained for the numeric variables (such as the journal’s impact factor and the paper’s publication year) were categorised into equal intervals. The resulting categories for the author’s discipline were: psychology and behavioural sciences, public health, education and promotion, computer science, health informatics, medicine with all the related specialties and nursing.

For the study’s specific characteristics, the study type, study location and health domain were investigated. We have followed a coarse-grained approach to categorise the data in each of these variables. Overall, three types of studies were identified: protocol studies, development-based studies and evaluation-based studies. The development-based studies mostly focused on technical or theoretical methodology of developing tailoring systems. The evaluation-based studies had experimental or non-experimental design to evaluate the tailoring system in various aspects. The protocol studies delineated the objectives, background, methods and importance of the research study being proposed.

For the study location, continental divisions were preferred for brevity (ie, Africa, Asia, Australia, Europe, North America and South America).

To provide an illustration of the health domains that were studied in the tailoring literature, four categories were considered: lifestyle promotion (eg, diet change, physical activity and weight loss), addiction cessation (eg, smoking and alcohol cessation), screening (eg, all preventive screening tests) and disease management (eg, medication adherence, symptom management, rehabilitation and so forth). These categories were consistent with the classification presented by Kukafka et al.13

Tailoring System development

The development aspects were investigated through six variables: user profile, data collection tool, behavioural model, tailoring algorithm, type and source of information in the content library. These variables have been derived from the components of a framework proposed by Dijkstra and De Vries.35

There were six dimensions identified from the categorisation of the user profile data. Categories with some examples are provided below:

  • SocioDemographic (SD; eg, name, gender, age, level of education, etc).

  • Health/medical history (eg, comorbidities).

  • Health/medical state (eg, disease severity, current medication, lab tests, etc).

  • Psycho-behavioural determinants (PBDs; eg, attitude, self-efficacy, readiness to change, etc).

  • Knowledge level.

  • History of interactions (eg, click counting, visited links, time spent, etc).

The tailoring systems commonly used a combination of these categories to achieve a multi-dimensional view of the user. Accordingly, based on the rule of 10% that was explained in the method, up to four entries were considered in the data extraction form to record the data for this variable.

From another perspective, the profile data were classified into two categories based on their variability over time: static and dynamic. SD and the permanent conditions in health/medical history are two examples of static user information. Dynamic user information (ie, health/medical state, PBDs and knowledge level) requires several measurements to monitor the users’ long-term condition.

There are various behaviour change models that guide applying PBDs for tailoring and help the production of more effective messages.6 These models were categorised into two general classes based on whether they were continuum or stage-based.

Another component of the tailoring system is the data collection tool or source used to collect user profile data and provide the information required as the basis of tailoring. The following categories of information gathering tools/sources were identified: questionnaire-based, diary-based, device-based and record-based. As this is a multi-valued variable, up to two values were permitted for it in the data extraction form, based on the rule of 10%.

Depending on the extent to which they engage the user in the process of information gathering, data collection tools/sources can be categorised into two classes: direct and indirect. The former require the users to supply the information explicitly, whereas the latter gather user information implicitly, without any effort from the user. The device-based tools (ie, sensors) and record-based tools (ie, medical records) are examples of the indirect data gathering approach. On the other hand, the questionnaire-based and diary-based methods collect information in a direct way.

The collected information in the user profile has to be interpreted by a tailoring engine to identify the relevant content. Two approaches to developing a tailoring engine were identified: information retrieval (IR) and natural language generation (NLG). IR systems search a large content library to extract the relevant pieces, whereas NLG systems rely on a knowledge base of rules that enables the system to infer the topics relevant to the user and to generate a tailored output.
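The contrast between the two engine types can be made concrete with a toy sketch. It is deliberately simplified and is not drawn from any of the reviewed systems: the library entries, tags, rules and function names are hypothetical, the IR side is reduced to ranking by tag overlap, and the NLG side to rule-based topic selection followed by template-based sentence assembly.

```python
# Hypothetical content library for the IR approach: pre-written messages with tags.
LIBRARY = [
    {"text": "Tips for raising activity levels.",
     "tags": {"physical_activity"}},
    {"text": "Quitting smoking when confidence is low: start with small goals.",
     "tags": {"smoking", "low_self_efficacy"}},
    {"text": "General facts about smoking and health.",
     "tags": {"smoking"}},
]


def retrieve(profile_tags, library, top_k=1):
    """IR-style tailoring: rank stored messages by tag overlap with the profile."""
    scored = [(len(item["tags"] & profile_tags), item["text"]) for item in library]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked[:top_k] if score > 0]


# Hypothetical rule base for the NLG approach: condition on the profile -> topic.
RULES = [
    (lambda p: p.get("smoker") is True, "choosing a quit date"),
    (lambda p: p.get("self_efficacy", 5) < 3, "building confidence with small goals"),
]


def generate(profile):
    """NLG-style tailoring (toy version): infer topics from rules, then compose text."""
    topics = [topic for condition, topic in RULES if condition(profile)]
    if not topics:
        return ""
    return (profile["name"] + ", based on your answers we suggest focusing on "
            + " and ".join(topics) + ".")


print(retrieve({"smoking", "low_self_efficacy"}, LIBRARY, top_k=2))
print(generate({"name": "Sam", "smoker": True, "self_efficacy": 2}))
```

The IR function can only return text that already exists in the library, while the generator composes a sentence that was never stored, which is the essential difference the two categories capture. Real NLG systems involve content planning and surface realisation well beyond a single template.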

The message library, another component of a tailoring system, was studied through two variables: the information source and the information type. The identified sources of information used to supply the content of the library fell into four categories: traditional educational resources (eg, booklets, leaflets, pamphlets, etc), online educational resources (eg, the web), expert advice and guidelines. Expert advice refers to motivational messages written by domain experts, clinicians, health educators, psychologists and so on. Clinical guidelines provide individualised information about risks, symptoms and treatments based on patient-specific data.

The types of information in the library fall into four classes: factual knowledge, feedback (eg, normative, ipsative and summative feedbacks), advice (eg, guidance or recommendations offered with regard to prudent action) and scheduled plan (eg, detailed information about how an activity should be done such as medication adherence, physical activity, diet).

Information Delivery

Based on Berlo’s Sender-Message-Channel-Receiver model of communication,21 three variables (message format, delivery channel and message frequency) were considered to address the issues related to the delivery of tailored information. Text, graph, audio and multimedia were the formats found to represent the tailored content. The delivery channels identified for communicating the tailored information to the audiences include email, SMS, mobile app, on-screen, compact disc, web, kiosk and the traditional print formats. For articles that used multiple channels, up to two values could be recorded according to the rule of 10%. The delivery frequency refers to the number of sessions of tailored information provided to the user, recorded as a two-valued variable: once or multiple.

Evaluation

The design of a tailoring system includes several intermediate and formative evaluation phases, but in this study, we have focused only on summative evaluations. This provides the opportunity to investigate the effectiveness of the tailoring approaches discussed earlier. The evaluation of tailoring systems has been studied through four variables: evaluation indicator, data collection method, evaluation type and evaluation result. These variables were derived from the Human, Organisation and Technology (HOT)-fit evaluation framework of health information systems.40

The indicators used for the evaluation of tailoring systems were grouped into the following seven categories: user experience (UX), usability metrics, health/medical status, behavioural outcome, PBD, knowledge level and cost. UX refers to the user’s attitudes and perceptions regarding the use of a tailoring system or its product, in terms of ease of use, usefulness and efficiency. According to the Centers for Disease Control and Prevention (CDC), these categories can in turn be summarised into three general classes: process evaluation (including UX and usability), outcome evaluation and economic evaluation.37

The methods of collecting data for evaluation have been categorised into four groups: questionnaire, diary, sensor and patient record (eg, electronic health record (EHR)). There were two general categories considered for the type of evaluation: descriptive and comparative. The evaluation results were categorised into three classes of effective, ineffective and partially effective (ie, significant for some outcomes and no difference in the others).

Figure 2 presents a classification map that provides a visual overview of aspects along with the approaches that were identified in this study. We have used XMind V.3.2.1 to develop this map.41

Figure 2

Classification map; an overview of the identified categories in each aspect. CD, compact disc; NLG, natural language generation; UX, user experience. 

Distribution Analysis

To complement the categories and approaches identified in the previous section, the results of the quantitative analysis are presented below. The report is organised according to the quadripartite structure used throughout the study.

General characteristics

The publication year of the articles revealed a growing interest in the field of computer-based tailoring from 2012 onwards, with 54% of articles published in this period. Nearly half of the papers (45%) have been published in journals with the impact factor ranging from one to three, with the higher impact factors belonging to the articles that were published in more recent years.

A list of 40 journals that have published one or more of the 360 included articles was composed; the scope of the journals was diverse, from those focused on a specific domain of health or medicine to those focused on methodological innovation. The largest share of articles (16%) was published in the ‘Journal of Medical Internet Research’, with an impact factor of 4.67 for 2017, followed by ‘BMC Public Health’ (10%), with an impact factor of 2.42.

The distribution of the corresponding authors' disciplines revealed that 64% of studies were conducted by researchers from the fields of health and medicine (ie, health promotion, public health, behavioural sciences, medicine, nursing and others), while only 17% were conducted by computer scientists. The geographical analysis showed that most of the research was conducted in Europe (51%) and North America (34%) (the Netherlands leads with 41%).

Among the various health areas that were studied in tailoring literature, lifestyle promotion (39%) and disease management (30%) were the most popular. The bubble graph shown in figure 3 represents the article count for each health domain, grouped by publication year. An increasing trend is recognisable in all domains.

Figure 3

The frequency of articles in each of the health domains categorised by year of publication.

The studies on lifestyle promotion published between 2012 and 2014 formed the largest group. The smallest group consisted of a single study in the area of addiction cessation, published during the early years of computer tailoring.

Tailoring system development

The user-specific features most often used in the development of tailoring systems belonged to the categories of SD (74%), target behaviour status (TB; 60%) and PBDs (56%), respectively. Each of these categories specifies one dimension of the user characteristics. The articles were also investigated in terms of the variety of dimensions used. The results revealed that all the included articles had used at least one of the above-mentioned categories; a considerable number of articles (93%) applied two categories of user characteristics, 52% considered three and 8% used four categories to achieve a better understanding of users.

To understand which user-specific features most often occur together, a co-occurrence matrix was constructed for the different dimensions of user characteristics (figure 4). The value in each cell shows the count and percentage of the papers that used both the characteristic in the corresponding row and the one in the corresponding column. The value of 161 (44%) at the intersection of SD and TB indicates that 161 articles used both SD and TB. In 40% of cases, SD was accompanied by PBDs, which in turn co-occurred with TB in 39% of cases. The combination of SD, TB and PBDs therefore appears to be the most common in user profiling across the literature.

Figure 4

Co-occurrence matrix between categories of user profile. The abbreviations used in the table include: UI, user interactions; UP, user perception; H/M, health/medical; psych, psycho-behavioural determinant; SD, sociodemographic; TB, target behaviour.
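The construction of a co-occurrence matrix such as the one in figure 4 can be sketched as follows. The article sets are invented, and expressing each count as a percentage of all articles is an assumption about how the reported percentages were derived.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(articles):
    """Count, for every pair of categories, the articles that used both."""
    counts = Counter()
    for categories in articles:
        # Sort so that each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(categories)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical user-profile categories recorded for four articles.
articles = [
    {"SD", "TB", "PBD"},
    {"SD", "TB"},
    {"SD"},
    {"TB", "PBD"},
]

pairs = co_occurrence(articles)
for (a, b), n in sorted(pairs.items()):
    # Percentage relative to all articles (assumed denominator).
    print(f"{a} x {b}: {n} ({100 * n / len(articles):.0f}%)")
```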

In 72% of the articles, the data collection approach was explicitly reported. Among them, the questionnaire-based methods were the most frequent, accounting for 86% (n=307) of the articles. The remaining 14% were equally divided between the three other data collection methods including record-based (5%), device-based (4%) and diary-based (5%).

Apart from the questionnaire, which is by far the leading method of gathering data for all types of user characteristics, the bivariate association showed that the record-based data collection approach was mostly used to acquire the health/medical state, health/medical history and demographic data, with 14%, 8% and 2%, respectively. Devices such as sensors were mostly used to collect data for TB (9%). Details of the analysis are provided in the online supplementary appendix 2, part A.


A variety of cognitive-behavioural theories have been applied in health education and counselling. Regarding the number and type of behaviour models applied in the tailoring literature, we found that about half of the reviewed articles (52%) used at least one behavioural change model, either continuous-based (55%) or stage-based (45%), in roughly equal proportion. Social Cognitive Theory (45%), Health Behavior Model (22%) and Theory of Planned Behavior (19%) were the most common continuous-based behaviour models, and the Transtheoretical Model (51%) was the most prevalent stage-based model.

Although the tailoring algorithm is the most important component in developing a tailoring system, the literature was disappointingly scant on this matter. According to our findings, only 13% of articles reported the tailoring method they used: 12% followed the information retrieval (IR) approach and only 1% was based on natural language generation (NLG) techniques, owing to the complexity of the latter.

The analysis of the information library revealed that 78% of papers did not report the information source. Among those that did, 13% used guidelines and 7% used expert knowledge. The tailored information type was fairly evenly distributed among the four identified categories: advice (34%), feedback (24%), fact (24%) and scheduled plan (18%). While feedback describes the user’s current state, advice provides relevant recommendations on how to improve the condition, so the two complement each other. The co-occurrence analysis revealed that feedback and advice were used together in 18% of studies (see online supplementary appendix 2, part B, for the co-occurrence matrix).

Information delivery

The systematic mapping of the literature, based on the delivery channel, indicated that more than half of the articles (57%) used the web to deliver the tailored information. The second most widely used channel was the traditional print format, which is known as the ‘first generation’ of tailoring systems, accounting for 20% of articles. More detailed subcategories of print formats are manuals (booklets), pamphlets (leaflets), newsletters (magazines) and calendars. A considerable number of studies (92%) used only one channel to deliver information. Figure 5 represents the channels’ frequencies in the form of a word cloud created with the online program Wordle.42 The size of each term indicates its frequency across the literature.

Figure 5

The frequency of delivery channels represented as the word cloud.

To identify which channels were most often compared with each other, a co-occurrence matrix was created (see online supplementary appendix 2, part C). The results revealed that the most frequent comparisons were between web and print (n=6) and between web and email (n=5). The co-occurrence matrices also help to identify research gaps, which are shown by cells with a zero value where no primary research exists.

The article pool was divided into two equal groups based on whether or not the articles reported the representation format of the tailored information. Among the half that reported it, 71% (n=43) used a text-only format and 15% provided the tailored information in a graph-only format.

The number of information provision sessions was reported in only 20% of the articles, of which 28% provided information in a single session and 72% through multiple sessions.

Evaluation

The evaluation was conducted in 79% of all articles (the remaining 21% included development articles (7%) and protocol studies (14%)). The percentages presented in the following paragraphs belong to the subset that conducted evaluation.

The distribution of articles according to the classes of indicators was as follows: process evaluation (44%), outcome evaluation (53%) and cost evaluation (3%). In 27% of the articles, both process and outcome evaluations were conducted, with the aim of using the early feedback from process evaluation as a guide to better design the outcome evaluation. In the following paragraphs, each of these approaches is explored in depth to identify the most common indicators.

Of the 44% of articles that used process evaluation, 27% focused on assessing the user’s attitude and 18% on the assessment of usability issues. Of the 52% of articles that evaluated the outcomes of tailoring information, 18% focused on behavioural outcomes, 12% studied health/medical outcomes, 15% assessed PBDs and 7% measured the user’s knowledge level.

To better understand which evaluation indicators most often come together, a co-occurrence matrix was constructed (figure 6). The value in each cell shows the count and percentage of papers that used both of the indicators specified in the corresponding row and column. The co-occurrence matrix revealed that user attitude and usability were frequently evaluated together (n=43, 12%). In other words, an article that evaluated the user’s attitude was likely to evaluate usability issues as well. Most articles that focused on TB outcomes also assessed the psychosocial determinants (n=49, 14%) (figure 6).

Figure 6

Co-occurrence matrix between categories of evaluation indicators. H/M, health/medical; psych, psycho-behavioural determinant.

The data collection tools were reported in 52% of articles, of which 42% used questionnaires and 10% collected the data through interviews. Sensors and log files each accounted for 2% of the articles. Approximately half of the articles (53%) conducted comparative evaluation, whereas 26% carried out descriptive evaluation. The results of the comparative evaluation showed that tailoring was effective in 61% of articles, ineffective in 21% and partially effective in 18%.

Discussion

In this study, we adopted the Arksey and O’Malley five-stage methodological framework as a guide to conduct and report the first scoping review in the field of health information computer-tailoring.

In this section, the implications of the results are discussed and some directions for future study are offered. The rest of this section is organised into five parts: general characteristics, system development, information delivery, evaluation, and strength and limitations.

General characteristics

The analysis of the corresponding authors' disciplines revealed two main routes in tailoring literature: public health research and computer science research. Features of these two approaches are discussed below.

Public health researchers relied largely on health behaviour models and generally used simpler technological approaches, whereas researchers from the computer science field employed more advanced technological approaches and integrated behaviour theory to a lesser extent. These two approaches have the potential to complement each other to produce an overarching, advanced method of tailoring for future studies. Basing the analysis on the discipline of the corresponding author, as the most important contributor to the research team, may not reflect the composition of the entire team, and this is acknowledged as a limitation of this study.

We discovered growing attention in the tailoring literature to the fields of lifestyle promotion and addiction cessation (60%). This can be interpreted as a promising thread in helping society reduce the burden of high mortality due to unhealthy behaviours, including tobacco use, unhealthy diet and lack of physical activity, as warned by the CDC.43 On the other hand, although psychological barriers are important impediments to preventive healthcare behaviours such as screening,44 only a few articles (9%) used the persuasive power of tailoring to encourage people to undertake screening tests, an area that should be explored further in future research.

System development

The majority of the tailoring literature collected the user-specific information using direct methods such as a questionnaire or checklist (93%). These methods are fairly limited for two reasons: (1) users generally dislike having to spend extra time and effort supplying information to a system, and (2) a user may not always be able to provide accurate answers.45 Indirect data sources such as sensors, EHRs and log files, in which the user is not directly engaged in the process of collecting information, account for a much lower proportion of tailoring articles (9%).

Harvesting information from the interactions of users with an external resource (ie, search engines and social applications) can be considered a new thread for future studies. This approach can be especially useful for stigmatised health problems like HIV, where confidentiality is of particular importance. The study conducted by Cunningham et al collected anonymous data from the internet to generate tailored information for people with problematic alcohol consumption.46

A study conducted by Vosbergen and colleagues on patients with coronary heart disease used a different technique for user profiling called stereotyping.47 A stereotype or persona allows a user profile to be built without asking the user too many questions, which is appealing for future studies.

The results showed that most of the tailoring research (93%) used multiple dimensions of user characteristics to achieve the necessary knowledge about the user. This increases relevancy but also increases costs. According to Kreuter,1 there is no agreed amount of user information for a tailoring system; rather, it depends on the purpose of the application. More likely, there is a point of diminishing returns beyond which additional assessment data cannot be collected and used productively in tailoring. The goal is, thus, to find a balance between the message specificity, the time spent answering assessment questions and the related costs. Tailoring interventions would be most cost-effective when the fewest determinants were found to predict the greatest amount of change in the outcome of interest.2 Although using multiple methods to collect user-specific data is more beneficial,1 most articles (64%) used only one method.

As the results revealed, a significant percentage of articles did not address the basics of the methods used in the tailoring algorithm. We assume that, as most tailoring systems were based on simple decision rules (ie, if-then rules), researchers may have overlooked reporting them. One disadvantage of using simple if-then rules is that the developer has to anticipate every possible variation. The resulting texts may also lack coherence and require post-editing. There is, therefore, a need for better techniques.
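To make the limitation of simple decision rules concrete, the following minimal sketch shows an if-then tailoring engine. The profile fields, rule conditions and message texts are hypothetical placeholders and do not describe any of the reviewed systems.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    # Hypothetical fields chosen only for illustration.
    age: int
    smoker: bool
    stage: str  # stage of change, eg "precontemplation" or "preparation"


# Hypothetical message library keyed by message id (placeholder texts).
MESSAGES = {
    "quit_intro": "Placeholder message for smokers not yet considering quitting.",
    "quit_plan": "Placeholder message for smokers preparing to quit.",
    "stay_smoke_free": "Placeholder message for non-smokers.",
}

# Each rule is a (condition, message id) pair; the first matching rule wins.
RULES = [
    (lambda u: u.smoker and u.stage == "precontemplation", "quit_intro"),
    (lambda u: u.smoker and u.stage == "preparation", "quit_plan"),
    (lambda u: not u.smoker, "stay_smoke_free"),
]


def tailor(user):
    """Return the first message whose rule matches, or None if no rule applies."""
    for condition, message_id in RULES:
        if condition(user):
            return MESSAGES[message_id]
    return None  # an unanticipated profile falls through every rule


print(tailor(UserProfile(age=40, smoker=True, stage="preparation")))
print(tailor(UserProfile(age=40, smoker=True, stage="action")))  # prints None
```

The second call returns nothing because no rule was written for a smoker in the "action" stage, which is exactly the kind of variation the developer would have had to anticipate in advance.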

From a technical point of view, a tailoring engine can be considered a knowledge-based decision-making system. This suggests a variety of techniques from the field of computer science regarding knowledge acquisition, representation and reasoning. Since the area of computer tailoring can be regarded as a fusion of methods from both public health and computer science, future studies are advised to integrate techniques from these disciplines.

In contrast to the simple tailoring methods that produce coarse-grained tailored materials, technologies such as artificial intelligence enable fine-grained and flexible tailoring in both content and presentation. In fine-grained tailoring, the information is selected at the level of individual sentences, phrases and even words, whereas in coarse-grained tailoring whole paragraphs of information are selected.48
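The difference in granularity can be sketched as follows. This is only an illustration of the distinction, using plain template filling rather than an artificial intelligence technique; the user data, paragraph texts and function names are hypothetical.

```python
# Hypothetical user data for illustration.
user = {"name": "Sam", "steps_per_day": 4200, "goal": 8000}

# Coarse-grained content: whole pre-written paragraphs (placeholder texts).
PARAGRAPHS = {
    "below_goal": "Placeholder paragraph shown to every user below the activity goal.",
    "at_goal": "Placeholder paragraph shown to every user at or above the goal.",
}


def coarse_grained(u):
    """Select one whole paragraph per user group."""
    key = "below_goal" if u["steps_per_day"] < u["goal"] else "at_goal"
    return PARAGRAPHS[key]


def fine_grained(u):
    """Assemble a sentence from phrases and values specific to this user."""
    gap = u["goal"] - u["steps_per_day"]
    status = f"{gap} steps short of" if gap > 0 else "at or above"
    return f"{u['name']}, you are currently {status} your goal of {u['goal']} steps a day."


print(coarse_grained(user))
print(fine_grained(user))
```

Every user below the goal receives the same paragraph from `coarse_grained`, whereas `fine_grained` varies individual phrases and numbers within the sentence.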

In a majority of studies, tailoring was performed over a predetermined library of health information. In addition to the burden of collecting such a large amount of content, the rapid growth of healthcare knowledge also poses a challenge in keeping this library up to date. Recent research has opened up new possibilities for using the wide range of health information resources on the internet. Doupi and van der Lei conducted a study named Structured Evaluated Personalized Patient Support, in which the patient information in the EHRs was used as a basis for finding the most relevant health information on websites.49

An important aspect in the design of a content library is the structure in which information is organised. This is referred to as information representation in the computer science literature. Very few tailoring studies have described the information representation method, not only of the message library but also of the user profile and the tailoring knowledge base.50 Several data representation models have been introduced in computer science, such as vector-based or semantic network-based models, that have the potential to be employed in future studies of health information tailoring.51
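As a hedged illustration of what a vector-based representation could look like, the sketch below describes both the user profile and each library message as a vector over the same topic dimensions and ranks messages by cosine similarity. The dimensions, weights and message ids are hypothetical and are not drawn from any reviewed study.

```python
import math

# Hypothetical topic dimensions shared by the user profile and the messages.
FEATURES = ["smoking", "diet", "physical_activity"]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical weights, one value per entry in FEATURES.
user_vector = [0.0, 0.2, 0.9]
message_vectors = {
    "msg_smoking": [1.0, 0.0, 0.0],
    "msg_diet": [0.0, 1.0, 0.1],
    "msg_activity": [0.0, 0.1, 1.0],
}


def rank_messages(user, messages):
    """Return message ids ordered from most to least similar to the user vector."""
    return sorted(messages, key=lambda m: cosine(user, messages[m]), reverse=True)


print(rank_messages(user_vector, message_vectors))
```

With these example weights, the physical activity message ranks first because its vector points in nearly the same direction as the user profile.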

The large amount of missing information for some critical variables, such as the tailoring algorithm, sources of library information and data collection tool, with 87%, 78% and 48% missing, respectively, reveals a serious weakness in the reporting of the tailoring literature. Several prior studies criticised this insufficiency of information.6 8 14 52 Standardising the reporting not only improves the understandability of the content, but also allows one to better judge the validity and reliability of the study. Future tailoring studies could report the details of how tailoring is enacted more fully by using reporting guidelines14 53 54 in order to open up the ‘black box’ of tailoring. We call on researchers from the various related disciplines to establish a standard reporting guideline for the development, implementation and evaluation of original research in the domain of computer-based tailoring. We also call on journals to adopt these standards afterwards. Any attempt in this regard should take into account the findings from the previously published reporting guidelines.

Information delivery

The interactive, multimodal characteristics of the web, such as hyperlinks, images, videos and audio, in addition to its enormous power for disseminating health information, have made this platform an ideal choice (53%). Despite the popularity and growing use of newer digital channels (ie, social media) among health professionals,55 no tailoring study has addressed them yet. Future studies could investigate the efficacy of such channels.

We discovered that the dominant format for presenting tailored information was text (71%), which is not surprising as text is the most common form of information presentation worldwide. In this regard, Kreuter et al argued that a good visual design can be as important to the success of a tailored communication as the message content itself.1 This highlights the importance of adding more graphical elements to text in future studies.

The meta-analysis conducted by Noar et al suggested that studies that provided more contacts with participants were more effective in stimulating health behaviour changes.6 Consistent with this, we found that 72% of articles delivered tailored information through multiple sessions, thus indicating that the literature is moving in the right direction.

Evaluation

The results showed that the tailoring systems varied from simple practical systems being evaluated in realistic context to more experimental systems using a formal laboratory test.

Although a strong tendency towards comparative outcome evaluation was noted (53%), a significant number of articles focused on process evaluation (44%), which was neglected by the preceding review studies. The results showed that among the articles with comparative evaluation, 60% reported tailoring to be effective. Considering the effort that goes into creating tailored messages, the effects must be substantial enough to warrant investment in tailoring technology and the individualisation of messages. However, reporting study effectiveness without considering the details of the intervention context (eg, control/comparison groups, what they receive and other components) is a concern, and it is acknowledged as a limitation of this study.

Considering the popularity of the IR approach in tailoring studies, it is perhaps worthwhile for future researchers to evaluate system performance using IR-based indices such as precision and recall.
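For readers less familiar with these indices, the sketch below computes precision and recall for a single tailoring request, treating the messages selected by the system as the retrieved set and the messages judged appropriate for that user (eg, by an expert) as the relevant set. The message ids and the function name are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one set of retrieved items.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the system selected four messages for a user,
# while an expert judged three messages in the library to be appropriate.
retrieved = ["m1", "m2", "m3", "m4"]
relevant = ["m2", "m4", "m7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this example, two of the four selected messages are appropriate (precision 0.50) and two of the three appropriate messages were selected (recall 0.67).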

Strength and limitations

We adhered to the methodological framework that was developed and published prior to this study34 and went further by extracting nine variables in addition to those we had proposed in the protocol. Nevertheless, like other scoping reviews, this study is subject to inherent limitations, such as focusing on the breadth of information rather than its depth.

Restricting the number of columns to store multiple valued variables was an unavoidable limitation of this study. The subjectivity of the data categorisation process is also a limitation; however, we tried to increase the reliability by performing iterative pilot tests and expert panels throughout the process. Despite a significant amount of literature being systematically examined in this study, there is no claim that the obtained categories are definite or comprehensive.

The primary strength of this study is its comprehensiveness in terms of search resources, article selection, data extraction and data synthesis. This study addressed the tailoring literature from a new perspective, focusing on the methodological aspects instead of the evaluation results. The classification of tailoring concepts and techniques in this study can be considered as a foundation to develop a taxonomy for computer-tailoring health information.

We aggregated tailoring studies regardless of their research type to organise them into categories and analyse their distribution across the literature. We did not limit the scope of this study to a specific level of tailoring and covered the entire range, from personalised and targeted to tailored studies. Although this broad coverage provides the opportunity to study various approaches under a common frame, it limits the specificity of the results. This was partly inevitable, as tailoring terms have often been used interchangeably in the research literature.

We believe that this study contributes both to the field of information tailoring and to scoping review methodology, and has the potential to stimulate theoretically and practically useful research on tailoring.


Footnotes

  • Patient consent for publication Not required.

  • Contributors MT and MG-A conceptualised the review approach and provided general guidance to the research team. AKG and EN reviewed the search results independently and carried out the data extraction and analysis. MG-A contributed to the analysis and interpretation of the findings. AKG initiated the first draft of the manuscript which was then followed by numerous iterations with substantial input and appraisal from all of the authors. MT supervised the entire process, performed the final touch and approved the final version of the manuscript.

  • Funding This work was supported by Mashhad University of Medical Sciences Research Council grant number [# 950392].

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Readers interested in using our database on computer-based tailoring for specific purposes related to their respective research are invited to do so by contacting the first author through the corresponding email address.
