Article Text

This article has a correction. Please see:

Download PDFPDF

mHealth app using machine learning to increase physical activity in diabetes and depression: clinical trial protocol for the DIAMANTE Study
  1. Adrian Aguilera1,2,
  2. Caroline A Figueroa1,
  3. Rosa Hernandez-Ramos1,
  4. Urmimala Sarkar2,
  5. Anupama Cemballi2,
  6. Laura Gomez-Pathak1,
  7. Jose Miramontes2,
  8. Elad Yom-Tov3,
  9. Bibhas Chakraborty4,5,6,
  10. Xiaoxi Yan4,
  11. Jing Xu4,
  12. Arghavan Modiri7,
  13. Jai Aggarwal7,
  14. Joseph Jay Williams7,
  15. Courtney R Lyles2
  1. 1School of Social Welfare, University of California Berkeley, Berkeley, California, USA
  2. 2UCSF Center for Vulnerable Populations in the Division of General Internal Medicine San Francisco, Zuckerberg San Francisco General Hospital, San Francisco, California, USA
  3. 3Microsoft Research, Herzeliya, Israel
  4. 4Centre for Quantitative Medicine, Duke-National University of Singapore Medical School, Singapore
  5. 5Department of Statistics and Applied Probability, National University of Singapore, Singapore
  6. 6Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
  7. 7Computer Science, University of Toronto, Toronto, Ontario, Canada
  1. Correspondence to Dr Caroline A Figueroa; c.a.figueroa{at}


Introduction Depression and diabetes are highly disabling diseases with a high prevalence and high rate of comorbidity, particularly in low-income ethnic minority patients. Though comorbidity increases the risk of adverse outcomes and mortality, most clinical interventions target these diseases separately. Increasing physical activity might be effective to simultaneously lower depressive symptoms and improve glycaemic control. Self-management apps are a cost-effective, scalable and easy access treatment to increase physical activity. However, cutting-edge technological applications often do not reach vulnerable populations and are not tailored to an individual’s behaviour and characteristics. Tailoring of interventions using machine learning methods likely increases the effectiveness of the intervention.

Methods and analysis In a three-arm randomised controlled trial, we will examine the effect of a text-messaging smartphone application to encourage physical activity in low-income ethnic minority patients with comorbid diabetes and depression. The adaptive intervention group receives messages chosen from different messaging banks by a reinforcement learning algorithm. The uniform random intervention group receives the same messages, but chosen from the messaging banks with equal probabilities. The control group receives a weekly mood message. We aim to recruit 276 adults from primary care clinics aged 18–75 years who have been diagnosed with current diabetes and show elevated depressive symptoms (Patient Health Questionnaire depression scale-8 (PHQ-8) >5). We will compare passively collected daily step counts, self-report PHQ-8 and most recent haemoglobin A1c from medical records at baseline and at intervention completion at 6-month follow-up.

Ethics and dissemination The Institutional Review Board at the University of California San Francisco approved this study (IRB: 17-22608). We plan to submit manuscripts describing our user-designed methods and testing of the adaptive learning algorithm and will submit the results of the trial for publication in peer-reviewed journals and presentations at (inter)-national scientific meetings.

Trial registration number NCT03490253; pre-results.

  • telemedicine
  • health informatics
  • depression & mood disorders
  • diabetes & endocrinology

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Statistics from

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

Strengths and limitations of this study

  • Novel approach of targeting diabetes and depressive symptoms using a smartphone application.

  • Ability to compare adaptive messaging for increasing physical activity to messages delivered with equal probabilities.

  • Testing of a smartphone application integrated within primary care settings in a low-income vulnerable patient population.

  • Longitudinal design with 6-month follow-up enables assessing intervention effects over time.

  • Challenges of this trial include supporting users in key behaviour change in an automated manner with minimal in-person support.



Both depression and diabetes are highly prevalent, often co-occurring diseases that are among the major causes of global disability.1–3 Comorbid diabetes and depression are associated with a worse prognosis of both diseases, including a higher rate of complications of diabetes, greater disability and an increased risk of mortality.4 Vulnerable populations, including low-income, low health literacy and ethnic minority individuals experience higher prevalences and worse outcomes for both diabetes and depression.5 6 There is a great need for the development of treatments that can target overlapping risk factors for diabetes and depression. A growing body of evidence suggests that physical activity (PA) is such a risk factor: it is linked to both mental health and diabetic outcomes7–11

Mobile applications have been found effective in helping patients engage in healthy behaviours including PA. For instance, a recent meta-analysis of nine RCTs concluded that smartphone apps that focus on PA have a moderate positive effect on increasing PA levels, and another meta-analysis including 18 studies moderate-to-large effect in daily step changes.12 13 These effect sizes are similar to ‘face-to-face’ interventions.14 However, because around 70% of lower income Americans (including Latinos) currently own smartphones,15 and the ownership of smartphones is expected to increase in low-income populations globally,16 mobile apps have great potential to reach individuals that normally do not have access to care. Further, mobile technology can help overcome existing barriers in access to care for vulnerable populations, including lower availability of psychological treatment in primary care settings, language and literacy barriers, stigma, cost and inflexible employment schedules.17 Latinos in the USA in particular show higher underutilisation rates of mental health treatment than non-Latino whites.18 Deploying effective mobile applications can therefore decrease existing decrease disparities in health.

However, in the USA, low-income minority patients frequently receive their care in safety net settings (services in the public sector for those unable to attain private health insurance), where novel mobile technologies are not often designed, developed or implemented.19 This translational gap increases the probability that these interventions will ultimately fail when implemented in actual clinical settings that serve vulnerable populations.

Further, most mobile applications that target behavioural changes are not personalised,20 which could contribute to lower effect sizes of these interventions for trials with longer study durations (eg, over 3 months).12 Smartphone interventions allow for data collection by passive sensing technologies, which offer an opportunity for tailoring and personalising interventions to users’ behaviours, preferences and needs. Personalisation can be achieved by computer tailoring: tailoring interventions to observed behaviour and characteristics of the participant. This can include feedback, goal setting or user targeting (ie, conveying that communication is designed specifically for the user).21 However, more complex forms of personalisation might be needed to increase and maintain engagement with PA interventions,22 a requirement for a digital intervention to be effective.

One promising approach is to use adaptive learning, which allows the prediction of which content might be effective for users, learning from its previous actions and participant data collected by mobile phones.23 For instance, an adaptive learning algorithm might be more effective as it can match motivation needs for each participant. Some intervention studies have attempted cultural tailoring to large groups of people. However, most tailoring approaches occur at the group level whereas we are bringing the tailoring down to the individual level.24 Research on the efficacy of these adaptive treatments is still in its early stages.

Aim and hypotheses

The main aim of the ‘Diabetes and Mental Health Adaptive Notification Tracking and Evaluation’ trial (DIAMANTE) is to test a smartphone intervention that generates adaptive messaging, learning from daily patient data to personalise the timing and type of text messages. We will compare the adaptive content to (1) a uniform random messaging intervention, in which the messaging content and timing will be delivered with equal probabilities (ie, not adapted by a learning algorithm), (2) a control condition that only delivers a weekly mood message. The primary outcomes for this aim will be improvements in PA at 6-month follow-up defined by daily step counts.

We will investigate the following specific hypotheses:

Primary hypothesis

We expect that the group receiving adaptive messaging will have a statistically significant higher increase in step counts over the 6-month intervention period, compared with both the group receiving the uniform random messaging and the control group receiving only a weekly mood message. We expect that both the adaptive and the uniform random messaging groups will have a higher increase in step counts than the control group.

Secondary hypotheses

  • We expect that the group receiving adaptive, personalised messages will have significant reduction in haemoglobin A1c (HbA1c) levels, depression scores, weekly mood ratings, higher behavioural activation, lower rumination and anxiety, more positive attitudes about PA and higher self-efficacy to manage chronic diseases at 6 months, compared with the group that receives the uniform random messaging intervention and the control group.

  • Exploratory: because of the unique nature of the patient population enrolled in our study, we expect to find differences by language (English vs Spanish). Language has been shown to be especially influential in mobile technology uptake and effectiveness in our previous work.25 Spanish speakers may engage more in the mobile intervention and might have different motivations for becoming physically active, which might result in differences in effectiveness of the various messages.

Methods and analysis


This study is a randomised, controlled, single-centre superiority trial with three groups and a primary endpoint of increase in daily number of steps during a 6-month intervention delivered by a smartphone app. Randomisation will be performed as block randomisation with a 1:1:1 allocation. Patients will be automatically randomised into groups through our secure server during onboarding of the app, thereby ensuring allocation concealment. Patients will be informed of the nature and frequency of the messages they will be receiving and will be allowed to discuss this with investigators during the course of the study.

Further, if messages are not being sent out appropriately, research assistants will contact the app developer to address errors within 24 hours. If PA data are not coming in, they may need to contact participants to ensure that the app is actively running on their phone, and assist participants with redownloading the app if necessary. The necessity of these steps makes it unfeasible to blind the researchers. We used the Standard Protocol Items: Recommendations for Interventional Trials checklist when writing our report.26


Patients will be recruited within various clinics in the San Francisco Health Network, including the Richard Fine People’s Clinic, the Family Health Center and from several endocrinologist providers at the Zuckerberg San Francisco General hospital (ZSFG); as well as the Department of Public Health in San Francisco, California in the USA that will serve as the study sites for this trial. These clinics serve a diverse, chronically ill population and, as in many public hospital settings, they employ multiple part-time providers, creating the risk for lack of continuity of care that mobile health programmes may be able to supplement. Given that patients at the San Francisco Health Network were not routinely going to clinics for in-person visits at the height of the pandemic in 2020 and early 2021, we had to alter recruitment strategies (Ramos et al,27 2021). In order to meet our recruitment targets within the funding timeline, we added a new online recruitment sample to the trial. Specifically, we placed targeted ads on Craigslist, Facebook, and Google to market the study to English- and Spanish-speaking individuals with diabetes living throughout the U.S. These social media ads targeted major metropolitan areas of the US with high proportions of patients with diabetes and/or larger proportions of Spanish-speaking residents, to have some similarities with the San Francisco Health Network recruitment sample.

Inclusion criteria

Patients aged 18–75 years who have been diagnosed with diabetes and documented depressive symptoms (score >5 on the PHQ-8), as determined by their medical health records, will be eligible for this study. If the patients’ health records show a diagnosis of diabetes but do not include information on PHQ-8 scores or depression diagnosis, we will assess patients’ PHQ-8 scores by telephone to determine their eligibility. After the assessment, this score will be uploaded into the patient health records, regardless of their eligibility for the study. Further, the researchers will reach out to the patient and his/her primary care physician if participants screen >20 on the PHQ-8 (severe depression). Eligible patients will need to use text messaging and have an iPhone or Android smartphone in order to download the pedometer app onto their phones, but they do not need to be proficient in using mobile phone applications. Concomitant care and interventions are permitted during the trial. Importantly, the inclusion criteria for the study were unchanged with this additional sampling. All participants recruited online screened positive for depressive symptoms (PHQ-8>5), as well as self-reported a diabetes diagnosis from a medical professional (Bang et al,28 2009). In addition, all online participants completed full informed consent and verified smartphone ownership by downloading the DIAMANTE app prior to study enrollment.

Exclusion criteria

We will exclude patients with: an inability to exercise due to physical disability; active psychosis or mania; active suicidal ideation; severe cognitive impairment; inability to read and write in English or Spanish; current pregnancy; plans to leave the country for extended periods of time during the 6-month trial.


Patients will be recruited using flyers placed in the waiting rooms, direct provider referrals and in-person recruitment strategies timed to visits. To identify potentially eligible patients, we will ask for permission from known providers to pull patient lists, and review and identify patients with dual diagnosis of diabetes and depression that would fit our inclusion criteria (eg, is able to walk and is not pregnant).

Baseline visit

A researcher or healthcare provider will ask patients if they are interested in joining a study to increase PA and help manage their mood using their mobile phone. We will bring interested individuals into our offices at ZSFG for a baseline study visit and to obtain informed consent in Spanish or English. Thereafter, we will collect all baseline survey measures of interests (outlined below) as well as patient demographics and information about current mobile technology familiarity and utilisation. All patients will receive assistance in downloading a pedometer application onto their phone and will send test text messages back to our system. The researcher will construct a plan for PA goals with the patient and instruct patients to have the app open at all times. Thereafter, users will be automatically randomised by the secure server, using a block randomisation with block size 3 to allocate individuals into either the uniform random, adaptive or control group.

Partial patient and public involvement

Patients worked with us to design the text messages, mobile app user interface and were asked to assess the burden and time commitment of the study, as part of our user-centred design approach. We did not involve patients in other areas of the study design.


Our primary outcome, change in daily step counts, will be passively collected by a mobile phone application during the time that patients remain in the intervention. For secondary outcomes, we will derive HbA1c, the average plasma glucose over the previous eight to 12 weeks, recommended as a means to diagnose diabetes,29 from patients’ electronic health records (EHR). We will use the most recent, available measurement from a maximum of 12 months before participating in the study. After 6 months, we will again assess the most recent HbA1c (pulling from patients’ EHR), ensuring that at least 3 months elapsed between baseline and follow-up HbA1c levels. For additional secondary outcomes, we will administer a survey at baseline and 6-month follow-up. With the addition of a new online recruitment pool geographically dispersed across the U.S., we were no longer able to capture clinical outcomes among all participants, specifically hemoglobin A1c (HbA1c) measurement. Therefore, HbA1c was removed as a secondary outcome for the trial. Importantly, the primary outcome of step count collected via the smartphone app, as well as secondary mental health and usage/usability outcomes, were not affected.

The project coordinator and research assistants will be responsible for managing patient data collection. Data will be stored on UCSF Qualtrics and the HealthySMS platform. Once data from all patients are collected, it will be stored on UCSF’s Secure Box, a secure cloud-hosted platform. See table 1 for all included questionnaires.

Table 1

Questionnaire measures obtained during the study

Engagement measures

In addition to PA and the measures mentioned above, we will also examine engagement measures, such as (1) times that the app was opened (2) time spent reading the messages (3) usability data, assessed by the Systems Usability Scale30 and open-ended qualitative questions about their opinions of the app.


Determining content of motivational message categories

We designed motivational text messages according to a behavioural theory framework, using the Capability, Opportunity, Motivation, Behavior (COM-B) model.31 COM-B is a behavioural change model that proposes that engaging in a particular behaviour depends on the dimensions of capability (physical and psychological), opportunity (social and physical) and motivation (need to engage in the behaviour more than in other behaviours). Messages were designed to fit into these three dimensions. Before the start of the RCT, English and Spanish speaking patients tested the messages in the Amazon Mechanical Turk (MTurk) crowdsourcing platform. Patients on MTurk were given random messages and asked to categorise them into our overarching theoretical categories. MTurk workers and each coauthor also rated messages for their overall quality from 1 to 5.

User-centred design testing

We used user-centred design (UCD) methods32 to iteratively develop the content and text messaging system. We conducted three iterative phases of UCD with 10 patients each (total n=30). The first phase consisted of 1.5 hour individual semistructured interviews that took place at the offices at ZSFG. Findings from phase 1 were used to inform content and information delivery decisions of the final intervention including selecting the thematic message categories and the design decisions. The second phase ran as a usability test in which patients tested out an early prototype of the mobile application. The third phase tested out the final DIAMANTE intervention including thematic message content and finalised application in order to address any user-related issues prior to launching the randomised control trial (see figure 1 for an overview of the different UCD phases and the RCT).

Figure 1

Overview user-centred design process (UCD) and randomised trial. We first conducted three iterative phases of only UCD with 10 patients each (total n=30). The Diabetes and Mental Health Adaptive Notification Tracking and Evaluation trial will have different intervention groups: adaptive (n=92), static (n=92) and control (n=92). RCT, randomised controlled trial.


We employ a mobile phone app, ‘DIAMANTE’ developed by Audacious Software ( This application tracks step counts by pooling from Google Fit, Apple HealthKit or the built-in pedometer on patients’ phones. We use a text-messaging platform HealthySMS, developed by Dr. Aguilera, to send text messages and manage patient responses back to our system. The app only needs to be installed once, but has to remain open consistently. The app is designed in English and Spanish versions and is freely available as a download from the Apple App Store and Android Google Play application.

Figure 1 shows the different intervention groups during the trial period. Briefly, both the adaptive and uniform random group will receive the same types of messages: feedback (four active categories plus no message) and motivation (three active categories plus no message). However, the message categories, timing and frequency will be optimised by a reinforcement learning (RL) algorithm in the adaptive group, and will be delivered with equal probabilities in the uniform random group (following a uniform random distribution). The control group will not receive these messages (only a weekly mood check-in message). All groups will have the app downloaded on their phone and their steps will be passively tracked within the app. See figure 1 for an overview.

Control condition

Control patients will only instal the app on their phone and will not receive any feedback messages. They will receive one message a week, on a fixed day, asking them to assess their mood in the previous week on a scale of 1–9. The message will be sent weekly at 10:00. Non-responders will receive reminders to submit their mood self-assessments in 2 hours intervals.

Uniform random message arm

We will send patients up to two messages per day within four randomly selected time intervals. These messages are based on the COM-B framework (examples shown in table 2A,B). In addition, they will receive one message, on the seventh day, which will ask patients to rate their mood on a scale from 1 to 9. PA (step-count/day) will be passively monitored via the app on their smartphone.

Table 2

Daily motivational and feedback messages DIAMANTE study

Adaptive message arm

Patients in the adaptive messaging arm will receive the daily COM-B messages (equal to the uniform random arm, examples shown in table 2A,B), but the message categories, timing and frequency, will not be chosen randomly, but by using an RL algorithm. This allows us to adequately assess whether differences in effects are driven by the use of the RL algorithm. In addition, they will receive one message, on the seventh day, which will ask patients to rate their mood on a scale from 1 to 9. PA (step-count/day) will be actively monitored via the app on their smartphone.

Patients in all groups will receive reminders to open the app if no data are being transmitted. Additionally, patients in all groups can reply ‘STOP’ or ‘PARAR’ if they wish to stop receiving messages. Finally, the researchers will monitor the incoming step data. The researchers will contact the patients by phone for troubleshooting if patients’ phones are not transmitting data for more than 3 days.

Adaptive learning algorithm

We developed an RL algorithm (an area of machine learning) based on previous work.33 On a daily basis, this algorithm assesses (1) which feedback message (table 2A), (2) which motivational message (table 2B) and (3) which time period of the day (in intervals of 2.5 hours from 09:00 to 19:00) is predicted to maximise the number of steps walked the next day. We use algorithms for contextual multiarmed bandit (MAB) problems, as these have been employed in different domains (including mobile health) and show promise to maximise cumulative rewards in sequential decision tasks (here, which sequences of messages optimally promote PA).34 A contextual MAB problem is an RL setting in which the algorithm chooses between different treatment options which all have different reward functions. The reward functions depend on contextual variables.

We use Thompson Sampling, a Bayesian method, which will allow us to continuously learn which feedback and motivational messages are effective for a user, based on contextual features like their previous PA, demographic and clinical characteristics (such as age, gender and PHQ-8 scores). Thompson sampling can effectively deal with small amounts of data and addresses the exploration/exploitation tradeoff.35 As such, it frequently picks out from the most rewarding messages and occasionally explores the messages with uncertainty in their reward.

More specifically, each morning, the algorithm evaluates which messages will likely increase the PA for every participant in the upcoming day, and at which time period the messages should be delivered. The algorithm training data consist of the historical data of all participants (contextual variables), which include which messages were sent previously and within which time periods, and select clinical/demographic data (such as age, language and depression scores) to improve prediction abilities.

During the trial, patients will receive messages for 6 months. After 6 months, we will provide patients the possibility to remain receiving messages for up to 9 months.

Statistical analysis plan

Baseline characteristics

We will provide descriptive summaries and examine any potential imbalances between the intervention and control groups using χ2-tests for categorical variables and independent samples t-tests for continuous variables.

Primary and secondary hypothesis

Analyses will be conducted on an intention-to-treat basis. We will use full-information maximum likelihood to handle missing data, which has been shown to be preferred over multiple imputations.36 We will include patients in the analysis for primary and secondary outcomes that have at least 1 month of data available. The primary outcome measure, increase in daily step counts, and the secondary outcomes measures (HbA1c, PHQ-8 and other questionnaire data, see the Measures section) will be analysed using a conditional growth model, in line with previous work,37 with intervention (adaptive, uniform random, control) being the fixed effect and time being nested within individuals. Let j=1, …, n, i=1, …, t, where t is the number of recorded days for the jth patient and Yij is the daily steps count on the ith day of jth patient.

Let Aj=2, 1 or 0 for patient j in adaptive, uniform random or control arm, and let Tij be the ith day for patient j.

The daily steps for patients given the treatment, in equation form:

At level 1 (within patient):

Embedded Image

At level 2 (between patient):

Embedded Image

Embedded Image

As this is a randomised trial, we will not adjust for any patient characteristics or patient engagement measures in the primary intention-to-treat analysis unless there is a significant imbalance between the arms of the trial.

Preplanned subgroup analyses

We will conduct an exploratory subgroup analyses for our primary and secondary hypotheses for English versus Spanish speaking patients. We expect at least 50% of the sample to be Spanish speakers given our previous testing recruitment. Finally, given the potential differences in study samples from the San Francisco Health Network and the online social media recruitment pool, we added a new secondary analysis to examine the impact of the intervention stratified by these two recruitment sources. The primary analysis examining the impact of adaptive and random text messaging within the RCT arms was unaffected by the additional recruitment source.

Power calculations and sample size

Power for growth models were calculated based on Monte Carlo simulations using the ‘nlme’ package in R.38 We conducted simulations, using data from our own study team testing the app (n=47) and data from pilot patients (n=8) to estimate the model parameters for calculating the necessary sample size. We used an estimated mean increase of 1250 steps based on previous work.39 40 With 5000 Monte Carlo simulation runs, number of days for each patient t ∼ Uniform(28,180), we estimate that we need n=160 for 2-arms (the adaptive vs uniform random) for 80% power. Inflating to include the control arm and controlling for 15% drop-out, we aim to recruit a total of 276 patients. We believe that low drop-out is feasible as (1) the primary outcome of interest will be passively assessed and thus requires limited proactive engagement by the user, and (2) we have a comprehensive protocol to automatically text users who have closed the app or are not transmitting data.


Patients will receive a compensation of US$40 for participation in the baseline part of the study, and an additional US$70 for completion of the 6-month follow-up.

Data statement

We will submit study results for publication in peer-reviewed journals and presentation at (inter)national meetings, taking into account relevant reporting guidelines (eg, Consolidated Standards of Reporting Trials41). We will attempt to publish all findings in open-access journals when possible, or in other journals with a concurrent uploading of the manuscript content into PubMed Central for public access. Curated technical appendices, statistical code and anonymised data will become freely available from the corresponding authors on request.

Safety protocol

We will review the clinician dashboard for patients in all arms of the trial to identify safety concerns that may emerge over the course of the trial. In addition, we will be automatically notified when key events, such as self-reported suicidality (messages with the words “Kill, Die, Suicide, Don’t want to live, Do not want, Bridge, What is the point, Harm, Hurt, Gun” for English speaking patients; and “Tiro, Matar, Morir, Suicidio, suicidar, no quiero, puente, para que, daño, dañar, dano, danar” for Spanish speaking patients) occur. We will provide immediate phone outreach and will refer patients to urgent or emergency care as necessary. Dr. Aguilera will be available to consult and intervene as necessary for mental health emergencies.

Ethics and dissemination

The UCSF Institutional Review Board (IRB) has approved this protocol. The informed consent form for this study can be found in the online supplementary file. All protocol amendments will be communicated for approval to the USCF IRB. We will ensure that our text messaging content is publicly available through a creative commons licensing agreement. The HealthySMS system is available for use on request.


In this randomised controlled trial, we aim to examine the effect of a smartphone app that uses RL to predict the most effective messages for increasing PA in 276 low-income, ethnic and racial minority patients with diabetes and depression in urban public sector primary care clinics. We will compare this intervention to uniform random messages, delivered with equal and unchanging probabilities, and a control group that only receives a weekly mood message.

Decreasing health disparities

Though the numbers of mHealth pilot studies are increasing in vulnerable populations, many of these fail to follow through with an implementation component to the study design.42 Here, we are using a blended design: while the intervention is in addition to current care, there are ways we are attempting to make it more a part of patients’ clinical care. For instance, patients are mainly approached through primary care health providers, which recommend eligible patients whom they think are directly interested in a PA intervention. In addition, we will make patients’ data available to providers: a summary of the step increase for that patient at the conclusion of the study and updated PHQ-8 and GAD scores entered into the record. Our study therefore is the first step to addressing this gap because of its integration in primary care clinics that serve low-income patients. Future work should focus more specifically on implementation of the app as part of routine clinical care.

PA measure

We chose passively collected daily step count from patients’ preowned digital devices as a measure of PA. Although there are many different ways to measure PA, daily step count seems to be a particularly relevant measure because of: (1) its relative ease to measure, and (2) the clinical importance of individuals’ walking behaviour. Low number of daily step counts have been associated with all-cause mortality in some longitudinal studies43 and results from pooled population studies show clear dose–response effects of PA to overall mortality.44 In patients with type 2 diabetes, several studies have now shown that increasing step counts can significantly decrease HbA1c levels. For instance, a 10 000 steps per day walking prescription increased steps and decreased HbA1c in patients with type 2 diabetes.45 Further, Manjoo et al found that each SD increase in daily steps was associated with a 0.21% decrease in HbA1c.46 However, negative findings have also been reported. For instance, a meta-analysis by Qiu et al found that step-counter use was associated with increased steps per day (over 1800 more steps compared with a control group) among people with diabetes, but not with lowering of HbA1c.47

In exploratory post hoc analyses, we will also be able to examine the more immediate effect of PA messages, for example, on hourly steps in addition to daily steps, which will help to improve future PA interventions (eg, deliver messages at the right times). For instance, it is possible that one could receive a message in the morning and make plans to walk in the afternoon or evening, or messages could have more of an immediate impact. This information is currently unknown.

Personalisation of intervention

The results of this RCT will help us understand if adaptive mHealth interventions for depression and diabetes are more beneficial than interventions that do not use learning algorithms. If mHealth interventions are not personalised, their efficacy might be low, due to low engagement and high drop-out rates.20 The use of machine learning to adapt interventions according to users’ characteristics and behaviours is still in its early stages, but shows promise.48 For instance, Yov-Tom et al using an adaptive learning algorithm found that adaptive feedback messages were more effective in increasing the amount and speed of PA and also reduced HbA1c in sedentary patients with type 2 diabetes.33 Further, Zhou et al showed short-term efficacy of using adaptive weekly step goals determined by RL in healthy patients.49 The current study, with a relatively large group of patients, will further increase our understanding of the potential of machine-learning-driven text-messaging interventions.

Limitations and strengths


First, the results of this study might be specific to this population of low-income ethnic minority patients and might therefore not be generalisable to other populations. However, the inclusion of vulnerable patients in a primary care setting increases the likelihood that this intervention will be effective for other underserved populations. Further, our study procedures do not allow a double-blind design, as researchers and patients need to be made aware of the nature and frequency of messages they are receiving. Additionally, as with all digital interventions, technical issues might arise leading to unreliable step-count data and reduced ability for the algorithm to predict the most effective messages. The researchers and technical personnel will frequently check our server for data collection troubleshooting. Patients will also be made aware when their data are not collected correctly.


Diabetes and depression are among the top 10 causes of disability in the USA.50 Developing cost-effective and scalable models of care for patients with common chronic conditions has been postulated as of key importance in improving the performance of healthcare systems.51 If mHealth apps that target diabetes and depression through their common risk-factor physical inactivity are effective, they can have a major public health impact. Further, because the learning algorithm that we apply in this study is automated and delivers adaptive messages based on patients’ behaviours, it can potentially be applied in other patient populations with a wide range of conditions.

To conclude, the outcome of this trial will provide information on the effectiveness of a text-message-based smartphone app that uses machine learning to increase PA in low-income ethnic minority patients in primary care settings. The results will provide key information on the effectiveness of adaptive mobile applications, compared with more traditional static digital interventions. If effective, this application has the ability to decrease healthcare disparities by providing a type of personalised care to a diverse and traditionally hard to reach group of underserved patients.

Ethics statements

Patient consent for publication


The authors would like to acknowledge Chris Karr who helped develop and manage the automated text messaging service, passive data collection and reinforcement algorithm and Patricia Avila Garcia who was involved in the design and implementation of the user-centred design phase of the study.



  • AA and CAF are joint first authors.

  • Contributors AA and CRL designed the overall study. RH-R, LG-P, JM, AC, US and CF were involved in the design and implementation of the user-centred design phase and RCT. JJW, JM, JA, CF and EY-T were involved in the design of the reinforcement learning algorithm. CF wrote the first draft of the paper. BC, XY and JX revised the statistical analyses and power calculation sections of the paper. All authors revised the manuscript for relevant scientific content and approved the final version of the manuscript.

  • Funding This trial is funded by an R01 to Dr. Adrian Aguilera and Dr. Lyles, 1R01 HS25429-01 from the Agency for Healthcare Research and Quality.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Linked Articles