Article Text

Protocol
Protocol for a bandit-based response adaptive trial to evaluate the effectiveness of brief self-guided digital interventions for reducing psychological distress in university students: the Vibe Up study
  1. Kit Huckvale1,
  2. Leonard Hoon2,
  3. Eileen Stech3,
  4. Jill M Newby3,4,
  5. Wu Yi Zheng3,
  6. Jin Han3,
  7. Rajesh Vasa2,
  8. Sunil Gupta2,
  9. Scott Barnett2,
  10. Manisha Senadeera2,
  11. Stuart Cameron2,
  12. Stefanus Kurniawan2,
  13. Akash Agarwal2,
  14. Joost Funke Kupper2,
  15. Joshua Asbury2,
  16. David Willie2,
  17. Alasdair Grant2,
  18. Henry Cutler5,
  19. Bonny Parkinson5,
  20. Antonio Ahumada-Canale5,
  21. Joanne R Beames3,
  22. Rena Logothetis2,
  23. Marya Bautista3,
  24. Jodie Rosenberg3,
  25. Artur Shvetcov3,
  26. Thomas Quinn2,
  27. Andrew Mackinnon3,
  28. Santu Rana2,
  29. Truyen Tran2,
  30. Simon Rosenbaum6,
  31. Kon Mouzakis2,
  32. Aliza Werner-Seidler3,
  33. Alexis Whitton3,
  34. Svetha Venkatesh2,
  35. Helen Christensen3
  1. 1Centre for Digital Transformation of Health, University of Melbourne, Melbourne, Victoria, Australia
  2. 2Applied Artificial Intelligence Institute, Deakin University, Melbourne, Victoria, Australia
  3. 3Black Dog Institute, UNSW Sydney, Sydney, New South Wales, Australia
  4. 4School of Psychology, UNSW Sydney, Sydney, New South Wales, Australia
  5. 5Centre for the Health Economy, Macquarie University, Sydney, New South Wales, Australia
  6. 6School of Psychiatry, UNSW Sydney, Sydney, New South Wales, Australia
  1. Correspondence to Dr Jill M Newby; j.newby{at}unsw.edu.au

Abstract

Introduction Meta-analytical evidence confirms that a range of interventions, including mindfulness, physical activity and sleep hygiene, can reduce psychological distress in university students. However, it is unclear which intervention is most effective. Artificial intelligence (AI)-driven adaptive trials may be an efficient method to determine what works best and for whom. The primary purpose of the study is to rank the effectiveness of mindfulness, physical activity, sleep hygiene and an active control in reducing distress, using a multiarm contextual bandit-based AI-adaptive trial method. Furthermore, the study will explore which interventions have the largest effect for students with different levels of baseline distress severity.

Methods and analysis The Vibe Up study is a pragmatically oriented, decentralised AI-adaptive group sequential randomised controlled trial comparing the effectiveness of three brief, 2-week digital self-guided interventions (mindfulness, physical activity or sleep hygiene) and an active control (ecological momentary assessment) in reducing self-reported psychological distress in Australian university students. The adaptive trial methodology involves up to 12 sequential mini-trials that allow for the optimisation of allocation ratios. The primary outcome is change in psychological distress (Depression, Anxiety and Stress Scale, 21-item version, DASS-21 total score) from preintervention to postintervention. Secondary outcomes include change in physical activity, sleep quality and mindfulness from preintervention to postintervention. Planned contrasts will compare the four groups (ie, the three interventions and the control) using self-reported psychological distress at prespecified time points for interim analyses. The study aims to determine the best performing intervention, as well as the ranking of the other interventions.

Ethics and dissemination Ethical approval was sought and obtained from the UNSW Sydney Human Research Ethics Committee (HREC A, HC200466). A trial protocol adhering to the requirements of the Guideline for Good Clinical Practice was prepared for and approved by the Sponsor, UNSW Sydney (Protocol number: HC200466_CTP).

Trial registration number ACTRN12621001223820.

  • Anxiety disorders
  • Depression & mood disorders
  • Clinical trials
  • MENTAL HEALTH
  • STATISTICS & RESEARCH METHODS

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.


STRENGTHS AND LIMITATIONS OF THIS STUDY

  • The trial uses short-duration interventions designed to improve coping responses to transient stressors.

  • A value of information analysis is included to compare the value of the new trial methods with traditional approaches.

  • Digital phenotyping is used to explore associations between smartphone sensor information and clinical outcomes.

  • More than 12 mini-trials might be required to determine the ranking of the interventions.

  • The methodology assumes that the three digital interventions are configured to deliver similar doses and/or have approximate fidelity to standard methods.

Introduction

University students experience a disproportionate level and burden of psychological distress compared with both age-matched peers and the general adult population.1 2 Almost half of Australian university students report moderate or high levels of depression, anxiety or stress symptoms,3 and at least two-thirds of students experience subclinical symptoms.4 Prevalence rates are remarkably consistent across university settings3 5 and have remained largely unchanged for the last three decades.1 Psychological distress is not only linked to the development of serious mental disorders, such as major depressive disorder, but is associated with early withdrawal from university study, impaired academic performance, alcohol intake, cigarette smoking, and increased risk of suicidal ideation and behaviour.1

One approach to help target and reduce high rates of psychological distress is to deliver strategies that modify subjective responses to perceived stressors. Meta-analytical evidence confirms the potential usefulness of a range of interventions, including mindfulness-based interventions, physical activity and sleep hygiene, to reduce anxiety and depression symptoms in university students.6–9 However, of the wide range of interventions available for university students, it is unclear which intervention is most effective, whether interventions show differential effectiveness for mild, moderate and severe distress, and whether specific interventions are more effective for individuals with specific symptom clusters of anxiety, depression and stress. In addition, established interventions typically rely on face-to-face delivery and are often lengthy, which means they are resource intensive and difficult to deliver at scale.

Recent evidence shows that delivering interventions via smartphone apps offers a potentially feasible and scalable way to reduce psychological distress in university students and young people more generally.9 Being able to offer short duration, targeted interventions in response to transient stressors, such as examinations or the transition from secondary school to university, may be particularly useful for university students. Although previous studies have included mixed delivery modes, including both face-to-face and digital self-guided interventions,7 none have compared all-digital interventions delivered using participants’ own smartphone devices.

Typically, randomised controlled trials (RCTs) are used to compare the efficacy of different interventions or to compare the effects of an intervention with a control group. While RCTs are considered the gold-standard test of interventions and have led to a large and growing body of evidence supporting different psychological and lifestyle interventions for distressed university students, they also have some limitations. RCTs are often lengthy, expensive and time-consuming to conduct, and are often underpowered to detect group differences, especially when comparing different active interventions. They often offer relatively little information about which individual might respond best to a specific intervention given their symptom severity, profile of symptoms and/or demographics. These challenges underline the importance of looking for new ways to explore treatment efficacy, while preserving the rigour of a traditional RCT.

In Artificial Intelligence (AI)-driven adaptive trials, instead of one large trial, we perform a series of ‘mini-trials’ where the results of each feed into the next.10–12 At each step, AI methods are used to (A) update an underlying model of the effectiveness of the interventions under evaluation and (B) alter the proportion of participants allocated to each intervention in the next mini-trial. Under this scheme, progressively fewer participants are allocated to less effective interventions in later mini-trials. Importantly, the sequence of mini-trials can stop as soon as the estimates of intervention effect become certain; potentially much earlier, and involving fewer individuals, than a traditional RCT. In this trial, we will use a contextual multiarm bandit, a specific type of AI algorithm.12 The aims of the contextual multiarm bandit method are to identify the most effective intervention for a group as quickly as possible, to explore intervention outcomes enough to ensure that no intervention is discarded from the trial before the best-performing interventions emerge, and to perform trials that maximise statistical power while controlling false detection rates.
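As a rough, runnable illustration of this update-and-reallocate loop (a toy simulation with invented effect sizes and a loose heuristic update rule, not the trial's actual implementation, which is specified under 'Multiarm bandit algorithm' below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented mean symptom reductions per arm, for illustration only
true_effect = {"mindfulness": 8.0, "physical_activity": 7.0,
               "sleep_hygiene": 5.0, "control": 3.0}
arms = list(true_effect)
observed = {arm: [] for arm in arms}
alloc = {arm: 0.25 for arm in arms}  # mini-trial 1: equal allocation

for mini_trial in range(12):
    # (A) run the mini-trial: sample outcomes under the current allocation
    for arm in arms:
        n = max(1, round(80 * alloc[arm]))
        observed[arm].extend(rng.normal(true_effect[arm], 10.0, n))
    # (B) re-estimate each arm's effect and shift allocation towards arms
    # with the highest upper confidence bound (a loose UCB-style heuristic;
    # the floor of 0.05 keeps a small exploration share for every arm)
    ucb = {arm: max(0.05, np.mean(observed[arm])
                    + 2 * np.std(observed[arm]) / np.sqrt(len(observed[arm])))
           for arm in arms}
    alloc = {arm: ucb[arm] / sum(ucb.values()) for arm in arms}

print({arm: round(p, 2) for arm, p in alloc.items()})  # drifts towards better arms
```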

AI-driven adaptive trials promise to provide a quicker and more efficient alternative to RCTs, particularly when there are multiple potentially effective options, and when it is important to determine which treatment option is best for a particular cohort of people. Compared with RCTs, adaptive trials have been argued to: (1) require fewer participants to estimate the effectiveness of an intervention,12 (2) reach a definitive conclusion earlier so that the best treatment can be offered sooner to the broader population, (3) stop recruitment to futile interventions early and (4) identify interactions between different interventions and different patient subgroups.13–15 Although first discussed over three decades ago, adaptive trials have only recently been introduced in health settings; they have, for example, been successfully applied in a cluster RCT of physical activity promotion interventions in general practice.10 To our knowledge, AI-driven adaptive trials have never been used in the mental health context.

In the Vibe Up study, our primary aim is to use AI-driven adaptive trial methods to determine which of three brief, 2-week digital self-guided interventions (mindfulness, physical activity or sleep hygiene) or an active control intervention (ecological momentary assessment, EMA)16 leads to the greatest reductions in psychological distress in Australian university students. Within this, we aim to identify the most effective intervention within three separate ‘cohorts’ or ‘groups’: participants with normal/mild distress, moderate distress or severe distress. If this goal is achieved, we aim to identify the second most effective intervention within each cohort. The Vibe Up study will run as a sequence of mini-trials, where participants are allocated to one of the four groups and complete self-report measures to assess how much their distress changed from preintervention to postintervention. Information about outcomes in each of the interventions (or the control group) for mildly, moderately and severely distressed participants gained from one mini-trial is used to update the algorithm, which in turn adjusts how many participants are allocated to each of the four groups in the next mini-trial. In this study, the AI algorithm has two goals, each pursued with the smallest number of participants: (1) to identify the best-performing intervention within a severity group (mild, moderate, severe) and (2) to maximise the benefits for participants during the trial period.

Our second aim is to test the value of this novel trial approach in the mental health setting. We are specifically interested in establishing whether the AI-driven adaptive trial methodology is more efficient than a traditional RCT in determining the effectiveness of the interventions. The trial will encompass an economic evaluation with a Value of Information (VoI) analysis to determine whether the AI-adaptive trial approach yields more value compared with a traditional four-arm RCT. In theory, the AI-adaptive trial approach will require fewer participants to rank interventions by their effect size than a four-arm RCT strategy. This will reduce trial participant recruitment costs, although administration costs for the AI-adaptive trial approach may differ from those of an RCT. It will also variably affect the confidence intervals around each estimated effect size. The economic evaluation will, therefore, seek to compare the change in decision uncertainty from the AI-adaptive trial approach with that from a traditional four-arm RCT strategy.

The study also incorporates theoretically driven substudies focusing on assessing students’ resilience to negative affect using EMA and exploring the potential for smartphone-based passive sensing of activity (digital phenotyping) to predict changes in distress symptoms.

The primary outcome variable is psychological distress (as measured by the Depression, Anxiety and Stress Scale, 21-item version (DASS-21) total scores) postintervention, relative to preintervention. Secondary outcome variables include changes in physical activity, sleep quality, and mindfulness from preintervention to postintervention.

Based on previous meta-analytical evidence, we expect that the mindfulness and physical activity interventions will be more effective than the sleep hygiene and the active control interventions in reducing overall psychological distress in the sample (as measured by the DASS-21 total scores). We expect to identify differences in intervention efficacy according to baseline distress levels measured by the DASS-21 (mild, moderate, severe). Exploratory analyses comparing intervention effects for symptom clusters of depression, anxiety and stress will be conducted. We expect that the AI-driven optimisation design will reduce decision uncertainty compared with a well-designed four-arm RCT that aims to establish superiority of particular interventions and any differential effects of severity on outcomes.

Methods and analysis

Trial design

This is a sequential RCT using bandit-based response adaptive allocation to compare (on an intention-to-treat basis) the effectiveness of three brief, 2-week digital self-guided interventions and an active control intervention in reducing self-reported psychological distress in Australian university students. The group sequential design will be executed as a sequence of up to 12 ‘mini-trials’, each recruiting a sample of at least 120 participants. There will be no predefined upper limit for recruitment into each mini-trial. Each participant will be eligible to take part in one mini-trial only, with each mini-trial lasting 4 weeks. Once participants have been screened as eligible for the study, they will complete a baseline assessment on their smartphone, 1 week of daily EMA and a second assessment at 2 weeks postbaseline. Participants who do not complete the second assessment will not proceed to the intervention period of the trial. Next, participants will be allocated by the AI algorithm to one of the three intervention groups or the EMA control, which they will receive for 2 weeks. Finally, all participants will complete a postintervention assessment at the end of the 4-week period.

Trial timing is shown in figure 1. The expected trial timing and sequence is shown for example participants who complete all activities immediately when prompted (participants A and C) versus for participants who complete activities after a delay but still within the grace period (participants B and D).

Figure 1

The expected trial timing and sequence. EMA, Ecological Momentary Assessment.

A trial protocol adhering to the requirements of the Guideline for Good Clinical Practice17 was prepared for and approved by the Sponsor, UNSW Sydney (Protocol number: HC200466_CTP).

Patient and public involvement statement

The study was conceived and designed by a multidisciplinary research team consisting of clinical psychologists, software engineers, computer scientists and user experience experts. Research questions and outcome measures were derived in consultation with the target population (university students) through one-on-one consultations. Thirty-three university students experiencing psychological distress were involved in this process. They provided feedback on the initial designs of the smartphone app and the appropriateness of the intervention content and language. Individual participants will receive a summary of their well-being status at the end of their participation in the study. Final study results will be aggregated and presented in published manuscripts and national conference presentations. Results will also be published on the study website.

Participants

The trial aims to enrol a total sample of approximately 1200 adult university students with mild to severe psychological distress according to the 10-item Kessler Psychological Distress Scale (K-10) at recruitment but without psychotic spectrum disorders or significant suicidality.

Inclusion criteria

Participants must satisfy the following criteria at screening:

  • Adults aged 18 or older.

  • Currently residing in Australia and planning to be resident throughout the study period.

  • Currently enrolled at a higher education institution.

  • Advanced, fluent or native English speaker.

  • Own an eligible personal smartphone (iPhone 6S/Android 5 or later) with active mobile number and internet access.

  • Self-rated psychological distress on the K-1018 scoring ≥20,19 indicative of being ‘likely to have a mild (or more serious) mental disorder’,20 at screening. The K-10 is a widely used, validated tool for assessing psychological distress in adult populations, and K-10 scores are strongly correlated with mental illness cases in community samples.21

Exclusion criteria

Participants will be excluded at screening if they:

  • Indicate high levels of self-rated suicidal ideation (scoring ≥21, ‘high ideation’) on the Suicide Ideation Attributes Scale,22 a reliable and valid measure for assessing suicidality in general adult populations.

  • Report a current active diagnosis of psychosis or bipolar disorder.

  • Have already completed a previous mini-trial.

  • Indicate major disruptions or events in the next 2 months which may make it difficult to take part in the study.

  • Indicate plans to travel outside of Australia in the next 2 months.

  • Indicate that they would be unable to safely undertake a physical activity intervention if allocated to receive this treatment.

Participants will not be restricted from receiving any other treatments during the trial but will be discouraged from undertaking new psychological therapies during the 4-week trial period.

Recruitment

The trial will be promoted via targeted paid social media advertisements placed on Facebook and Instagram, platforms identified as effective for research recruitment in young adults.23 24 Advertisement text was developed through formative interviews with university students. Interviews identified two themes to which potential participants might be receptive: improving personal resilience to university-related stressors, and the opportunity to contribute to improved community mental health through digital innovation. Advertisements will be targeted by age and by keywords/community interests indicating affiliation to higher education institutions, and will link potential participants to a study website containing study information materials and directions on how to complete online self-screening and consent. Reflecting investigator experience with similar studies, the trial will allow up to US$4.50 to be spent on advertising per eligible enrolled participant.

Because the group-sequential design requires a stream of potential participants, performance of the advertising strategy will be monitored. Review will focus on two goals: achieving a consistent sample size per mini-trial, and trying to ensure balanced representation of presenting level of psychological distress and gender (recognising the risk of under-representation of male-identifying participants in youth mental health trials25). Descriptive statistics concerning screening and consent rates in response to different targeting criteria and combinations of advertising text and imagery will be collected and reviewed. Responses to a single-item screening questionnaire indicating where participants heard about the study will also be assessed. Recognising the risk in a group-sequential design that adjustments to recruitment strategy could introduce group-specific biases, we will aim to improve (in a prospective fashion) balanced representation by gender/distress severity rather than trying, for example, to redress a deficit in the overall sample by focusing on recruiting primarily males, or those with very high distress, for a particular mini-trial. As males tend to be more difficult to recruit into studies,26 the adopted strategy separated recruitment of male and female participants into two different campaigns and allocated more budget to social media advertisements aimed at males.

To maximise participation in the study, we also plan to advertise via university organisations, university societies, university staff contacts and releases through traditional media outlets.

A psychological safety response register will be maintained for the trial, with a team of qualified clinical psychologists on call during the trial period. This register will be reported to the human research ethics committee annually. A protocol is in place to log and report any adverse events.

Interventions

The trial will evaluate three brief, self-directed interventions delivered via a software application (app) installed on participants’ personal smartphones (see table 1 for details). Each intervention is entirely separate but designed to be loosely matched on ‘dose’ and required effort over a 14-day period. Each intervention consists of brief modular information covering key concepts (delivered as infographics), structured activities (eg, practising mindfulness with guided audio) and a ‘frequently asked questions’ section including tips, safety advice and answers to common questions associated with that intervention. In addition, an in-app support page is available with contact details of mental health support for participants. Participants are also asked to complete a daily log relevant to each intervention: for mindfulness, time spent practising mindfulness; for physical activity, time spent being physically active; for sleep, hours slept the night prior; and for EMA (active control), current affect experience including both positive and negative emotions. Each participant will be allocated to receive one intervention, once. Evidence supporting the selection of the chosen interventions is included as online supplemental appendix A. All interventions are unlocked and available to participants at the end of each mini-trial.

Table 1

Interventions

Mindfulness intervention

Mindfulness is the ‘awareness that emerges through paying attention on purpose, in the present moment and non-judgementally to the unfolding of experience moment by moment’.27 Mindfulness meditation can be instructor-facilitated or self-guided (eg, using a course of guided audio meditations to learn mindfulness techniques28). The mechanisms of mindfulness meditation are distinct from relaxation training29 and include changes in attention, emotion regulation, sensory awareness and self-awareness.30

The Vibe Up Mindfulness intervention consists of an introductory video that includes instructions for bringing mindful awareness to daily activities, followed by a set of five 3–5 min instructor-guided audio meditations that focus on mindful awareness of breathing; noticing, awareness and acceptance of thoughts (‘leaves on a stream’ exercise); attending to bodily sensations (body scan); and mindful eating and walking. Encouragement to adopt a non-judgemental, accepting and self-compassionate response to present moment awareness is woven throughout the practices. Participants are given the choice of a male or female narrator, both of whom are clinical psychologists experienced in delivering mindfulness interventions. Audio recordings are released in a structured sequence with a 1-day gap between each. Once released, participants can access audio and video recordings as much as they wish.

Physical activity intervention

The Vibe Up Physical Activity intervention starts with an introductory infographic that outlines the benefits of physical activity, Australian guidelines for physical activity, setting realistic goals and incremental change, and practical suggestions for increasing daily physical activity. The intervention was designed in consultation with exercise physiologists who specialise in mental health. The Vibe Up Physical Activity intervention prompts participants to choose a goal each day to increase their physical activity. An evidence-based 7 min high-intensity circuit training protocol is provided as one option for increasing physical activity.31 The trial intervention delivers this as an instructor-led video of a qualified exercise physiologist presenting a series of exercises. Participants are asked to complete some form of physical activity on most days during the 2-week intervention period. The intervention also incorporates infographic-based psychoeducation about the benefits of exercise, how exercise can be integrated into everyday life and anchoring information about expected levels of exercise intensity during the programme. This is presented for review as a structured ‘onboarding’ process at the start of the intervention.

Sleep hygiene intervention

Vibe Up Sleep Hygiene is a brief, self-guided sleep hygiene intervention including basic elements of stimulus control. Sleep hygiene refers to the set of daily living activities that are necessary to maintain good quality sleep and full daytime alertness.32

Although there are recognised common determinants of poor sleep relating to arousal (eg, caffeine ingestion) and sleep organisation (eg, excessive bedtime variation), the activities that influence sleep either positively or negatively substantially vary from individual to individual.32 The purpose of sleep hygiene education is to help individuals identify the specific behaviours and habits that promote their own sleep and implement these, while eliminating/reducing those that disturb sleep.33 For those with sleep problems, stimulus control seeks to reduce anxiety or conditioned arousal associated with going to bed.34

The Sleep Hygiene intervention is a programme of four infographic-based information modules addressing, respectively: the importance of sleep, positive sleep habit formation, creating a sleep-promoting environment, and how daily activities and diet affect sleep. Modules are released in a structured sequence with a 2-day gap between each. As each module is released, participants are prompted to review its contents and identify practical ways to apply it to their sleep hygiene practices. Once released, participants can access module content as frequently as they wish.

Mood tracker active control

The Vibe Up Mood Tracker uses EMA to characterise and quantify individual profiles of dynamic affect experience among university students with elevated psychological distress. Mood trackers have often been used in conjunction with psychological therapies, including mindfulness-based interventions,35 to help monitor participants’ progress in treatment. The tracker captures individuals’ subjective experience of emotions, which can signal changes in state of mind and brain function. Previous research indicates that specific rhythms of affect, including ongoing mood instability and persistent negative affect, are strongly associated with the onset and progression of mental health issues such as depression and anxiety disorders.36 37 However, the dynamic change of emotions and its corresponding responses are often neglected in cross-sectional measurements. The growth of mobile technologies provides a new avenue for detecting affect rhythms via real-time self-assessment, known as EMA.16 Although it is still debatable whether repeated self-assessment itself may alter individuals’ affect experience,38 it is generally acknowledged that self-monitoring has minimal impact on health symptoms in the short term,39 40 and it is therefore used as the active control in the current study.

The Vibe Up Mood Tracker runs on a blended EMA protocol consisting of signal-contingent and event-contingent EMA. For the signal-contingent EMA, two daily prompts will be generated by the study app, each at a random time within one of two windows: morning (08:00–10:00) and evening (19:00–21:00), according to the participant’s local time. Participants will have up to 60 min to respond to a prompt, with a reminder sent after 30 min to those who have not responded to the initial prompt. For the event-contingent EMA, participants will be able to log EMA measurements at any time (eg, in response to self-identified exposures to negative stressors). If a participant initiates an event-contingent recording within the 08:00–10:00 or 19:00–21:00 windows, then no signal-contingent prompt will be generated within that window, regardless of whether they complete their self-initiated EMA response. Each EMA prompt contains questions regarding current feelings (positive and negative affect) and the likelihood of responding to the selected affect(s). Recognising that participants may disclose risk information in their responses, an annotation in the app notes that ‘We don’t actively monitor responses to this question, but help is always available if you need it.’, with a link to support options.
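A minimal sketch of this prompt-scheduling logic (an illustration only; the study app's real scheduler is not published, and the function names here are hypothetical):

```python
import random
from datetime import date, datetime, time, timedelta

# Morning and evening signal-contingent windows (participant's local time)
WINDOWS = [(time(8, 0), time(10, 0)), (time(19, 0), time(21, 0))]

def schedule_signal_prompts(day: date, rng: random.Random) -> list[datetime]:
    """Draw one random prompt time inside each window for the given day.
    Participants then have 60 min to respond, with a reminder at 30 min."""
    prompts = []
    for start, end in WINDOWS:
        span = (datetime.combine(day, end) - datetime.combine(day, start)).seconds
        prompts.append(datetime.combine(day, start) + timedelta(seconds=rng.randrange(span)))
    return prompts

def should_send(prompt: datetime, event_ema_times: list[datetime]) -> bool:
    """Suppress a signal-contingent prompt if the participant already
    initiated an event-contingent EMA within the same window that day."""
    for start, end in WINDOWS:
        if start <= prompt.time() <= end:
            return not any(
                t.date() == prompt.date() and start <= t.time() <= end
                for t in event_ema_times
            )
    return True

prompts = schedule_signal_prompts(date(2022, 3, 1), random.Random(7))
```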

Strategies to promote adherence

A formative user-centred design process was undertaken to identify potential strategies for promoting and sustaining engagement with study interventions and tasks. This informed the creation of a simple game-like mechanism based on the evolution of a virtual character (Sprout), who slowly progresses from infancy to adulthood over the days of the mini-trial. Embedded within the app, simple animations representing this evolution are used to provide a sense of delight and reward in response to engaging with the app (see figure 2). In addition, a limited set of text message and email reminders are used at points in the study where there are time-sensitive or mandatory tasks, such as installing the study app, completing the questionnaires, accessing the interventions after allocation and deadlines to complete these tasks.

Figure 2

Examples of virtual character Sprout evolving.

Participants who complete the postintervention questionnaire battery will be offered a US$20 electronic gift token and will receive a written personalised summary report of the measurements taken during the study. Participants who additionally complete the follow-up questionnaires (at 8 weeks postintervention) will be given the opportunity to enter a draw to receive one of three US$35 electronic gift tokens (per mini-trial). The value of the electronic gift tokens is not contingent on EMA compliance rates in the current study.

Outcome measures

Primary outcome measure

Depression, Anxiety and Stress Scale, 21-item version

The primary outcome measure is self-rated psychological distress according to the total score on the DASS-21. The DASS-21 is a reliable, valid psychometric instrument for the assessment of psychological symptoms via self-administration in non-clinical populations.41 The DASS-21 asks participants to indicate the extent to which each symptom was experienced over the past week, using a 4-point rating scale ranging from 0 (‘did not apply to me at all’) to 3 (‘applied to me very much, or most of the time’). Although the DASS-21 consists of three subscales addressing depressive, anxiety and stress symptoms, its suitability as a single-factor distress measure combining all subscale items has recently been demonstrated in an adolescent population.42 Higher scores indicate higher overall distress levels.

Secondary outcome measures

Modified Physical Activity Vital Sign

The Physical Activity Vital Sign (PAVS) is a two-item clinical screening instrument for assessing the total time engaged in moderate to strenuous exercise over the past week in adults.43 The instrument has been demonstrated to be valid and suitable for rapid assessment of exercise behaviour.44 The questionnaire assesses the number of days in the past week on which moderate to vigorous exercise was undertaken, and the average number of minutes of physical activity on those days. This allows the calculation of the total minutes of moderate to vigorous physical activity per week, and whether the participant meets the WHO guideline of greater than 150 min per week.
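The derived quantities reduce to a simple product of the two items, as in this small sketch (function names are illustrative):

```python
def pavs_weekly_minutes(days_active: int, avg_minutes_per_day: float) -> float:
    """Total weekly minutes of moderate-vigorous activity from the two PAVS items."""
    return days_active * avg_minutes_per_day

def meets_who_guideline(days_active: int, avg_minutes_per_day: float) -> bool:
    """WHO threshold of more than 150 min/week, as applied in the protocol."""
    return pavs_weekly_minutes(days_active, avg_minutes_per_day) > 150

assert pavs_weekly_minutes(4, 45) == 180 and meets_who_guideline(4, 45)
```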

Modified Pittsburgh Sleep Quality Index Item 6

The Pittsburgh Sleep Quality Index (PSQI) is a reliable, valid 19-item instrument for assessing sleep quality over the previous month in clinical and research populations.45 To manage participant burden, we will use only PSQI item 6 (‘during the past week, how would you rate your sleep quality overall?’), modified to focus on the previous week only (to be consistent with the DASS-21 and PAVS time horizons). PSQI item 6 uses a 4-level Likert scale scored from 0 (‘very bad’) to 3 (‘very good’).

Bespoke mindfulness measure

The mindfulness literature lacks consensus about optimal outcome measures, and measures are often lengthy.46 Therefore, we use a bespoke single-item question to measure mindfulness: ‘Mindfulness is a practice where you intentionally focus your attention on what you are experiencing in the present moment, with an attitude of openness and non-judgement. During the past week, how mindful have you been?’. The question’s wording was derived from reviews of existing questionnaires.47 48 Participants will be asked to indicate their response using a 5-level Likert scale from 0 (‘not at all mindful’) to 4 (‘extremely mindful’).

Additional measures

The International Positive and Negative Affect Schedule, Short Form

The International Positive and Negative Affect Schedule, Short Form (I-PANAS-SF)49 is a 10-item scale assessing participants’ subjective experience of positive (eg, inspired) and negative (eg, upset, afraid) emotional states. Items are rated on a 5-level Likert scale from 1 (‘very slightly or not at all’) to 5 (‘extremely’). Subscale scores are determined as the sum of item scores, with higher scores indicating higher levels of positive or negative affect. Reflecting the trial’s focus on distress, the I-PANAS-SF is extended with two distress-focused items (feeling hopeless or calm) from the K-10 to capture momentary levels of psychological distress. To minimise burden, participants will be asked to choose which of the 12 feelings apply to them at the moment, and then rate the intensity of each of the feelings they selected. Non-selected feelings will be coded as 1 (‘very slightly or not at all’). If a participant selects one or more feeling(s), they will be asked further bespoke questions on their likelihood of doing something because of their feelings, using a 4-level Likert scale from 0 (‘highly unlikely’) to 3 (‘highly likely’). If a participant responds 2 (‘likely’) or 3 (‘highly likely’), a second question will ask them to describe in free text what it is that they are likely to do.

Further additional measures (as seen in table 2) will be collected either during screening, preintervention or postintervention and used as predictors in models, as mediators of intervention effect and in exploratory analyses. These will assess contextual and perception-related factors that may influence subjective distress or intervention response, such as the availability of social support, socioeconomic status and substance use. They also include measures of subjective attitudes concerning the likely success of the intervention, readiness for change, post hoc perceptions of the interventions and technology experience/barriers to use.

Table 2

Questionnaire measures and administration

Short Warwick Edinburgh Mental Well-being Scale

The Short Warwick Edinburgh Mental Well-being Scale (SWEMWBS) is a seven-item abbreviated version of the Warwick-Edinburgh Mental Well-being Scale, originally designed to assess general mental well-being in adult populations.50 Although SWEMWBS provides less coverage of hedonic well-being and affect, it is sensitive to psychological well-being, has robust measurement properties and is explicitly recommended for general population monitoring.51 Participants are asked to rate a series of statements concerning experiences and attitudes over the past 2 weeks (eg, ‘I’ve been feeling optimistic about the future’) using a 5-level Likert scale ranging from 1 (‘none of the time’) to 5 (‘all of the time’). A total score is derived by summing item scores and transforming using a published lookup table,51 with higher scores indicating higher positive mental well-being.

Multidimensional Scale of Perceived Social Support

The Multidimensional Scale of Perceived Social Support is a reliable 12-item instrument with moderate construct validity. It asks participants to rate 12 statements concerning support available from three sources, friends, family and significant others (eg, ‘There is a special person who is around when I am in need.’52). Participants are asked to indicate agreement with each statement using a 7-level Likert scale from 1 (‘very strongly disagree’) to 7 (‘very strongly agree’). A total score is generated as the arithmetic mean of item scores, with a higher score indicating greater levels of perceived support. Subscale scores can be generated for each of the three factors but will not be used in this trial.

Subjective Socioeconomic Status Scale

The Subjective Socioeconomic Status Scale is a simple self-anchoring scale that uses a visual metaphor of status—an image of a 10-rung ladder—and asks participants to locate their perceived position on the rungs. The top of the ladder is explained as representing those who ‘are the best off’, having ‘the most money, the most education and the most respected jobs’ while the bottom represents the opposite. The scale yields a score from 1 to 10 inclusive where 10 represents higher perceived socioeconomic status. Subjective socioeconomic status is better correlated than objective measures (such as income) with psychological variables such as stress, negative affect and coping.53

National Institute on Drug Abuse Quick Screen drug screening tool

The US National Institute on Drug Abuse (NIDA) Quick Screen is a four-item instrument54 adapted from a single-question screening tool for drug use in primary care.55 The original instrument asks for the number of times that any drug has been used in the past year, while the NIDA tool asks the question separately for binge use of alcohol, any use of tobacco products, use of prescription drugs for non-medical reasons and any use of illegal drugs. Binge use is defined as five or more drinks in 1 day for males or four or more for females. Participants are asked to respond using a five-point scale: ‘never’, ‘once or twice’, ‘monthly’, ‘weekly’ or ‘daily or almost daily’.

Credibility and Expectancy Questionnaire Items 1 and 6

The Credibility and Expectancy Questionnaire (CEQ) is a reliable six-item scale assessing two cognitive factors concerning belief in an intervention (credibility) and expectation of benefit from its use (expectancy).56 Expectancy appears to be associated with observed outcomes in intervention research.56 57 To manage participant burden, we will use a single CEQ item to assess each factor. Item 1 loads on credibility (‘How logical does the therapy offered to you seem?’) and is rated on a 9-level Likert scale from 1 (‘not at all logical’) to 9 (‘very logical’). Item 6 assesses expectancy (‘How much improvement in your symptoms do you really feel will occur?’) and is rated using an 11-level numeric score from 0% to 100% in 10% increments. Each factor will be evaluated separately.

Revised University of Rhode Island Change Assessment Scale Items 19, 24–26, 29 and 30

The Revised University of Rhode Island Change Assessment (URICA) scale is a 32-item scale originally intended to assess readiness for change during psychotherapy.58 The trial will assess three factors identified in the original scale that are relevant before starting a new treatment: seeking assistance, ambivalence towards change and taking action, by selecting the two URICA items for each factor that accounted for the highest proportion of variance explained in the original study (items 19 and 24 for seeking assistance; items 26 and 29 for ambivalence towards change; and items 25 and 30 for taking action). Each item is a statement relating to change readiness in the current moment (eg, ‘I wish I had more ideas on how to solve my problems.’) and is scored using a 5-level Likert scale from 1 (‘strongly disagree’) to 5 (‘strongly agree’). Items 26 and 29 are reverse scored. A total score is generated by summing the scores with higher scores indicating greater change readiness.

Health economic evaluation measures

Additional measures will be collected for an economic evaluation using a health system and societal perspective. This will include measuring health outcomes and healthcare costs, along with changes in resource use outside the healthcare system. The trial will include an evaluation of productivity changes to capture whether improvements in psychological distress due to the interventions affect workforce participation.

EuroQol 5-Dimension 5-Level

The EuroQol 5-Dimension 5-Level (EQ-5D-5L) is a generic preference-based health-related quality of life tool used to estimate quality of life and to undertake cost–utility analysis.59–61 It has five descriptive dimensions, including mobility, self-care, usual activities, pain/discomfort and anxiety/depression.62 Each dimension has five levels, including no problems, slight problems, moderate problems, severe problems and extreme problems. Responses to the EQ-5D-5L will be captured at screening and 8-week follow-up, and converted into utilities using an algorithm derived from the Australian general population.63 Quality-adjusted life-years (QALYs) for each participant will be estimated by using estimates of utilities and the area-under-the-curve method.64
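For illustration, the area-under-the-curve QALY calculation amounts to integrating utility over time; a minimal sketch (with made-up utility values) follows:

```python
import numpy as np

def auc_qalys(utilities: list[float], assessment_weeks: list[float]) -> float:
    """QALYs by the area-under-the-curve (trapezoid) method: utilities at
    each assessment are integrated over time expressed in years."""
    u = np.asarray(utilities, dtype=float)
    years = np.asarray(assessment_weeks, dtype=float) / 52.0
    # trapezoid rule: mean of adjacent utilities times the interval in years
    return float(np.sum(np.diff(years) * (u[:-1] + u[1:]) / 2.0))

# Made-up example: utility 0.71 at screening, 0.80 at 8-week follow-up
print(auc_qalys([0.71, 0.80], [0.0, 8.0]))  # ~0.116 QALYs accrued over 8 weeks
```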

Recovering Quality of Life-10

The Recovering Quality of Life-10 (ReQoL-10)65 is a preference-based health-related quality of life tool used to estimate quality of life for people with mental health conditions. It has been developed to be more sensitive than generic preference-based health-related quality of life tools when measuring differences in mental health outcomes. It contains 10 mental health items and one physical health item. Responses to the ReQoL-10 are captured at screening and 8-week follow-up, and will be converted into utilities using an algorithm derived from the UK general population. QALYs for each participant will be estimated by using estimates of utilities and the area-under-the-curve method.

iMTA Productivity Costs Questionnaire

The iMTA Productivity Cost Questionnaire66 is a self-completed questionnaire that measures health-related changes in productivity. Productivity losses related to mental ill-health will be measured across three domains, including absenteeism, presenteeism and unpaid work.

Use of Mental Health Care Services Questionnaire

The Use of Mental Health Care Services questionnaire was developed specifically for this study to collect information from trial participants on their use of services before and after the interventions. It surveys participants to collect information on the number of services used in the last 4 weeks across five domains, including hospital services, out-of-hospital services, online self-help services, community-based services and medicines.

Additional bespoke questionnaires

There are six additional study-specific questionnaires. These are provided in online supplemental file 1 and are described below. A demographics questionnaire will solicit information about gender, sex recorded at birth, sexual orientation, Aboriginal and Torres Strait Islander status, ethnic origin, and language most used at home. A study and employment questionnaire will ask at screening about international student status, academic performance (reported as current Weighted Average Mark or Grade Point Average) and employment in a paid job concurrently with studies. During the 2-week intervention period, participants allocated to an active treatment will be asked to complete an intervention-specific questionnaire asking how many minutes of, respectively, mindfulness, physical activity or sleep they achieved in the previous day. In addition, three items are included to assess whether participants had prior experience in using well-being strategies.

Postintervention, a within-study exposures questionnaire will ask participants to indicate possible disruptions over the past 2 weeks that may have affected their ability to complete or derive benefit from the intervention in four domains: life or routine disruption, negative impacts on mental health, positive impacts on mental health and problems using the study app. Participants will be asked to rate the extent to which they agree with each of these domains expressed as a statement (eg, ‘In the past 2 weeks, my life or routine was disrupted for some reason.’) using a 5-level Likert scale from 1 (‘strongly disagree’) to 5 (‘strongly agree’).

Finally, a user experience questionnaire, based on the System Usability Scale67 and the mHealth App Usability Questionnaire,68 will ask all participants about subjective perceptions on ease of use, usefulness, satisfaction and technology problems using the study app. For those allocated to an intervention, they will additionally be asked about trust and, separately, novelty of the informational content of the intervention, the extent to which they implemented intervention suggestions in the past 2 weeks and their intentions to do so in the future. As for the within-study exposures questionnaire, all items are expressed as statements (eg, ‘I found the app easy to use.’) and rated using the same 5-level Likert agreement scale.

Digital phenotyping

Passive sensor data will be sampled and collected by the app according to either a predetermined frequency or a change in user activity (eg, a change in location or from walking to running). Passive sensor data collection is contingent on the user granting permissions at both registration and first launch of the app. These permissions may be granted or revoked by the user at their discretion throughout the course of the study. The data comprise information about the physical movements (dynamic state) of smartphones in space, including data generated from inertial sensors (accelerometer and gyroscope) and from GPS sensors: travelled distance, geographical location, user activity (eg, walking, running, driving, as determined by the device operating system) and step count. These data points can be used to explore associations with changes in anxiety, depression and stress levels within individuals, and to predict DASS-21 scores at endpoint.
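One way such a sampling policy could be represented (purely illustrative; the study app's actual configuration is not published, and all values below are invented):

```python
# Illustrative sampling policy: each sensor is read either on a fixed
# schedule ("periodic") or when the OS reports a change ("on_change").
SAMPLING_POLICY = {
    "accelerometer": {"mode": "periodic", "interval_s": 60},
    "gyroscope":     {"mode": "periodic", "interval_s": 60},
    "gps":           {"mode": "on_change", "min_displacement_m": 100},
    "activity":      {"mode": "on_change"},  # eg, walking -> running transitions
    "step_count":    {"mode": "periodic", "interval_s": 3600},
}
```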

Participants can decline all phenotyping during the trial registration process, meaning the app will not record any such data during the trial.

Study procedure

We are planning to conduct up to 12 mini-trials. Recruitment into each mini-trial will open 1 week prior to the planned start date. Applicants to the study will read the electronic online information sheet and consent form and provide consent electronically during completion of the screening questionnaires on the Qualtrics survey platform. All eligible participants will be invited to install the study app via text message.

On installing the study app and registering via Time-based One-Time Password, participants will be prompted to complete baseline questionnaires and start daily EMA. All subsequent study procedures will be directed via the app. Ten days later, participants will be invited to complete the preintervention questionnaire battery and on completion, will be automatically allocated to one of the three interventions or the active control condition. App-generated prompts will guide participants on how to commence and subsequently undertake the interventions.

The intervention period will last 2 weeks. Outcomes will be measured immediately postintervention and again at an 8-week follow-up. After completion of the postintervention measures at 4 weeks, estimates of the intervention effect will be used to update the multiarm bandit algorithm in time to perform allocation for the following mini-trial. After the postintervention assessments, all study interventions will become available for participants to use for a maximum period of 8 weeks after the trial finishes.

The trial will continue until a significant difference can be ascertained between the AI algorithm-estimated effect sizes of the most effective and the second most effective interventions (after appropriate adjustment for repeated comparisons) within each severity cohort (mild, moderate, severe distress). If it is not possible to separate the intervention effects in this time, the trial will conclude when 12 mini-trials have been conducted. The rationale for establishing a significant difference between the most effective intervention and the rest is clear. The rationale for establishing which is the second-best intervention is based on our view that this additional information about effectiveness will be helpful clinically and theoretically. First, it offers a second line of treatment if a person is unable to undertake the first (eg, unable to undertake physical activity); second, it provides the opportunity to examine contextual factors that might differentially affect the two treatments. In typical non-adaptive trials, a series of planned or post hoc analyses are frequently used to compare mean differences between different intervention types and against the control group. Because we seek to detect a significant difference between the best intervention and the others, and between the second-best intervention and those remaining, the number of mini-trials initiated will be determined by how quickly the AI algorithm learns the differences in effect between interventions.

Randomisation/blinding

Computerised allocation will be performed automatically for participants who complete the baseline questionnaires. In the first mini-trial, an allocation ratio of 1:1:1:1 (mindfulness, physical activity, sleep hygiene, EMA control) will be used. Subsequent allocation ratios will be determined by the multiarm bandit algorithm. There will be no minimum per-arm allocation, meaning that poorly performing arms can be dropped. Participants’ group allocation will be revealed to them within the app after they complete the preintervention questionnaire battery (after the EMA period).

Participants and operational staff involved in day-to-day participant management will be unblinded because the nature of the interventions means that they cannot easily be concealed. All other investigators and trial staff will be blinded. Allocation concealment will be guaranteed by preventing blinded study staff from accessing the computer system holding randomisation information, and by breaking randomisation codes only once primary data analysis is complete (or at the request of the data safety monitoring board). Intervention allocation codes will be generated and retained automatically by the computer system performing allocations.

Multiarm bandit algorithm

For the Vibe Up trial, the specified optimisation problem is to identify, with the smallest number of mini-trials/participants, the best performing intervention arm; that is, to reach the point where we can reject the null hypothesis that there is no difference between the best performing intervention and the other three groups in the preintervention to postintervention change scores on the DASS-21 total. If this is successful within 12 mini-trials, the problem will be reformulated to try to establish the next best-performing intervention within the remaining trials. Assessment of whether the optimisation goals have been satisfied will be made offline as part of interim analyses conducted after mini-trials 4, 8 and 12 (outlined in detail below).

The bandit algorithm used in the trial will have the following technical properties. Intervention effects will be modelled using Gaussian process regression with a zero mean function and squared exponential kernel, with the baseline normalised DASS-21 score as the sole independent variable (capturing ‘context’) and the within-individual pre-post DASS-21 change score as the dependent variable, treating both as continuous and real valued. Change scores will be used here (but not for the main trial analyses) for consistency across models, by ensuring that a value of 0 implies no effect; this ‘contextual bandit’ regression approach also means that severity-contingent effects can be estimated and compared across the interventions. After mini-trial one, which will use a fixed allocation probability of 0.25 per arm, an upper confidence bound (UCB) acquisition function will be used to deterministically allocate participants to the modelled best-performing intervention given their baseline DASS-21 severity level. UCB uses a statistically rigorous scheme to balance two goals: exploitation (maximising allocation to apparently ‘good’ intervention arms) and exploration (collecting data about all arms to improve knowledge about their effectiveness).
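A condensed sketch of this model-and-allocate step, under stated assumptions (the kernel length-scale, noise term and exploration weight beta are illustrative choices, and the sign convention that larger change scores mean greater improvement is an assumption):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fit_arm_models(data):
    """data: dict mapping arm -> (x, y) from completed mini-trials, where
    x = baseline normalised DASS-21 scores and y = pre-post change scores."""
    models = {}
    for arm, (x, y) in data.items():
        # Zero-mean GP with squared exponential (RBF) kernel, as specified;
        # length_scale and the noise term alpha are illustrative values.
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1.0)
        gp.fit(np.asarray(x).reshape(-1, 1), np.asarray(y))
        models[arm] = gp
    return models

def ucb_allocate(models, baseline_score, beta=2.0):
    """Deterministically pick the arm with the highest upper confidence
    bound on improvement for this participant's baseline severity."""
    best_arm, best_ucb = None, -np.inf
    for arm, gp in models.items():
        mu, sd = gp.predict([[baseline_score]], return_std=True)
        ucb = mu[0] + beta * sd[0]  # exploitation (mu) + exploration (beta*sd)
        if ucb > best_ucb:
            best_arm, best_ucb = arm, ucb
    return best_arm

# e.g. models = fit_arm_models(data); next_arm = ucb_allocate(models, 0.6)
```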

Sample size

The trial will recruit at least 120 participants in each of up to 12 mini-trials. To allow for attrition between screening and mini-trial commencement, recruitment for each mini-trial will continue until at least 120 individuals have been screened eligible. Assuming that up to one-third of participants do not respond to the subsequent invitation to install the study app and complete baseline questionnaires, this will yield at least 80 participants starting the mini-trial (ie, 20 per arm, assuming the mini-trial 1 allocation ratio of 1:1:1:1). Attrition after baseline of up to 20% is allowed, resulting in expected completion of postintervention assessments of 64 (ie, approximately n=16 per arm).

Since adaptive trials are usually more sample-efficient than fixed trials, we used a fixed-trial design to calculate the total sample size, which is expected to serve as an upper bound for our adaptive trial. The calculation of the total sample size (aggregated over mini-trials) was based on a priori effect sizes for each intervention (obtained from the literature), a type-1 error rate of 0.05 and a power of 0.8 under conditions of multiple hypothesis testing.

Statistical analysis and data management

Analysis of primary outcome

For analysis, participants will be categorised into one of three clinically relevant severity groups according to their baseline DASS-21 total score (following the procedure outlined in the DASS manual): normal/mild symptoms, moderate symptoms, severe/extremely severe symptoms.69 This approach will allow exploration of whether the most effective intervention(s) differ according to severity.

Mixed model repeated measures analysis of variance70 will be used to compare the four groups. Analysis of the primary endpoint will be based on a modified intention-to-treat analysis strategy, under the assumption that missing data are missing at random: participants must have downloaded the app and completed both the baseline DASS-21 and preintervention DASS-21 assessments to be included in the intention-to-treat sample. An unconstrained variance–covariance matrix will model within-individual dependencies. Satterthwaite’s method71 will be used to adjust df. For each group, planned contrasts will compare the difference in self-reported psychological distress from preintervention to postintervention (or control period) as measured by the DASS-21 total score in each mini-trial. Any required transformation of scores to meet the distributional assumptions of the analysis will be undertaken, with results from transformed data forming the basis of judgements of statistical significance. The choice of transformation will be made on review of the data from the first mini-trial and will be used in all severity groups and all subsequent mini-trials. A significance level of 0.05 will apply to tests conducted in the mini-trials.
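As a rough sketch of fitting a model of this kind (a simplified random-intercept approximation on synthetic data; the protocol's unstructured covariance matrix and Satterthwaite adjustment would typically require dedicated mixed-model software, and the column names and effect sizes below are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 160  # synthetic participants, roughly 40 per arm
df = pd.DataFrame({
    "id": np.repeat(np.arange(n), 2),
    "group": np.repeat(rng.choice(["mindfulness", "activity", "sleep", "control"], n), 2),
    "time": np.tile(["pre", "post"], n),
})
# Invented outcome: active arms improve by ~6 points postintervention
df["dass21_total"] = rng.normal(25, 8, 2 * n) - np.where(
    (df["time"] == "post") & (df["group"] != "control"), 6, 0)

# Random intercept per participant models within-individual dependence;
# the group-by-time interaction terms estimate differential pre-post change.
model = smf.mixedlm("dass21_total ~ C(group) * C(time)", df, groups=df["id"])
fit = model.fit(reml=True)
print(fit.summary())
```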

Interim analyses

We will conduct interim analyses three times during the trial period, after mini-trials 4, 8 and 12, using the full available data in the intention-to-treat sample. These analyses will determine whether a particular intervention is more effective in reducing distress from preintervention to postintervention, relative to the other groups, within each clinical severity group (ie, normal/mild, moderate, severe/extremely severe). Once an intervention has been found to be the most effective within a severity group, it will be removed from the list of interventions available for recommendation by the optimisation algorithm for the remaining mini-trials. This allows the optimiser to focus its allocations for the rest of the mini-trials on the remaining interventions (or control group).

For example, if the interim analysis reveals that, for participants with severe distress, the physical activity intervention is more effective than the other three groups (mindfulness, sleep, EMA), it will be removed from the list of possible interventions to which participants with severe distress can be allocated in the remaining mini-trials. This will enable the optimiser to determine the next best intervention for people with severe distress from among the three remaining groups.

The intention of this approach is that, at a subsequent interim analysis, comparisons can be made to find the second most effective treatment within the severity group; the process is then repeated to identify the third most effective treatment, and so on. Although attempts will be made to rank the effectiveness of the four groups from most to least effective within each severity cohort, there is no guarantee that the full ranking will be complete by the end of the 12 mini-trials.
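A schematic of this sequential elimination, with the selection rule injected as a callable (the function names and structure are illustrative assumptions, not the study’s implementation):

```python
def rank_interventions(looks_data, look_alphas, find_best):
    """Sequentially remove confirmed winners across the interim looks.

    looks_data: per-look outcome data for one severity cohort.
    look_alphas: adjusted critical alpha available at each look.
    find_best(data, active, alpha): returns a significantly best arm, or None.
    """
    active = {"mindfulness", "physical_activity", "sleep_hygiene", "control"}
    ranking = []
    for data, alpha in zip(looks_data, look_alphas):
        winner = find_best(data, active, alpha)
        if winner is not None:
            ranking.append(winner)        # best, then second best, ...
            active.discard(winner)        # optimiser focuses on the rest
    return ranking  # may be incomplete after the 12 mini-trials
```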

Comparisons of multiple treatments at multiple time points can inflate the type 1 error (false positive) rate, which must be corrected for. To control type 1 errors arising from sequential hypothesis testing, an alpha spending function will be applied to distribute the type 1 error across the three interim analyses. Various spending functions exist, such as those of Pocock, O’Brien-Fleming, and Lan and DeMets.72 In our study, we will conduct the first hypothesis test about 33% of the way through the experimental period (after mini-trial 4), the second at about 66% (after mini-trial 8) and the third after the final mini-trial (100%). The information fraction for the alpha spending function is the fraction of participant data accrued so far, relative to the expected total number of participants over the experimental period. Appropriate adjustments to the critical alpha spending p value will be made to ensure the cumulative type 1 error is maintained at 0.05. In addition, to control the type 1 error inflation due to multiple hypothesis tests, we will apply the Benjamini-Hochberg correction to the critical p values.73 This means that the alpha spending p value is adjusted for each of the multiple tests. For example, at the first interim analysis there will be four treatments to compare within each cohort, giving six pairwise comparisons (conducted via t-tests); the alpha spending p value for this test is therefore adjusted according to the Benjamini-Hochberg method.
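The sketch below shows one way these two corrections could be combined, assuming for illustration an O’Brien-Fleming-type Lan-DeMets spending function (the protocol does not fix the choice) with looks at information fractions of roughly 1/3, 2/3 and 1:

```python
import numpy as np
from scipy.stats import norm

ALPHA = 0.05

def obf_spent(t):
    """O'Brien-Fleming-type alpha spent by information fraction t (Lan-DeMets)."""
    return 2.0 * (1.0 - norm.cdf(norm.ppf(1.0 - ALPHA / 2.0) / np.sqrt(t)))

fractions = [1 / 3, 2 / 3, 1.0]            # after mini-trials 4, 8 and 12
cumulative = [obf_spent(t) for t in fractions]
look_alphas = np.diff([0.0] + cumulative)  # incremental alpha per look

def benjamini_hochberg(p_values, alpha_look):
    """BH step-up over the m pairwise comparisons at one look.

    Returns a boolean array in the original order: True = significant.
    """
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= alpha_look * (np.arange(1, m + 1) / m)
    k = (np.nonzero(passed)[0].max() + 1) if passed.any() else 0
    significant = np.zeros(m, dtype=bool)
    significant[order[:k]] = True
    return significant

# Example: six pairwise t-test p values at the first look (illustrative)
print(benjamini_hochberg([0.0001, 0.0002, 0.01, 0.2, 0.4, 0.9], look_alphas[0]))
```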

At each interim analysis, a multiple hypothesis test will be conducted (separately for each severity group) to determine whether, among the currently active interventions, any is significantly better at improving the DASS-21 score than the other groups.

For a treatment to be removed from the list of available treatments for the severity group (and be deemed the most effective treatment option), it must emerge as significantly better (one-sided Welch’s t-test with Satterthwaite-adjusted df) in pairwise comparisons between it and every other active intervention or control group, within the cohort. Significance is specified as returning a p value less than the critical alpha spending p value (with the Benjamini-Hochberg p value adjustment).
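Under those rules, a minimal encoding of the removal criterion (names and the single adjusted threshold alpha_adj are illustrative; in practice the Benjamini-Hochberg thresholds vary by rank, as shown earlier):

```python
from scipy.stats import ttest_ind  # Welch's test via equal_var=False

def is_clear_winner(candidate, rivals, alpha_adj):
    """True if 'candidate' beats every rival in one-sided pairwise tests.

    candidate, rivals: arrays of pre-post DASS-21 reductions (higher = better)
    for one severity cohort; equal_var=False gives Welch's t-test, whose df
    are Satterthwaite adjusted.
    """
    for rival in rivals:
        result = ttest_ind(candidate, rival, equal_var=False,
                           alternative="greater")
        if result.pvalue >= alpha_adj:
            return False  # must significantly beat every active arm
    return True
```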

Additional analyses

Mixed effect logistic or Poisson regression models will be used to assess whether baseline psychological distress and suicidal thoughts/behaviour are associated with engagement with the EMA. Associations between time-varying responses about feelings and momentary affect, and changes in self-reported psychological distress and suicidal thoughts and behaviour, will also be explored using mixed effects regression models. The relationship between momentary affect, psychological distress, exercise and sleep quality at a given time interval will be assessed using mixed effect regression models. Descriptive analyses will be used to examine compliance with, and reactivity to, the EMA.

Machine learning will be used to analyse digital phenotyping data to: (1) explore whether any novel behavioural factors predict the study primary endpoint and (2) investigate within-individual behavioural signals that predict individual changes in self-reported distress or affect measured using EMA.

Raw data collected from sensors will first be preprocessed, and feature extraction techniques will then be applied. Some of these data, such as acceleration and angular acceleration, are high dimensional; signal processing techniques will therefore be used to extract low-level features. The data will then be investigated both separately and in conjunction with survey responses and EMA results to identify and select an optimal set of variables for machine learning algorithms. This will allow us to develop predictive models of participants’ mental health state(s) across various stages of the study.
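For instance, a window of triaxial accelerometer samples might be reduced to a handful of low-level features along these lines (an illustrative sketch; the actual feature set will be determined during analysis):

```python
import numpy as np

def accel_features(window):
    """Low-level features for one window of triaxial accelerometer data.

    window: array of shape (n_samples, 3) holding x, y, z acceleration.
    """
    magnitude = np.linalg.norm(window, axis=1)          # per-sample magnitude
    spectrum = np.abs(np.fft.rfft(magnitude - magnitude.mean()))
    return {
        "mean_magnitude": float(magnitude.mean()),
        "std_magnitude": float(magnitude.std()),
        "signal_energy": float((magnitude ** 2).mean()),
        "dominant_freq_bin": int(spectrum.argmax()),    # index, not Hz
    }
```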

Regarding the economic evaluation of the potential benefits of running AI-adaptive trials, the expected value of this methodology is the potential reduction in the probability of making a wrong funding decision (ie, funding an intervention that is not the most cost-effective) multiplied by the average consequence of being ‘wrong’ (ie, how resources might have been better allocated). This benefit is compared with the cost of the trial itself: if the expected benefit exceeds the expected cost, there is a net gain from using the AI-adaptive trial.
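Expressed compactly (the symbols below are introduced here for illustration and do not appear in the protocol):

\[ \text{Net gain} = \Delta p_{\text{wrong}} \times \bar{C}_{\text{wrong}} - C_{\text{trial}}, \]

where \(\Delta p_{\text{wrong}}\) is the reduction in the probability of a wrong funding decision, \(\bar{C}_{\text{wrong}}\) is the average consequence of being wrong and \(C_{\text{trial}}\) is the cost of the trial; a positive value implies a net gain from the AI-adaptive trial.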

A VoI analysis will be used to estimate whether the bandit-based adaptive trial represents better value than conducting a traditional RCT.74 The analysis will be conducted ex ante, to assess whether the AI-adaptive trial would be more valuable than an RCT before either is run, and ex post, to assess whether the AI-adaptive trial was more valuable than an RCT in terms of reducing the need for further research.

The primary challenge with this economic evaluation is that a traditional RCT is not being run alongside the AI-adaptive trial, so the uncertainty reduction in the intervention rankings and differences in trial costs cannot be directly compared. For the ex ante VoI, the mean and uncertainty (SEs) surrounding the estimates of QALYs and costs experienced with each intervention and trial design will be estimated using reported intervention effectiveness in the published literature, combined with estimates of the sample size required to conduct an RCT to show significant differences in the treatment effects between interventions. For the AI-adaptive trial ex post analysis, the mean and uncertainty surrounding the estimates of the mean QALYs and costs experienced with each intervention will be based on the results of the AI-adaptive trial. For the RCT ex post analysis, the expected mean and uncertainty surrounding the estimates of the mean QALYs and costs that would have been experienced using a traditional RCT will be estimated using data collected in the first mini-trial.

Costs will include those associated with developing the AI-adaptive trial algorithm (for the AI-adaptive trial arm only), analysis, participant recruitment, app dissemination, app maintenance and hosting, students’ productivity, students’ mental healthcare services use and the time of participants using the app.

The expected value of perfect information (EVPI), expected value of sample information (EVSI) and expected net benefit of sampling (ENBS) will be estimated using analytical methods.72 75 76 A willingness-to-pay threshold of US$50 000/QALY gained will be assumed.77 We will estimate the population size that may benefit from the app based on the inclusion criteria. The VoI analysis will be conducted over a time horizon corresponding to the assumed life expectancy of the intervention, based on estimates of how long the app can be used before it needs to be redeveloped.

If the EVPI is less than the potential cost of research, then there is no value in conducting new research and the VoI analysis can stop. If the EVPI exceeds the research costs, then the EVSI will be estimated and compared with the trial costs to compute the ENBS for the AI-adaptive trial and the hypothetical RCT. If the ENBS for a traditional RCT is estimated to be less than the ENBS for the AI-adaptive trial, the latter will have produced the larger net societal gain, providing supporting evidence for its use.78
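This decision flow can be summarised in a few lines (an illustrative encoding; the parameter names are assumptions, not study variables):

```python
def voi_decision(evpi, evsi_adaptive, evsi_rct, cost_adaptive, cost_rct):
    """Apply the VoI stopping and comparison logic described above."""
    if evpi <= min(cost_adaptive, cost_rct):
        return "stop: perfect information is worth less than research costs"
    enbs_adaptive = evsi_adaptive - cost_adaptive  # ENBS = EVSI - trial cost
    enbs_rct = evsi_rct - cost_rct
    if enbs_rct < enbs_adaptive:
        return "AI-adaptive trial: larger net societal gain"
    return "traditional RCT: larger net societal gain"
```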

Data management

All trial data will be collected electronically using online questionnaire software and the Vibe Up app, which will transfer collected data automatically to a cloud database. To avoid accruing data plan costs for participants, data will, by default, be transmitted to the research team only when each mobile device is connected to a Wi-Fi network. To ensure that no data are lost, the app will securely store collected data until they have been successfully transmitted to the server. After collection, all data will be transferred on a scheduled basis by the research team to a secure network drive for storage and backup. A combination of technical and procedural access controls, documented in a research data management plan, will be used to restrict access to data and to specify the purposes for which they can be used. Participant-identifiable details, such as contact information, will be held separately from other trial data. Identifiable information necessary for the administration of the study and/or participant safety follow-up, such as contact details and participant follow-up records, will be accessible only to named members of the research teams involved in participant administration/safety responses. Identifiable information will not be used for any study analysis. On completion of this project, all information will be retained for 15 years. Procedures for data archival and destruction will follow the then-current UNSW Sydney procedures and the Australian Code for the Responsible Conduct of Research.79

Ethics, oversight and dissemination

Ethical approval was sought and obtained from the UNSW Sydney Human Research Ethics Committee (HREC A, HC200466). The trial Sponsor is UNSW Sydney. The approved protocol adheres to the Guideline for Good Clinical Practice.17 Please refer to online supplemental file 2 for the Participant Information Statement and Consent Form (PISCF). The trial is registered with the Australian New Zealand Clinical Trials Registry (ACTRN12621001223820). Any protocol amendments will be subject to approval by the ethics committee and will be recorded in the ANZCTR registry. Annual reports of study progress will be submitted to the HREC and Sponsor.

The study design and intervention materials were developed in conjunction with people with lived experience of anxiety and depression. Study oversight includes a stakeholder and advisory board comprising psychiatrists, clinicians, lived experience leaders, researchers, data scientists and service providers.

A participant safety response procedure was established to address potential psychological safety risks, including elevated suicidal ideation at initial screening or other unprompted disclosures (eg, to the study email account) of significant distress. Under the protocol, risk disclosures will be managed by offering a phone call with a study clinician and a range of self-referral options, such as consulting a general practitioner and links to crisis support services. Oversight of any psychological safety events, any other adverse events and progress towards trial outcomes will be provided by an independent data safety monitoring board. The board will meet regularly throughout the trial and is specifically required, if necessary, to make recommendations to the sponsor on whether to continue, modify or stop the trial.

Access to the full study protocol and associated written procedures is available on reasonable request. Details of the specific implementation of the bandit algorithm will be available on request once the study is complete. Access to participant-level data will be subject to the governance procedures of a planned data repository, accessible to researchers and non-commercial users, which will contain the data arising from this study. Study source code, including that of the study app and trial administration platform, is not publicly available.

Study findings will be disseminated principally via peer-reviewed publications in scientific journals and via academic conference presentations. Information materials and a dissemination programme will be developed to share learnings around bandit-based response adaptive randomisation with potential clinical research users. Data used in academic outputs and training materials will be in aggregate form only. There are no funding-related restrictions on how study information can or will be disseminated.

Ethics statements

Patient consent for publication

Acknowledgments

We would like to thank Adryon Joubert-de Villiers, Jenny Vuu, Caroline Fitzgerald, Matthew Slarke, Marya Bautista, Vera Kravchuk, Matt Lee, Dean Winder and Sandrine Chevassu for their significant contributions to the design and development of the study app and digital intervention content. We are also very grateful to Debbie Agnew and Zoe Jenkins for their advice on social media recruitment strategy and content, and to David Jung for helping secure data governance approvals. We would also like to thank the university student organisations and counselling services for their feedback on design and intervention content.

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Twitter @alizaws

  • Contributors SV, KH and HCh conceived the study and secured funding with assistance from JH, JRB, AW-S, JMN, RV, KM, SG, SRa, TT, TQ and HCu. KH led development of the study protocol and wrote the first draft of this manuscript. ES, WYZ, JH, JMN, AS, JR, AW, LH, RL, TQ and SG played key roles in the design and development of study processes. ES developed the intervention designs and content with support from MB, JRB, JH, SRo and JMN. JH devised the ecological momentary assessment study component. LH led development of the technical platform, with support from RL, SB, RV, SC, SK, AA, JFK, JA, DW, AG and KM. SG and MS led development and implementation of the bandit algorithm, with support from SRa, SV, TT and AM. HCu, BP and AA-C specified the economic analyses. All authors provided critical input throughout the development of the protocol. All authors reviewed the manuscript and provided comments. All authors approved the manuscript prior to publication.

  • Funding This work was supported by Commonwealth of Australia Medical Research Future Fund grant (MRFAI000028) Optimising treatments in mental health using AI. HCh is funded by an NHMRC Senior Principal Research Fellowship (GNT1155614). JMN is funded by an NHMRC Investigator Grant (GNT2008839). AW is funded by an NHMRC Investigator Grant (GNT2017521).

  • Disclaimer The funding body had no role in any aspect of the study design or this manuscript.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.