Original research
Systematic review of clinician-directed nudges in healthcare contexts
  1. Briana S Last1,
  2. Alison M Buttenheim2,3,4,
  3. Carter E Timon5,
  4. Nandita Mitra6,
  5. Rinad S Beidas3,4,7,8,9
  1. Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  2. Department of Family and Community Health, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
  3. Center for Health Incentives and Behavioral Economics (CHIBE), University of Pennsylvania, Philadelphia, Pennsylvania, USA
  4. Penn Implementation Science Center at the Leonard Davis Institute of Health Economics (PISCE@LDI), University of Pennsylvania, Philadelphia, Pennsylvania, USA
  5. College of Liberal and Professional Studies, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  6. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
  7. Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
  8. Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
  9. Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
Correspondence to Ms Briana S Last; brishiri{at}sas.upenn.edu

Abstract

Objective Nudges are interventions that alter the way options are presented, enabling individuals to more easily select the best option. Health systems and researchers have tested nudges to shape clinician decision-making with the aim of improving healthcare service delivery. We aimed to systematically study the use and effectiveness of nudges designed to improve clinicians’ decisions in healthcare settings.

Design A systematic review was conducted to collect and consolidate results from studies testing nudges and to determine whether nudges directed at improving clinical decisions in healthcare settings across clinician types were effective. We systematically searched seven databases (EBSCO MegaFILE, EconLit, Embase, PsycINFO, PubMed, Scopus and Web of Science) and used a snowball sampling technique to identify peer-reviewed published studies available between 1 January 1984 and 22 April 2020. Eligible studies were critically appraised and narratively synthesised. We categorised nudges according to a taxonomy derived from the Nuffield Council on Bioethics. Included studies were appraised using the Cochrane Risk of Bias Assessment Tool.

Results We screened 3608 studies and 39 studies met our criteria. The majority of the studies (90%) were conducted in the USA and 36% were randomised controlled trials. The most commonly studied nudge intervention (46%) framed information for clinicians, often through peer comparison feedback. Nudges that guided clinical decisions through default options or by enabling choice were also frequently studied (31%). Information framing, default and enabling choice nudges showed promise, whereas the effectiveness of other nudge types was mixed. Given the inclusion of non-experimental designs, only a small portion of studies were at minimal risk of bias (33%) across all Cochrane criteria.

Conclusions Nudges that frame information, change default options or enable choice are frequently studied and show promise in improving clinical decision-making. Future work should examine how nudges compare to non-nudge interventions (eg, policy interventions) in improving healthcare.

  • quality in health care
  • protocols & guidelines
  • health & safety
  • health economics

Data availability statement

Data sharing not applicable as no data sets generated for this study. Given the nature of systematic reviews, the data set generated and analysed for the current study is already available. All studies analysed for the present review are referenced for readers.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • This systematic review synthesises the growing research applying nudges in healthcare contexts to improve clinical decision-making.

  • The review uses both systematic search strategies and a snowball sampling approach, the latter of which is useful for identifying relatively novel literature.

  • Meta-analysis was not possible due to heterogeneity in methods and outcomes.

  • The systematic review was not designed to synthesise research wherein study authors did not identify the intervention as a nudge.

Rationale

Research from economics, cognitive science and social psychology has converged on the finding that human rationality is ‘bounded’.1 The intractability of certain decision problems, constraints on human cognition and scarcity of time and resources lead individuals to employ mental shortcuts to make decisions. These mental shortcuts, often called heuristics, are strategies that overlook certain information in a problem with the goal of making decisions more quickly than more deliberative methods.2 While heuristics can often be more accurate than more complex mental strategies, they can also lead to errors and suboptimal decisions.2 3 Researchers have discovered interventions to harness the predictable ways in which human judgement is biased to improve decisions. These interventions, known as ‘nudges,’ reshape the ‘choice architecture,’ or the way options are presented to decision-makers, to optimise choices.4 Nudges have been applied to retirement savings, organ donation, consumer health and wellness, and climate catastrophe mitigation, demonstrating robust effects.5–8

As with retirement savings and dietary choices, clinical decision-making—clinicians’ process of determining the best strategy to prevent and intervene on clinical matters—is complex and error-prone. Clinicians often use heuristics when making diagnostic and treatment decisions.9–11 For example, clinicians are influenced by whether treatment outcomes are framed as losses or gains (eg, doctors prefer a riskier treatment when the outcome is framed in terms of lives lost rather than lives saved).12 Heuristics can lead to medical errors.13 In the face of complex medical decisions, clinicians tend to choose the default treatment option (despite clinical guidelines) or conduct clinical examinations that confirm their prior beliefs.14 15

Choice architecture influences clinicians’ behaviour regardless of whether clinicians are conscious of it, creating opportunities for nudges.16 Clinical decisions are increasingly made within digital environments such as electronic health record (EHR) systems.17 More than 90% of US hospitals now use an EHR.18 19 Researchers have explored the potential to use these ubiquitous electronic support systems to shape clinical decisions through nudges. They have subtly modified the EHR choice architecture by changing the default options for opioid prescription quantities or by requiring physicians to provide free-text justifications for antibiotic prescriptions.16 Even when nudges are not implemented in the EHR, researchers extract aggregate data from the EHR, suggesting its increasing role in the study of clinical decision-making.20

As health systems and researchers have embraced nudges in recent years, there is growing interest in understanding which nudges are most effective at improving clinical decision-making. Taxonomising nudges is advantageous because many nudges explicitly target heuristics, revealing the mechanism of behaviour change.21 If nudges that leverage people’s tendency to adhere to social norms are consistently more effective than nudges that harness clinicians’ default bias, then future nudges can be designed with this insight. Two systematic reviews were recently conducted to evaluate the effectiveness of healthcare nudges. Though both reviews demonstrate promise for the effectiveness of nudges, they offer somewhat conflicting evidence on the most studied and most effective nudge types, suggesting that an additional review may be useful.22 23 Our review offers complementary and non-overlapping insights on the study of nudges in healthcare settings for the following reasons: (1) we do not exclusively study physicians as our target population,23 but instead include all healthcare workers; and (2) we do not restrict our research to randomised controlled trials (RCTs) reported in the Cochrane Library of systematic reviews.22

Our review also makes use of a nudge taxonomy derived from the widely cited Nuffield Council on Bioethics intervention ladder wherein interventions increase in potency and constrain choice with each new rung.24 25 Interventions on the bottom of the ladder tend to be more passive, offering decision-makers information and reminders. Interventions in the middle of the ladder leverage psychological insights to motivate decision-makers either through social influence or by encouraging planning. At the top of the ladder, interventions are more assertive, either reducing decisions to a limited set of choices or creating default options. The nudge ladder categorises nudges by the psychological mechanisms by which they operate and the degree to which they maintain autonomy, and it has the additional advantage of aligning with existing public health and quality improvement literature that makes use of the Nuffield Council ladder.4 26 The nudge ladder offers insights on the heuristics most relevant to the clinical decision-making process and can support health systems in selecting and applying nudges to improve clinical decision-making.

Objective

We systematically evaluated nudge interventions directed at clinicians in healthcare settings to determine the types of nudges that are most studied and most effective in improving clinical decision-making compared with other nudges, non-nudge interventions or usual care. All quantitative study designs were included in our review.

Methods

Protocol and registration

Before initiating this review, we searched the international database PROSPERO to avoid duplication. After establishing that no such review was underway, we prospectively registered our review (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=123349).

Eligibility criteria

Types of participants

We included only empirical studies published in peer-reviewed journals studying nudges directed at clinicians working in healthcare settings. Clinicians were defined as workers who provide healthcare to patients in a hospital, skilled nursing facility or clinic. Examples of clinicians include physicians, nurses, medical assistants, physician assistants, clinical psychologists, clinical social workers and lay health workers. Studies that exclusively nudged patients were not included.

Types of intervention

Nudges were defined as ‘any aspect of the choice architecture that alters people’s behaviour in a predictable way without forbidding any options or significantly changing their economic incentives’.4 Alterations to choice architecture included changes to the information provided to the clinician (eg, translating information, displaying information, presenting social benchmarks), altering the decision structure of the provider (eg, modifying default options, changing choice-related effort, changing the number or types of options or changing decision consequences) and providing decision aids (eg, offering reminders or commitment devices).27 The study authors did not need to identify the intervention as a nudge for the study to be considered for inclusion; however, given that the systematic search string includes several behavioural economics terms (see online supplemental appendix 1), studies that did not self-identify as behavioural economic interventions were unlikely to be included.

Interventions that required sustained education or training were not considered nudges. No options could be forbidden and there could be no financial incentives.28 Though some financial incentives for clinicians may be considered nudges, most studies on financial incentives for clinicians involve significant compensation or ‘pay for performance’—of which there is already an existing literature.29

Nudges guided clinicians to make improved clinical decisions, including (but not limited to) increasing the uptake of evidence-based practices (EBPs), adherence to health system or policy guidelines and reducing healthcare service costs. EBPs refer to clinical techniques and interventions that integrate the best available research evidence, clinical expertise and patient preferences and characteristics.30 Study authors had to provide the evidentiary rationale for the nudge.

We did not include studies that analysed the sustainability of nudges in the same study setting and/or sample of providers. In order to analyse studies with independent samples, we included the primary paper and not follow-up papers.

Types of studies

All study designs were included that had a control or baseline comparator—the control or baseline could be usual care or another intervention (nudge or non-nudge). For studies with parallel intervention groups, we did not require that allocation of interventions be randomised (ie, quasi-experimental studies were included). Exclusively qualitative studies were not included. See table 1 for eligibility criteria.

Table 1

Eligibility criteria

Search

Snowball sampling

The initial search strategy was based on a snowball sampling method31 using the references from a published commentary on the uses of nudges in healthcare contexts.16 Reviews identified during the preliminary stage of the systematic search process were also used to snowball articles, though these largely resulted in duplicates. Articles were reviewed at the title level to immediately identify those to be excluded. Those tentatively included were reviewed at the abstract level, followed by the full text for those meeting criteria. Following completion of screening of records retrieved via snowball, a systematic search of several databases was completed.

The methodology for the search was designed based on standards for systematic reviews,32 in consultation with a medical librarian, as well as with two experts from the field of healthcare behavioural economics. The databases used were: EconLit, Embase, EBSCO MegaFILE, PsycINFO, PubMed, Scopus and Web of Science.

Search terms included combinations, plurals and various conjugations of the words relating to identified nudge interventions. The search string and strategy from reference 6 were used as a basis for search terms, but adjusted to reflect our research question (see table 1). All peer-reviewed empirical studies published prior to the completion of our search phase (ie, April 2020) were eligible for this review. See online supplemental appendix 1 for the search strings.

Data collection process

Following retrieval of all records, duplicates were removed using Zotero (www.zotero.org) and via manual inspection. Article screening involved two stages. First, all records were screened at the title and abstract level by a team of four coders (BSL, CT and two research assistants) using the web-based application for systematic reviews, Rayyan (https://rayyan.qcri.org). Criteria in this first-pass screening were inclusive—that is, all interventions directed at clinicians were included. To establish reliability, the coders screened the same 20 articles and then reviewed their screening decisions together. Any disagreements were resolved by consensus. This process was repeated three additional times until 80 articles were screened by all four coders and sufficient reliability was established. Reliability was excellent (Fleiss’ κ=0.96). For the remainder of the screening process, screening was done independently by all four coders; the team met weekly to discuss edge cases. This screening process was followed by a full-text examination to determine eligibility according to more stringent inclusion and exclusion criteria (see table 1). This screening process was done as a team and determinations of article inclusion were decided by consensus.
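The inter-rater reliability statistic reported above, Fleiss’ κ, can be illustrated with a minimal sketch. The screening decisions below are hypothetical (they are not the review’s data), and the function name `fleiss_kappa` and the labels `include`/`exclude` are assumptions made only for this example. The sketch assumes every article is rated by the same number of coders.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the label each rater gave item i.
    categories: list of all possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][j] = number of raters who assigned item i to category j
    counts = []
    for item in ratings:
        tally = Counter(item)
        counts.append([tally[c] for c in categories])

    # Observed agreement for each item, then averaged over items
    per_item = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Agreement expected by chance, from the overall category proportions
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical screening decisions: 10 articles, 4 coders each (not the review's data)
hypothetical_ratings = (
    [["include"] * 4] * 5            # five articles unanimously included
    + [["exclude"] * 4] * 4          # four articles unanimously excluded
    + [["include"] * 3 + ["exclude"]]  # one article with a single dissenting coder
)
kappa = fleiss_kappa(hypothetical_ratings, ["include", "exclude"])
print(f"Fleiss' kappa = {kappa:.2f}")
```

With these hypothetical decisions, a single disagreement among 10 articles gives a κ of roughly 0.90, which shows how a value such as 0.96 corresponds to near-unanimous screening decisions.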

Patient and public involvement

Patients and the public were not involved in the design, conduct or reporting of this research.

Data items

Study characteristics and outcomes were extracted and tabulated systematically per recommendations for systematic reviews.32 These data included: (1) study characteristics—author names, healthcare setting, study design, country, date of publication, details of the intervention, justification for the nudge, sample size, primary outcomes, main findings and whether the effect was statistically significant; (2) nudge type; and (3) risk of bias assessment.

BSL and RSB trained the coding team (four Master’s students in a Behavioural and Decision Sciences programme) in data extraction. The team coded articles (n=16) together to ensure consensus. RSB reviewed a random sample (n=5) of the final articles to ensure reliability with systematic review reporting standards. BSL subsequently coded the remaining articles (n=18).

Outcomes

We only included studies that included objective measures of clinician behaviour in real healthcare contexts. Studies that measured clinicians’ choices in vignette or simulation studies were not included. Results could be presented as either continuous (eg, number of opioid pills prescribed) or binary (eg, whether physicians ordered influenza vaccinations). Outcomes were measured either directly (eg, antibiotic prescribing rates) or indirectly (eg, using cost to estimate changes in antibiotic prescriptions). Participants could not report on their own behaviour because clinicians’ self-report can be inaccurate.33 Both absolute measurements and change relative to baseline were accepted.

Risk of bias in individual studies

We evaluated whether the studies included in the systematic review were at risk of bias, using the Cochrane Risk of Bias Tool.32 34 BSL trained CT and they assessed articles (n=2) together to ensure consensus. CT independently coded articles (n=12) and BSL coded the remaining articles (n=27). The team met weekly to discuss articles that they were uncertain about and resolved discrepancies by consensus.

Data synthesis

In order to examine which types of nudges were most studied and most effective, we calculated the number and percentage of studies using each nudge intervention according to the nudge ladder (see figure 1). We reported the effect and statistical significance of the effect when a primary outcome was clearly identified in the study. If no primary outcome was identified by study authors, we determined a primary outcome based on the main research question. For studies that reported multicomponent nudges—ie, interventions that combine several nudges together—we reported the total effect of the intervention. For multicomponent nudge interventions, we coded them according to the nudge ladder with all of the nudge types that apply. For studies with multiple nudge treatment groups, we reported the effect of each treatment arm separately. Only nudge interventions were compared with the control arms.

Figure 1

Ladder of nudge interventions. Note, ladder adapted from 24 25. ED, emergency department.

Due to the differences in the exposure, behavioural outcomes and study designs, interventions could not be directly compared with one another quantitatively using effect sizes.35 Hence, meta-analysis of nudge effects was infeasible. To synthesise the results, we used a vote counting method based on the direction and significance of the effect for each study; caution when interpreting results based on statistical significance is warranted.32 If a simple majority of nudges were significant in a nudge category, the category was deemed effective.

Results

Study selection

The systematic database search identified 3586 entries, which were combined with another 22 articles of interest identified by the snowball sampling method, totalling 3608 articles (see online supplemental appendix 1 for yield). After deduplication of records from the respective databases and snowball sampling techniques, 2486 article records remained. All 2486 articles were retrievable and were screened in the first stage of title and abstract screening, which reduced the total number of full-text screens to 133 unique articles. Of the 133 articles that were full-text screened, 39 articles20 36–73 met inclusion criteria and the data from these were extracted and evaluated in this review (see Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagram in figure 2).

Figure 2

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram.
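The record counts in the study selection paragraph can be tallied stage by stage. The following is a minimal sketch: the stage counts come from the text, whereas the number of records dropped at each step is derived here by subtraction and is not reported in the article. The variable names are assumptions for illustration.

```python
# Stage counts as reported in the study selection paragraph
flow = [
    ("records identified (database search + snowball)", 3586 + 22),
    ("records after deduplication", 2486),
    ("articles screened at full text", 133),
    ("articles included in the review", 39),
]

# Records dropped between consecutive stages (derived by subtraction, not reported)
removed = [earlier[1] - later[1] for earlier, later in zip(flow, flow[1:])]

for (label, count), dropped in zip(flow[1:], removed):
    print(f"{label}: {count} (dropped at this step: {dropped})")
```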

Study characteristics

The characteristics of the included studies are summarised in table 2. The majority (n=35, 90%) of studies were conducted in the USA; two (5%) were conducted in the UK, one (3%) in Belgium and one (3%) in Switzerland. Studies were set in a variety of healthcare contexts (eg, outpatient clinics, primary care practices, emergency departments) and targeted a variety of clinical decisions (eg, opioid prescriptions, preventative cancer screening, checking vital signs of hospitalised patients). Nudges were directed at a variety of medical professionals (including physicians, nurses, medical assistants and providers with a license to prescribe medication). Many (n=20, 51%) of the studies did not report the sample size of clinicians interacting with the nudges. Instead, the studies tended to report the sample size in terms of how many patients were affected by the nudge or the number of prescription or laboratory orders under study. Fourteen (36%) studies were RCTs; 23 studies (59%) were pre–post designs; one study (3%) was a controlled interrupted time series design; and one study (3%) was a quasi-experimental randomised design. In terms of cluster RCTs, four studies (10%) were parallel cluster RCTs and three studies were stepped wedge cluster RCTs (8%). Most studies (n=32, 82%) employed a control group/comparator that consisted of usual care or no intervention. One study (3%) used a minimal educational intervention, another study (3%) examining peer comparison letters used a placebo letter and five studies (13%) employed a factorial design in which multiple combined interventions were tested against individual interventions separately.

Table 2

Study characteristics

Across the 39 studies included in the review, 48 nudges were tested. Some studies contained multiple substudies, study arms or treatment groups, which were coded and analysed separately (see table 3). Given that some interventions (n=5) were multicomponent (ie, combinations of multiple nudges), these studies were analysed separately using the nudge ladder (see table 4).

Table 3

Studies organised according to nudge ladder

Table 4

Multicomponent intervention studies organised according to nudge ladder

Analysing the single component nudges using the nudge ladder, six nudges involved guiding choice through default options (eg, changing the default opioid prescription quantity in the EHR); nine nudges involved enabling choice (eg, electronic prompts to accept or cancel orders for influenza vaccination); 22 nudges involved framing information (eg, peer comparison letters to the clinicians in the top 50th percentile of antipsychotic prescriptions); two nudges involved prompting implementation commitments (eg, displaying clinicians’ pre-commitment letters in their own examination rooms) and four nudges involved providing information (eg, an EHR reminder to clinicians when their patients were due for immunisations). Five studies involved multicomponent nudges, with four studies involving a combination of two nudges and one study involving a combination of three nudges (see table 4).

Risk of bias of included studies

Most studies were at high risk for selection bias including random sequence generation (n=25) and allocation concealment (n=25). Attrition bias was low risk based on incomplete outcome data (n=31). A large number of trials were judged as unclear for selective reporting (n=21). In terms of blinding of participants, most studies were high risk (n=25) and in terms of blinding outcome assessment, 25 studies were judged as having unclear risk of bias. Overall, 13 studies (33%) were considered low risk of bias across all criteria (see table 5).

Table 5

Cochrane risk of bias assessment tool

Synthesis of results

With significance defined as p<0.05, 33 of the 48 nudges (69%) significantly improved clinical decisions, suggesting that nudges are generally effective. According to the nudge ladder, all six (100%) of the nudges that involved changing the default option to guide decision-making were significantly related to clinician behaviour change in the hypothesised direction. Seven of the nine (78%) nudges that enabled choice led to significant change in clinician behaviour. Fourteen of the 22 (64%) nudges that involved framing information changed behaviour significantly, suggesting their effectiveness. One of the two (50%) nudges that prompted implementation commitments was significantly effective and the other was not. None of the four (0%) nudges that provided information to clinicians resulted in statistically significant results. The five studies (100%) that combined nudges in multicomponent interventions all led to statistically significant changes in the hypothesised direction.

Guiding choice through default options or enabling choice through an ‘active opt-out’ model (ie, active choice) were the most effective interventions in changing clinician behaviour. These nudges also tended to result in the largest effect sizes. Nudges that framed information, the plurality of nudges under study, tended to also change clinician behaviour. The other types of nudges were inconclusive or had more non-significant findings than significant findings. Given that it was infeasible to conduct a meta-analysis to statistically compare the nudge effects and vote counting is subject to several methodological issues, findings should not be viewed as definitive.
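The vote counting rule from the data synthesis section (a category is deemed effective if a simple majority of its nudges were significant) can be sketched as a short tally. The per-category counts are those reported in the synthesis of results; the dictionary keys and the function name `vote_count` are labels assumed for this illustration. The sketch is not the authors’ analysis code.

```python
# (significant nudges, total nudges) per category, as reported in the synthesis of results
category_counts = {
    "default options": (6, 6),
    "enabling choice": (7, 9),
    "framing information": (14, 22),
    "implementation commitments": (1, 2),
    "providing information": (0, 4),
    "multicomponent": (5, 5),
}


def vote_count(significant, total):
    """Return the share of significant nudges and whether a simple majority was reached."""
    share = significant / total
    # Simple majority means strictly more than half, so 1 of 2 does not qualify
    deemed_effective = 2 * significant > total
    return share, deemed_effective


for category, (significant, total) in category_counts.items():
    share, effective = vote_count(significant, total)
    verdict = "effective" if effective else "not deemed effective"
    print(f"{category}: {significant}/{total} ({share:.0%}) -> {verdict}")
```

Under this rule, a category split evenly (one of two nudges significant) does not reach a simple majority, which matches the description of implementation commitment nudges as inconclusive.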

Discussion

Summary of evidence

This systematic review of 39 studies found that a variety of nudge interventions have been tested to improve clinical decisions. Thirty-three of the 48 (69%) clinician-directed nudges significantly improved clinical practice in the hypothesised direction. Nudges that changed default options or enabled choice were the most effective and nudges framing information for clinicians were also largely effective. Conversely, nudges that provided information to the clinician through reminders and nudges that prompted implementation commitments did not conclusively lead to significant changes in clinician behaviour.

One strength of the taxonomy organising this review is the ability to explicate why certain nudges are more effective and the mechanism by which they operate. Drawing on the nudge ladder, evidence suggests that less potent healthcare nudges lower on the ladder such as providing information and prompting commitments may be less effective than more potent nudges that are higher on the ladder such as changing the default options. This accords with nudge research in areas outside of healthcare.74 For example, one study comparing various types of nudges that increase the salience of information (eg, providing reminders, leveraging social norms and framing information) with defaults found that only default nudges were effective at changing consumer pro-environmental behaviour.8 One large RCT of calorie labelling in restaurants found that posting caloric benchmarks (an informational nudge) paradoxically increased caloric intake for consumers.75

The theoretical reasons why less potent nudges (ie, nudges at the bottom of the nudge ladder) often fail are well established. People have a limited capacity to process information, so providing more data to decision-makers can be distracting or cognitively loading.76 The timing of information is also essential: information is beneficial if it is top-of-mind during the decision.77 Some of the social comparison nudges in this review provided information at opportune times, whereas others did not.43 Additionally, information improves decisions only if existing heuristics encourage errors. Often the information individuals receive may not be new to them. Worse still, informational nudges can have negative unintended consequences. For example, alert fatigue occurs when clinicians are so inundated by alerts that they become desensitised and either miss or postpone their responses to them.78 Finally, reminders and information frames are often insufficiently descriptive of the course of action they suggest, rendering them futile. Given how much of clinicians’ time is spent with the EHR, health system decision supports must be effective and not self-undermining.

More potent nudges (ie, nudges at the top of the nudge ladder) are successful because they act on several key heuristics.79 Defaults leverage inertia wherein overriding the default requires an active decision.80 When people are busy and their attention scarce, they tend to rely on the status quo.81 Moreover, people often see the default option as signaling an injunctive norm.82 They see the default choice as the recommended choice and do not want to actively override this option unless they are very confident in their private decision. It is not surprising that our study found that defaults were effective. It is also not surprising that nudges leveraging peer comparison tended to also be effective at shaping clinician behaviour: clinicians who received messages that their behaviour was abnormal compared with their peers received a signal that helped them update their behaviour.

Overall, results align with the conclusions of one23 of the two recent systematic reviews of nudges tested in healthcare settings.22 23 Differences in findings may be explained by different search strategies. One of these systematic reviews exclusively searched RCTs included in the Cochrane Library of systematic reviews and found that priming nudges (nudges that provide cues to participants) were the most studied and most effective nudges.22 In that review, priming encompassed heterogeneous interventions spanning cues that elude conscious awareness, audit-and-feedback and clinician reminders, to name a few, which may account for why study authors found those nudges to be the most numerous. The findings from our review conform with the results of the more traditional systematic review, conducted using a systematic search of several databases.23 The latter review, like this one, found that default and social comparison nudges were the most frequently studied and most effective nudges. However, study authors focused their review on physician behaviour, and our review is more expansive by studying all healthcare workers.

Limitations

Many of the studies in this review included at least some education (ie, a non-nudge intervention) such as a reminder of the clinical guidelines. Because many studies (59%) were pre–post designs, they could not use these brief trainings in a control arm to evaluate the independent effect of the nudge. Therefore, we cannot decisively conclude whether nudges alone are responsible for the changes in clinician behaviour. Similarly, many of the studies (51%) did not report the number of clinicians involved in the study (often reporting the sample in terms of how many patients or laboratory orders were affected by the nudge). Though unlikely, many of the effects could presumably be driven by a small portion of clinicians.

There was considerable variability in how researchers operationalised their primary outcome of interest. The effect of nudges may be contingent on the behaviour under study. One study71 examining changes in opioid prescriptions found a change in the number of 15-pill prescriptions (ie, the change in ‘default’ orders) but not in the total quantity of opioid pills prescribed, whereas other studies found changes in the total number of opioid pills ordered after an EHR default change. Establishing common metrics would enable direct comparison across studies and would allow us to determine more conclusively whether nudges were effective overall at improving clinical decisions.

The considerable number of included papers reporting a statistically non-significant result decreases the usual concern over publication bias, which would skew the results towards desirable and more statistically significant outcomes. The majority of studies (n=21, 54%) were at unclear risk of selective reporting of outcomes (see table 5). Moving forward, the field would benefit from reporting of all experimentation, whether its results are successful, unsuccessful, significant or non-significant. Though not a majority, a large portion of studies (n=12, 31%) were conducted by the same research team in the same health system. To validate that clinician-directed nudges are effective in other settings, other researchers should conduct nudge studies.

Though the nudge taxonomy used in the current review offered a way to classify the nudges described in the studies included, it was not developed empirically. The nudge ladder was developed based on a theoretical understanding of nudge interventions. It is important to understand whether the conceptual distinctions made between nudge types are scientifically reliable and valid.

Future research

Behavioural economics recognises that nudges are ‘implicit social interactions’ between the decision-maker and the choice architect.83 When faced with a nudge, people evaluate the motivations and values of the choice architect as well as how their decision will be understood by the choice architect and others. People tend to adhere to the default option when the choice architect is trusted, well-intentioned and expert. Several non-healthcare default studies backfired when consumers distrusted the choice architect or felt they were nudged to spend more money.84 Clinicians may reject nudges when they perceive health systems’ preferences to conflict with their patients’ interests. Research using qualitative methods and surveys should attend to how engaged clinicians are in the implementation process and how they make inferences about the motivations and values of the choice architect when interacting with nudges.

Nudges are also dependent on how decision-makers believe they will be perceived. For example, around 40% of adults seeking care for upper respiratory tract infections want antibiotics and general practitioners report that patient expectations are a major reason for prescribing antibiotics.85 86 Nudges that attempt to curtail antibiotic prescribing behaviour may shape clinicians’ behaviours in unexpected ways given clinicians’ desire to demonstrate to their patients that they are taking serious action. Subtle features of how nudges are implemented may also influence clinicians’ perceptions of the choice architect, heighten awareness of how their own actions may be perceived and may undermine the nudge. Investigations of clinicians’ choice environments and clinicians’ perspectives using qualitative and survey methods are crucial to the success of nudges.

Future research should also explore how clinician-directed nudges interact with one another in clinicians’ choice environments. In our review, all multicomponent nudge studies (n=5) were effective. However, it is also possible that nudges may crowd each other out when several different clinical decisions are targeted. In addition to alert fatigue, clinicians may experience nudge fatigue and begin to ignore decision support embedded in the EHR. Research should seek to understand how to develop nudges that can work synergistically with one another. Health systems and scientists can work together to understand which guidelines to prioritise and to develop decision support systems within their electronic interfaces that guide providers to make better clinical decisions.

Little work has been done on the sustainability of nudges beyond the study period, with some notable exceptions.87 Particularly for nudges that require continued intervention on the part of the choice architects (eg, peer comparison interventions), it is necessary to also understand their cost-effectiveness. Finally, understanding how nudges can be implemented across health systems is essential given that many of the studies included in this review were conducted in one health system.

Conclusion

This study adds to the growing literature on the study and effectiveness of nudges in healthcare contexts and can guide health systems in their choices of the types of nudges they should implement to improve clinical practice. The review describes how nudges have been employed in healthcare contexts and the evidence for their effectiveness across clinician behaviours, demonstrating potential for nudges, particularly nudges that change default settings, enable choice, or frame information for clinicians. More research is warranted to examine how nudges scale and their global effect on improving clinical decisions in complex healthcare environments.

Data availability statement

Data sharing not applicable as no data sets generated for this study. Given the nature of systematic reviews, the data set generated and analysed for the current study is already available. All studies analysed for the present review are referenced for readers.

Ethics statements

Ethics approval

Given the nature of systematic reviews, no human participant research was conducted for this original research contribution. Thus, the systematic review was not deemed subject to ethical approval and no human participants were involved in this study.

Acknowledgments

The authors would like to thank Mitesh Patel, Anne Larrivee, Melanie Cedrone, Pamela Navrot, and Amarachi Nasa-Okolie for their assistance in the project.

References

Footnotes

  • Contributors BSL conceived of and designed the research study; acquired and analysed the data; interpreted the data; drafted the manuscript and substantially revised it. AMB helped design the research study; analysed the data; interpreted the data; and substantially revised the manuscript. CET analysed the data; interpreted the data; and substantially revised the manuscript. NM interpreted the data and substantially revised the manuscript. RSB helped conceive of and design the research study; interpreted the data; and substantially revised the manuscript. All authors approved the submitted version; have agreed to be accountable for the contributions; and attest to the accuracy and integrity of the work, even for aspects in which the authors were not personally involved.

  • Funding Funding for this study was provided by grants from the National Institute of Mental Health (P50 MH 113840, Beidas, Buttenheim, Mandell, MPI) and National Cancer Institute (P50 CA 244960, Beidas, Bekelman, Schnoll). BSL also receives funding support from the National Science Foundation Graduate Research Fellowship Program (DGE-1321851).

  • Competing interests BSL, AMB, CET and NM declare no financial or non-financial competing interests. RSB reports royalties from Oxford University Press, has received consulting fees from the Camden Coalition of Healthcare Providers, currently consults for United Behavioral Health and sits on the scientific advisory committee for Optum Behavioral Health.

  • Provenance and peer review Not commissioned; externally peer reviewed.
