
Process evaluation of the Data-driven Quality Improvement in Primary Care (DQIP) trial: quantitative examination of variation between practices in recruitment, implementation and effectiveness
  1. Tobias Dreischulte1,2,
  2. Aileen Grant3,
  3. Adrian Hapca2,
  4. Bruce Guthrie2
  1. 1Prescribing Support Unit, NHS Tayside, Dundee, UK
  2. 2Population Health Sciences Division, Medical Research Institute, University of Dundee, Dundee, UK
  3. 3School of Nursing and Midwifery, Robert Gordon University, Aberdeen, UK, Scotland
  1. Correspondence to Dr Tobias Dreischulte; t.dreischulte{at}


Objectives The cluster randomised trial of the Data-driven Quality Improvement in Primary Care (DQIP) intervention showed that education, informatics and financial incentives for general medical practices to review patients with ongoing high-risk prescribing of non-steroidal anti-inflammatory drugs and antiplatelets reduced the primary end point of high-risk prescribing by 37%, where both ongoing and new high-risk prescribing were significantly reduced. This quantitative process evaluation examined practice factors associated with (1) participation in the DQIP trial, (2) review activity (extent and nature of documented reviews) and (3) practice level effectiveness (relative reductions in the primary end point).

Setting/participants Invited practices recruited (n=33) and not recruited (n=32) to the DQIP trial in Scotland, UK.

Outcome measures (1) Characteristics of recruited versus non-recruited practices. Associations of (2) practice characteristics and ‘adoption’ (self-reported implementation work done by practices) with documented review activity and (3) of practice characteristics, DQIP adoption and review activity with effectiveness.

Results (1) Recruited practices had lower performance in the quality and outcomes framework than those declining participation. (2) Not being an approved general practitioner training practice and higher self-reported adoption were significantly associated with higher review activity. (3) Effectiveness ranged from a relative increase in high-risk prescribing of 24.1% to a relative reduction of 77.2%. Higher baseline high-risk prescribing and DQIP adoption (but not documented review activity) were significantly associated with greater effectiveness in the final multivariate model, which explained 64.0% of variation in effectiveness.

Conclusions Intervention implementation and effectiveness of the DQIP intervention varied substantially between practices. Although the DQIP intervention primarily targeted review of ongoing high-risk prescribing, the finding that self-reported DQIP adoption was a stronger predictor of effectiveness than documented review activity supports that reducing initiation and/or re-initiation of high-risk prescribing is key to its effectiveness.

Trial registration number NCT01425502; Post-results.

  • Quality In Health Care
  • Primary Care
  • Adverse Events

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Strengths and limitations of this study

  • A major strength of this study is the collection of detailed quantitative data on individual practices’ responses to the Data-driven Quality Improvement in Primary Care intervention, which enabled examination of substantial heterogeneity in both the implementation of intervention processes and effectiveness, thereby complementing qualitative case studies examining practitioners’ perceptions of practices’ responses.

  • A limitation of the study is that the sample size was small (33 practices) implying relatively limited power for identifying practice characteristics significantly associated with trial participation, and for examining variation in intervention implementation and outcomes.

  • A further limitation of the study is the use of a questionnaire (to measure the adoption of the intervention by general practices), which has not undergone formal psychometric testing.


High-risk prescribing in primary care

High-risk prescribing in primary care is a major concern for healthcare systems internationally. Between 2% and 4% of emergency hospital admissions are caused by preventable adverse drug events.1 2 The National Institute for Health and Care Excellence (NICE) estimated in 2015 that avoidable drug-related admissions in England cost £530 million per year,3 and in the USA the combined cost of drug-related hospital admissions, emergency department and outpatient visits was estimated at $19.6 billion in 2013.4 A large proportion of these admissions are caused by high-risk prescribing of commonly prescribed drugs, with non-steroidal anti-inflammatory drugs (NSAIDs) and antiplatelets being among the main drugs implicated, causing gastrointestinal, cardiovascular and renal adverse events.5–7

Data-driven Quality Improvement in Primary Care intervention and trial

In the UK, virtually all primary care prescribing is done by general practitioners (GP). The Data-driven Quality Improvement in Primary Care (DQIP) intervention was systematically developed and optimised8–10 and comprised three intervention components: (1) professional education about the risks of NSAIDs and antiplatelets via an outreach visit by a pharmacist; (2) financial incentives to review patients at the highest risk of NSAID and antiplatelet adverse drug events, split into a participation fee of £350 and £15 per patient reviewed and (3) access to a web-based IT tool to identify such patients and support structured review. The DQIP intervention was evaluated in a pragmatic cluster randomised controlled stepped wedge trial in 33 practices from one Scottish health board, where all participating practices received the intervention but were randomised to one of 10 different start dates.11 Across all practices, targeted high-risk prescribing fell from 3.7% immediately before to 2.2% at the end of the intervention period (adjusted OR 0.63 (95% CI 0.57 to 0.68), P<0.0001). The intervention only incentivised review of ongoing high-risk prescribing, but led to reductions in both ongoing (1.5% at end vs 2.6% preintervention, adjusted OR 0.60 (95% CI 0.53 to 0.67), P<0.001) and ‘new’ high-risk prescribing (0.7% vs 1.0%, adjusted OR 0.77 (0.68 to 0.87), P<0.001) and notably, reductions in high-risk prescribing were sustained in the year after financial incentives stopped. In addition, there were significant reductions in emergency hospital admissions with gastrointestinal ulcer or bleeding (from 55.7 to 37.0/10 000 person-years, relative risk (RR) 0.66 (95% CI 0.51 to 0.86), P=0.002), and heart failure (from 708 to 513/10 000 person-years, RR 0.73 (95% CI 0.56 to 0.95), P=0.02).12

Process evaluation of the DQIP intervention

Descriptions of complex interventions in the research literature often lack details about the context in which interventions were delivered and about the processes of intervention delivery and implementation of individual intervention components.13–16 Such details are however important to decide whether and how an intervention can be implemented in routine care and to inform future research.14 15 We therefore carried out a comprehensive mixed-methods process evaluation alongside the main DQIP trial, based on a cluster-randomised trial process-evaluation framework17 which we developed drawing on the general literature on process evaluation, published process evaluations and the RE-AIM framework.18 Our framework17 emphasises the importance of considering two levels of intervention delivery and response that often characterise cluster-randomised trials of behaviour change interventions (although their relative importance will depend on intervention design). The first is the intervention that is delivered to clusters, which respond by adopting (or not) the intervention and integrating it with existing work. The second is the desired change in care which the cluster professionals deliver to individual patients. Figure 1 shows the application of the framework to the DQIP intervention. In DQIP, the delivery of the intervention to professionals was predefined, intended to be delivered with high fidelity across all practices and under the control of the research team, whereas the intervention delivered to patients was largely at the discretion of practices, who decided whether and how they reviewed patients and whether to stop or otherwise proactively manage high-risk prescribing in those reviewed. 
This is similar to most health service interventions of this nature, where primary care organisations deliver an intervention to practices (eg, contractual changes, financial incentives, education and so on), but have relatively little control over whether or how practices implement the intended care for patients. We used this framework to structure our parallel process evaluation, mapping data collection to a logic model of how the DQIP intervention was expected to work.17

Figure 1

Framework model for designing process evaluations of cluster-randomised controlled trials applied to the Data-driven Quality Improvement in Primary Care (DQIP) trial.

Focus of this paper

Understanding cluster factors associated with participation, implementation and effectiveness is important to understand how generalisable the findings are and how the intervention achieves (or fails to achieve) intended outcomes. Nevertheless, published evaluations of complex interventions rarely compare trial and target populations and often only report overall intervention impact. Additionally, process evaluations often largely assess participants’ perceptions of how interventions were implemented using qualitative methods, although many processes can usefully be examined quantitatively, providing potentially complementary findings about implementation. In this paper, we report the findings of prespecified quantitative analyses to examine (1) whether and how practices recruited to the DQIP trial differed from non-recruited ones; (2) whether practice characteristics and practice-level response to the intervention delivered to them by the research team (subsequently referred to as adoption) were associated with the way the practice implemented delivering the intended change in care to patients (subsequently referred to as review activity); (3) whether practice characteristics, adoption and review activity were associated with intervention effectiveness at practice level.17


The design and methods of the overall process evaluation have been described previously in the published protocol.17 In brief, the design was a mixed methods parallel process evaluation which examined a set of predefined processes and their associations with change in high-risk prescribing at practice level. The qualitative element consisted of comparative case studies in 10 of the 33 participating practices, purposively sampled to include a mix of those initially responding and not responding to the intervention by rapidly reducing their high-risk prescribing, as judged by visual inspection of run charts approximately 4 months after practices started the intervention. The case-study analysis of how practices adopted, implemented and maintained the intervention,19 and a detailed description of the intervention (including examples of materials used) with analysis of which intervention components were more or less active and when this occurred are described separately.19 20 This paper reports the quantitative element of the process evaluation, which examined associations between practice characteristics, key intervention implementation processes and change in the high-risk NSAID and antiplatelet prescribing at practice level. The study was reviewed by the Fife and Forth Valley Research Ethics Committee (11/AL/0251).

DQIP trial study population

The study population comprised all NHS Tayside practices which participated in the DQIP trial. At practice level, the intervention period lasted 48 weeks from each practice’s randomised start date. At patient level, all patients registered with participating practices were included if they had at least one gastrointestinal, heart failure or renal risk factor that made them particularly vulnerable to the adverse effects of NSAIDs or antiplatelets (as defined by primary and secondary prescribing outcome measures).12 Patients’ membership in the study cohort was dynamic because the presence of risk factors in individuals varies over time due to changing age, morbidity and coprescribing, and ended with the end of the 48-week DQIP intervention period in each practice, deregistration with a practice participating in DQIP, or death (whichever occurred first). Moving of patients between practices participating in the DQIP trial was accounted for by patients exiting and entering the respective practices’ cohorts at the point of registration with a new practice, although this was rare (<0.5% of patients).

Data collection and variable definition

The characteristics of participating and non-participating practices (list size, rurality, socioeconomic deprivation, proportion of older people, postgraduate training status, overall quality and outcomes framework (QOF) clinical performance) were measured using publicly available data.21

Adoption. We conceptualised adoption as the work done by practices to prepare for and begin to deliver reviews to patients, and to ensure effective and sustained review. In order to further elicit whether and how practices responded to the intervention delivered to them by the research team in relation to adoption, the GP most involved in the DQIP work (nominated by the practice) completed two surveys, which were based on the four domains of Normalisation Process Theory,22 namely coherence (whether practices differentiate the intervention from existing prescribing improvement work and understand and value intervention processes and aims), cognitive participation (whether practices engage with and plan to take collective action to implement the intervention), collective action (whether practices integrate the intervention into existing work and avoid disruptions to routine care) and reflexive monitoring (whether and how practices re-evaluate the intervention after attempting initial implementation and whether they modify practice processes if necessary). Given that there are no validated measures of these concepts, and constrained by the resources available, we pragmatically developed two instruments for use in an exploratory analysis (see online supplementary appendix 1 and 2).

In brief, a range of items were created with final item selection based on discussion with a panel of eight GPs and/or normalisation process theory (NPT) experts to confirm content validity,23 and internal consistency was demonstrated for items targeting the same NPT constructs. The first survey was administered shortly after the educational outreach visit at the start of the intervention in each practice and asked respondents about their degree of agreement with statements relating to coherence (six items) and cognitive participation (five items). The second survey was administered 6–9 months after practices started the intervention and asked respondents about their degree of agreement with statements on collective action (three items), and reflexive monitoring (four items) during intervention implementation. Agreement with the statements was measured on a 7-point Likert scale (1=strongly agree, 4=neither agree nor disagree, 7=strongly disagree), with all statements being worded such that lower levels of agreement (higher numerical scores) indicated lower levels of coherence, cognitive participation, collective action and reflexive monitoring, respectively. Both surveys are provided in the appendix material. For each of the four NPT constructs, we transformed the agreement scale to an ‘achievement’ scale, so that the highest level of agreement with statements favouring each construct (agreement score=1) had the highest achievement score (achievement score=6), and the highest level of disagreement (agreement score=7) had the lowest achievement score (achievement score=0). Practices’ achievement was defined as the proportion of achieved over total maximum achievable scores across all questionnaire items that used these scales for each NPT construct.
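The scoring rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `achievement_proportion` and the example responses are hypothetical, and only the scale reversal (agreement 1 maps to achievement 6, agreement 7 maps to achievement 0) and the proportion-of-maximum definition are taken from the text.

```python
def achievement_proportion(agreement_scores):
    """Convert 7-point agreement scores (1=strongly agree, 7=strongly disagree)
    for one NPT construct into the proportion of the maximum achievable score."""
    if not agreement_scores:
        raise ValueError("at least one item response is required")
    for score in agreement_scores:
        if score not in range(1, 8):
            raise ValueError(f"agreement score out of range: {score}")
    # Reverse the scale: agreement 1 -> achievement 6, agreement 7 -> achievement 0.
    achieved = sum(7 - score for score in agreement_scores)
    maximum = 6 * len(agreement_scores)
    return achieved / maximum


# Hypothetical responses from one practice to the six coherence items.
coherence_items = [1, 2, 2, 3, 2, 4]
coherence = achievement_proportion(coherence_items)
print(f"Coherence achievement: {coherence:.1%}")  # 28 of 36 points -> 77.8%
```

With six items the maximum achievable score is 36, so every possible coherence score is a multiple of 1/36; the same logic gives maxima of 30, 18 and 24 for the five-, three- and four-item constructs.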

Review activity was measured in terms of practice level reach, delivery to the patient and maintenance as specified in the published protocol17 (see figure 1). The DQIP IT tool provided data on reach (the proportion of patients flagged as needing a review over the 48-week intervention period, for whom the practice used the DQIP tool to record completion of a review), delivery to the patient (the proportion of eligible patients where there was an active change in treatment or proactive follow-up to determine appropriateness after initial records review) and maintenance (defined as reach in the final 24 weeks of the 48-week intervention period).
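As a rough sketch of how the three review activity measures relate to each other, the snippet below computes them from practice-level counts. The counts, the helper name `proportion` and the choice of denominators are assumptions made for illustration (in particular, delivery to the patient is expressed here over all patients flagged during the 48 weeks); the trial protocol remains the authoritative definition.

```python
def proportion(numerator, denominator):
    """Return numerator/denominator, or None when nobody was flagged."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical counts for one practice (not trial data).
flagged_48_weeks = 40    # patients flagged as needing review over the 48 weeks
reviewed_48_weeks = 30   # of those, reviews recorded as complete in the tool
flagged_final_24 = 16    # patients flagged in the final 24 weeks
reviewed_final_24 = 12   # of those, reviews recorded as complete
actioned = 14            # treatment changed or proactive follow-up arranged

reach = proportion(reviewed_48_weeks, flagged_48_weeks)
maintenance = proportion(reviewed_final_24, flagged_final_24)
delivery_to_patient = proportion(actioned, flagged_48_weeks)

print(f"reach={reach:.1%}, maintenance={maintenance:.1%}, "
      f"delivery={delivery_to_patient:.1%}")
```

Returning `None` for an empty denominator keeps a practice with no flagged patients distinguishable from one that reviewed none of its flagged patients.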

Effectiveness was measured using the trial primary outcome, which was a composite of nine individual high-risk prescribing measures targeting high-risk NSAID and antiplatelet prescribing. In order to reduce chance effects (eg, in the context of measuring every 8 weeks, those arising from variable time intervals between prescriptions of a high-risk drug like NSAIDs and those relating to the small number of patients measured in some practices at single time-points), we defined effectiveness in each practice as the relative change in mean high-risk prescribing (measured using the trial primary outcome) between the baseline 48 weeks before each practice started the intervention and the final 24 weeks of the intervention period.
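The practice-level effectiveness measure can be illustrated as follows. This is a sketch under stated assumptions: the rates are invented, the function name `relative_reduction` is hypothetical, and the sign convention (positive values are reductions, negative values are relative increases) is chosen here for readability. With measurement every 8 weeks, the 48-week baseline spans six measurements and the final 24 weeks span three.

```python
from statistics import mean


def relative_reduction(baseline_rates, final_rates):
    """Relative reduction in mean high-risk prescribing between two periods.

    Positive values are reductions; negative values are relative increases.
    """
    baseline_mean = mean(baseline_rates)
    final_mean = mean(final_rates)
    if baseline_mean == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (baseline_mean - final_mean) / baseline_mean


# Hypothetical 8-weekly rates (% of at-risk patients with high-risk prescribing).
baseline = [3.6, 3.4, 3.5, 3.7, 3.3, 3.5]  # 48-week baseline period
final = [2.2, 2.0, 2.1]                     # final 24 weeks of the intervention

effectiveness = relative_reduction(baseline, final)
print(f"Relative reduction: {effectiveness:.1%}")
```

Averaging over several measurements before taking the ratio is what dampens the chance effects mentioned above; a ratio of two single time-points would be far noisier in small practices.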

Statistical analysis

All analyses were conducted at cluster level. In order to inform judgements about generalisability (objective 1), we compared recruited practices to those who were invited but declined participation. In order to examine variation between practices in the intended review activity (objective 2), we used univariate and multivariate linear regression to examine associations between practice characteristics (list size, whether practices were accredited for postgraduate training, proportion of patients aged 75 years or older, proportion of patients living in the most socioeconomically deprived areas and baseline high-risk prescribing) and the four measures of adoption (coherence, cognitive participation, collective action, reflexive monitoring) with each of the three measures of review activity (reach, maintenance and delivery to the patient). In order to examine variation between practices in effectiveness (objective 3), we used univariate and multivariate linear regression to study associations between practice characteristics, adoption and review activity with effectiveness. Given our hypothesis that DQIP adoption would drive review activity, which in turn would drive effectiveness, strong associations between adoption and review activity variables were expected. In order to explore such mediation effects, we therefore considered two multivariate models. The first only considered review activity variables and the second considered both adoption and review activity variables (in each case in addition to practice characteristics). This allowed us to examine changes in the association between review activity and effectiveness after controlling for DQIP adoption. In all models, continuous explanatory variables were dichotomised as ‘high’ or ‘low’ using the median as a threshold. Multivariate modelling was conducted using stepwise elimination of variables that were not significant in each model at the P=0.10 level.
Collinearity between variables was tested by examining the correlation coefficients between each pair of explanatory variables and variance inflation factors (VIF). Where meaningful collinearity was identified between two explanatory variables, only the variable that yielded the better overall model fit was retained in the final model. To account for the small sample size, model fit was assessed using the adjusted R2.24 In sensitivity analyses, we repeated the above analyses without dichotomising continuous variables.
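Three of the building blocks named in this section (the median split, the variance inflation factor and the adjusted R2) can be sketched with the Python standard library. The helper names and all data below are hypothetical. Two simplifications are assumptions rather than the authors' choices: values tied with the median are labelled 'low', and the VIF is shown only for the special case of exactly two predictors, where it reduces to 1/(1 - r^2).

```python
from math import sqrt
from statistics import mean, median


def dichotomise_at_median(values):
    """Label values above the median 1 ('high') and all others 0 ('low')."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF in a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r * r)


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical list sizes for five practices.
list_sizes = [1367, 4200, 6167, 8000, 11643]
large_practice = dichotomise_at_median(list_sizes)

# Hypothetical dichotomised adoption variables for six practices.
high_coherence = [0, 0, 1, 1, 0, 1]
high_reflexive = [0, 1, 1, 1, 0, 0]
vif = vif_two_predictors(high_coherence, high_reflexive)

# Hypothetical unadjusted R^2 of 0.70 from 29 practices and 3 predictors.
adj = adjusted_r2(0.70, 29, 3)
print(large_practice, round(vif, 3), round(adj, 3))
```

The adjustment penalises each added predictor, which matters most when the number of clusters is small relative to the number of explanatory variables.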


Comparison of recruited vs non-recruited practices

Among recruited practices the median (range) list size was 6167 (1367–11643), the median proportion of registered patients aged 75 years and older was 8.8% (5.9–13.9), and the median proportion of patients living in the most deprived quintile of Scottish postcodes was 2.15% (0.00–43.8). The median level of high-risk prescribing during the baseline period was 3.44% and varied more than sevenfold between practices (from 0.7% to 5.2%). Table 1 shows that participating practices had on average statistically significantly lower QOF performance than practices that declined participation. Other differences were not statistically significant, although there were also reasonably substantial absolute differences in the proportion of registered patients living in settlements of <3000 inhabitants (30.0% in participating vs 15.3% in non-participating), the proportion of registered patients living in the most deprived Scottish postcode areas (2.2% vs 10.9%) and the proportion of practices accredited for GP training (26.5% vs 43.8%).

Table 1

Characteristics of general practices recruited to the DQIP trial compared with eligible practices which declined to participate

Variation in adoption, review activity and effectiveness

Adoption—response rates to questionnaires. All 33 practices in the trial completed the first adoption questionnaire (assessing coherence and cognitive participation before trial implementation) when they started the intervention, and 29 practices (response rate 88%) completed the second questionnaire (assessing collective action and reflexive monitoring during initial implementation) 6–9 months after they started the intervention. All items were completed in returned questionnaires.

Adoption—coherence and cognitive participation. At the start of the intervention in each practice, the median coherence and cognitive participation scores were 77.8% (range 22.2–100.0) and 80.0% (range 26.7–100.0), respectively, reflecting that at the point of DQIP initiation, the majority of responding GPs indicated that they and their colleagues understood the work required (coherence questions A1.3 and A1.4), agreed that DQIP was worthwhile (coherence questions A1.5 and A1.6), expected that DQIP would lead to reductions in targeted high-risk prescribing (cognitive participation question A1.7) and that all prescribers in the practice should and would get involved (cognitive participation questions A1.8 and A1.9). However, there were more mixed responses to questions about whether DQIP was different from existing NHS quality improvement work (coherence questions A1.1 and A1.2) (figure 2).

Figure 2

Findings from two adoption questionnaires completed by the general practitioner (GP) leading on Data-driven Quality Improvement in Primary Care (DQIP) in each practice.

Adoption—collective action and reflexive monitoring. In comparison, at 6–9 months after the start of the intervention in each practice, median levels of agreement with statements relating to collective action (66.7%, range 11.1–100.0) and reflexive monitoring (66.7%, range 16.7–100.0) during initial intervention implementation were somewhat lower. Fewer than half of practices (n=13) reported frequently using the DQIP IT tool to monitor progress in reducing high-risk prescribing (reflexive monitoring question A2.5), and fewer than a third (n=9) stated that DQIP had led to changes in practice processes, structures or systems beyond delivering the actual DQIP work (reflexive monitoring question A2.8). Nevertheless, over half of practices agreed that those involved in DQIP had the necessary skill set (collective action question A2.1, 19 practices), that it was easy to integrate DQIP with existing work (collective action question A2.2, 19 practices) and that patients would respond well to medication changes (collective action question A2.3, 18 practices) (figure 2).

Review activity. The median proportion of patients with high-risk prescribing who had a review documented in the DQIP tool (reach) at any point during the DQIP intervention period was 78.8% (range 0.0%–100.0%), with a similar median reach of 75.0% (0%–100.0%) in the second half of the DQIP intervention period (maintenance). The median proportion of patients flagged as needing a review, who were judged to require proactive follow-up at review (delivery to patient) was 39.7%, ranging from 0.0% to 83.3%.

Effectiveness. The median effectiveness was a relative reduction of 46.6%, ranging from a relative increase in high-risk prescribing of 24.1% to a relative reduction of 77.2%.

Associations of practice characteristics and adoption with review activity

Table 2 shows the coefficients for the associations of the respective explanatory variables with reach, maintenance and delivery to patient. Since all explanatory variables were dichotomised, the coefficients reflect the mean absolute differences in the outcome variable between the two categories of explanatory variables (eg, reach was 20.7 percentage points lower in large compared with small practices). In multivariate analyses, being a registered GP training practice was the only practice characteristic significantly associated with review activity (delivery to patient), in that being a training practice was a predictor of a lower proportion of patients having their high-risk prescribing changed or actively followed up (despite the fact that baseline levels of high-risk prescribing were comparable between training and non-training practices (3.8% vs 3.5%) and not significantly associated with review activity). With respect to adoption variables, coherence and reflexive monitoring were significantly associated with a higher proportion of patients with high-risk prescribing reviewed over the 48-week intervention period (reach), and reflexive monitoring was significantly associated with a higher proportion of patients having their high-risk prescribing changed or actively followed up (delivery to patient). Collective action was no longer significantly associated with reach in multivariate analysis after adjustment for coherence and reflexive monitoring (in the absence of evidence of significant collinearity (all VIFs<2)).
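The interpretation of coefficients for dichotomised variables can be checked numerically. The sketch below uses invented reach values and a hypothetical helper `ols_slope` to show that, in a univariate linear regression on a 0/1 predictor, the slope equals the difference in mean outcome between the two groups. In a multivariate model the coefficient is the corresponding difference adjusted for the other variables, so this exact identity applies to the univariate case only.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    return sxy / sxx


# Hypothetical data: 1 = large practice, 0 = small practice; reach in %.
large = [0, 0, 0, 1, 1, 1]
reach = [90.0, 80.0, 85.0, 60.0, 70.0, 65.0]

slope = ols_slope(large, reach)
mean_large = mean(r for r, g in zip(reach, large) if g == 1)
mean_small = mean(r for r, g in zip(reach, large) if g == 0)
difference = mean_large - mean_small

print(f"slope={slope:.1f}, difference in means={difference:.1f}")
```

Here both quantities are negative, which reads as reach being lower in large than in small practices by that many percentage points.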

Table 2

Associations of practice characteristics and adoption with review activity

Associations of practice characteristics, adoption and review activity with effectiveness

Figure 3A ranks participating practices by effectiveness (the achieved relative reduction in NSAID or antiplatelet high-risk prescribing) and figure 3B shows the review activity in these practices. Practices with larger relative reductions in high-risk prescribing appeared to generally have reviewed higher proportions of patients (reach) and identified higher proportions of patients for proactive follow-up (delivery to patient) than practices with smaller reductions, although there were exceptions. For example, the practices ranking 22nd and 23rd for effectiveness, had relative reductions in high-risk prescribing of 19.3% and 19.0%, respectively, despite minimal review activity recorded in the informatics tool (and therefore no payment of financial incentives).

Figure 3

Variation among practices in effectiveness (A) and review activity (B).

In multivariate model 1 (examining the relationship between review activity and effectiveness while controlling for practice characteristics but not adoption variables), only reach remained significantly associated with effectiveness (although maintenance and delivery to patient were also significantly associated with effectiveness in univariate analysis; there was no evidence of significant collinearity (VIF <2)). In multivariate model 2 (when adoption variables were considered additionally), higher coherence and collective action were significantly associated with effectiveness, but reach and other review activity variables were not. Model 2 explained a substantially larger proportion (adjusted R2=0.64) of variation in effectiveness compared with model 1 (adjusted R2=0.33) (table 3). The results of the sensitivity analyses were consistent with the primary analyses findings (see online supplementary appendix 3).

Table 3

Associations of practice characteristics, adoption and review activity with effectiveness


Summary of findings

In this analysis of quantitative data from the process evaluation of the DQIP trial, lower QOF performance was found to be the only practice characteristic that was statistically significantly associated with willingness to participate in DQIP, consistent with recruited practices not being higher performing in general. Across all practices, there was large variation in both recorded review activity (reach and maintenance each ranging from 0.0% to 100%, delivery to patient ranging from 0.0% to 83.3%), and effectiveness (ranging from a relative increase of 24.1% to a relative reduction of 77.2%, although 30/33 practices had a reduction of some kind in the targeted prescribing). With respect to factors explaining variation in review activity, higher adoption was significantly associated with higher review activity (coherence with reach, and reflexive monitoring with all three of reach, maintenance and delivery to patient). Being a training practice was significantly associated with lower review activity (delivery to patient) despite comparable levels of baseline high-risk prescribing in training (3.8%) and non-training practices (3.5%). This was somewhat unexpected since the general expectation is that training practices will have higher quality, and further research to examine this would be of interest.

With respect to factors explaining variation in effectiveness, higher baseline high-risk prescribing was associated with larger reductions in high-risk prescribing, which is likely to reflect larger scope for improvement. Higher review activity was also significantly associated with effectiveness after adjustment for practice characteristics, with practice characteristics and review activity together explaining 38.5% of variation in effectiveness. However, when adoption variables were additionally introduced into the model, review activity was no longer independently associated with effectiveness. The final model comprising baseline high-risk prescribing, coherence and collective action as explanatory variables explained a substantive amount (64%) of variation in effectiveness between practices.

Strengths and limitations

A major strength of this study is that we were able to collect detailed quantitative data on how practices responded to the DQIP intervention (in terms of adoption and review activity) from all participating practices while the intervention was live. This enabled us to examine substantial heterogeneity in both the implementation of intervention processes and effectiveness, thereby complementing qualitative case studies examining practitioners’ perceptions of practices’ responses. However, our study also has some limitations. First, as with most cluster-randomised trials, the sample size was small (33 practices) because it was determined by the power calculation for the main trial rather than by the needs of the process evaluation. This has a number of implications including risk of biased overestimation of explained variation (which we accounted for by reporting an adjusted R2 value) and risk of overfitting of models (as a result of which we dropped several marginally significant (0.05<P<0.1) variables from final models, and our fitted model has at least 10 practices per variable which is adequate in view of simulation studies showing that a minimum of only two subjects per variable (SPV) may suffice in linear regression for adequate estimation of regression coefficients).24 More problematically, there is relatively limited power for identifying practice characteristics significantly associated with trial participation, and for examining variation in intervention implementation and outcomes. Our decision to dichotomise continuous explanatory variables further reduced power, but as described above we believe it is necessary to enhance interpretability for an improvement audience and our findings demonstrate that variables associated with a 20% or larger absolute difference in review activity and effectiveness could be identified as statistically significant.
In addition, sensitivity analyses that fitted predictor variables as continuous yielded findings consistent with the primary analyses (see online supplementary appendix 3).
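
The two small-sample safeguards mentioned above (reporting an adjusted R² and checking the number of practices per explanatory variable) can be illustrated with a minimal sketch. The 33 practices and the three explanatory variables of the final model are taken from the text; the unadjusted R² of 0.70 is a hypothetical value chosen purely for illustration and is not a result of this study, and the function names are our own.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a linear model with n observations and
    p explanatory variables (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters")
    # Penalise the unadjusted R^2 for the number of fitted variables.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def subjects_per_variable(n: int, p: int) -> float:
    """Observations (here: practices) available per explanatory variable."""
    return n / p


# From the text: 33 practices, final model with 3 explanatory variables.
N_PRACTICES, N_VARIABLES = 33, 3
# Hypothetical unadjusted R^2, for illustration only.
HYPOTHETICAL_R2 = 0.70

print(subjects_per_variable(N_PRACTICES, N_VARIABLES))
print(round(adjusted_r2(HYPOTHETICAL_R2, N_PRACTICES, N_VARIABLES), 3))
```

With few practices, the adjustment factor (n − 1)/(n − p − 1) is noticeably greater than 1, so the adjusted value falls below the unadjusted one; the gap widens as more variables are fitted to the same 33 practices.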

Second, reflecting limited resources and the lack of any existing NPT instruments at the time of the study, we limited examination of practice characteristics to those that could be measured using publicly available data sources and we took a pragmatic approach to the development and administration of the two NPT-informed adoption surveys. Although face and content validity were confirmed by a panel of NPT experts, time constraints meant that we were unable to conduct psychometric testing. We also pragmatically limited scoring of the questionnaires to the DQIP lead in each practice rather than including all professionals targeted by the intervention. Given evidence of substantial variation in the tendency to issue high-risk prescriptions of NSAIDs between GPs in the same practice,25 it is plausible that perceptions of targeted high-risk prescribing and (by extension) the value of the intervention may differ between team members. Although the questionnaires also sought perceptions of the participating GP regarding the practice as a whole, future research should aim to establish how valid such perceptions are. The adoption analysis should therefore be considered exploratory, but it is interesting that scores on the adoption surveys explained a large proportion of variation in effectiveness, indicating the potential of NPT-based surveys for measuring the adoption of complex interventions by clinical teams. In addition, the survey findings are consistent with reported implementation processes elicited by interviews and observations in 10 case-study practices19 (see online supplementary appendix 4). While the instruments used here were study specific, a generic instrument to measure NPT implementation processes has since been developed.26

Third, our measure of effectiveness (change from the baseline period to the second half of the intervention period) is vulnerable to secular trends since the analysis did not account for time trends (although there was no evidence of an overall secular trend in the primary analysis12). Finally, our measures of review activity are all derived from the DQIP informatics tool and some practices may have carried out reviews or other improvement activity without recording it in the tool. Practices only received payment if they completed a review in the tool, which would be expected to make recording more complete, but several practices with large reductions in prescribing recorded few if any reviews. In the main trial analysis, the intervention was also found to significantly reduce new high-risk prescribing.12 This is consistent with the intervention having effects beyond those directly attributable to the review of ongoing prescribing, which was the main target of the intervention and process evaluation measurement. Our measurement of ‘review activity’ is therefore likely to be an incomplete assessment of activity triggered by the intervention. This is consistent with the findings of the case study which identified other work being done in practices beyond that required by the trial team, including work by GPs to minimise the restarting of medicines where there had been an active decision to stop.19

The findings complement the two accompanying process evaluation papers.19 20 Participants perceived that all intervention components delivered to the practice by the research team were active,19 20 which is consistent with the findings of overall high adoption and review activity, and the observation that 30/33 practices had at least some reduction in targeted high-risk prescribing. The quantitative findings are also consistent with the findings of the mixed methods case-study evaluation of intervention adoption and implementation of review activity by practices, in that most of the case-study practices sampled for not having initially reduced their high-risk prescribing were found to actually have delayed implementation and had reduced their high-risk prescribing by the end of the trial.19 The case-study evaluation identifies that there are additional improvement processes happening which are not captured by our measures of review activity, highlighting the value of in-depth qualitative examination of implementation in a sample of practices alongside broader quantitative examination in all practices.19


In order to reduce patient exposure to high-risk prescribing, the DQIP intervention was designed to use education, informatics and financial incentives to activate GPs and practices to review patients with ongoing high-risk prescribing, where documentation of reviews in the DQIP tool was required to obtain payment. We therefore hypothesised that adoption of the intervention (measured in terms of the four NPT constructs coherence, cognitive participation, collective action and reflexive monitoring) would stimulate review activity (measured in terms of reach, maintenance and delivery to patient) documented in the DQIP tool, which in turn would translate into reductions in high-risk prescribing. We found significant associations between adoption and review activity (in both univariate and multivariate analysis) and between review activity and effectiveness (in univariate analysis and in multivariate analysis after adjustment for practice characteristics only), which generally supports the hypothesised mechanism of action. However, the association between review activity and effectiveness was no longer significant after adoption variables had been introduced into the multivariate model, and including adoption variables rather than review activity variables in the model explained a substantially larger proportion of variation in effectiveness. This was somewhat surprising, because it suggests that DQIP adoption (as defined and measured in this study) affected the outcome in ways beyond those primarily targeted by the DQIP intervention (ie, review and proactive management of ongoing high-risk prescribing). Given the relatively small sample size, we did not attempt to conduct a formal mediation analysis27 and these findings should therefore be interpreted with caution. 
Nevertheless, our results are consistent with the main trial finding of significant reductions in the initiation of high-risk prescribing, which was not the primary target of the intervention (in addition to reductions in ongoing prescribing, which was the primary target).12 One limitation of the DQIP intervention was its narrow focus on two classes of medicines (NSAIDs and antiplatelets). The combined findings of the main trial and this process evaluation suggest that it may be possible to extend the scope of targeted high-risk prescribing without risking a rebound in the high-risk prescribing targeted here. Whether this is the case, and whether facilitating and incentivising review of other potentially suboptimal care more generally leads to sustained changes in professional behaviour, are interesting areas for future research.


Despite the large overall reductions in high-risk prescribing observed in the DQIP trial, the findings from this quantitative process evaluation demonstrate substantial variation between practices in terms of their response to the intervention and effectiveness, the latter being partially explained by differences in baseline high-risk prescribing. Our findings suggest that the overall impact of the DQIP intervention was significantly mediated by its impact on new high-risk prescribing or its re-initiation. The adoption processes coherence (whether practices understand and value intervention processes and aims) and collective action (whether practices effectively implement the intervention in the context of existing work) could be targets for further refinement of the DQIP intervention in order to achieve a more consistent intervention effect and further enhance effectiveness.


The authors would like to thank all participating practices and the University of Dundee’s Health Informatics Centre for data management and anonymisation, the Advisory and Trial Steering Groups and Debby O’Farrell who provided administrative support.




  • Contributors BG, AG and TD designed the study. AG led the development and initial testing of the adoption questionnaires. TD conducted all statistical analyses supported by BG and AH. TD wrote the first draft of the manuscript with all authors commenting on subsequent drafts.

  • Funding The study was supported by a grant (ARPG/07/02) from the Scottish Government Chief Scientist Office. The funder had no role in study design, data collection, analysis and interpretation or the decision to publish.

  • Competing interests None declared.

  • Patient consent Detail has been removed from this case description/these case descriptions to ensure anonymity. The editors and reviewers have seen the detailed information available and are satisfied that the information backs up the case the authors are making.

  • Ethics approval The study was reviewed by the Fife and Forth Valley Research Ethics Committee (11/AL/0251).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement There are no data available for sharing since the permissions for data use from the NHS organisations which are the data controllers only allow access in a Safe Haven environment by the research team. Further details of the DQIP intervention components, including the IT tool used, are available from the authors on request.