Objectives Intention to treat (ITT) is an analytic strategy for reducing potential bias in treatment effects arising from missing data in randomised controlled trials (RCTs). Currently, no universally accepted definition of ITT exists, although many researchers consider it to require either no attrition or a strategy to handle missing data. Using the reports of a large pool of RCTs, we examined discrepancies between the types of analyses that alcohol pharmacotherapy researchers stated they used versus those they actually used. We also examined the linkage between analytic strategy (ie, ITT or not) and how missing data on outcomes were handled (if at all), and whether data analytic and missing data strategies have changed over time.
Design Descriptive statistics were generated for reported and actual data analytic strategy and for missing data strategy. In addition, generalised linear models determined changes over time in the use of ITT analyses and missing data strategies.
Participants 165 RCTs of pharmacotherapy for alcohol use disorders.
Results Of the 165 studies, 74 reported using an ITT strategy. However, fewer than 40% of these studies actually conducted ITT according to the rigorous definition above. Whereas no change in the use of ITT analyses over time was found, censored (last follow-up completed) and imputed missing data strategies have increased over time, while analyses of data only for the sample actually followed have decreased.
Conclusions Discrepancies in reporting versus actually conducting ITT analyses were found in this body of RCTs. Lack of clarity regarding the missing data strategy used was common. Consensus on a definition of ITT is important for an adequate understanding of research findings. Clearer reporting standards for analyses and the handling of missing data in pharmacotherapy trials and other intervention studies are needed.
- Statistics & Research Methods
- Clinical Pharmacology
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
Strengths and limitations of this study
First study to examine intention-to-treat (ITT) practices in randomised controlled trials (RCTs) of pharmacotherapy for alcohol misuse.
Included a large body of studies in the analyses.
Examined changes over time in data analytic and missing data strategies across nearly 40 years of scientific research.
Findings are important for improving reporting practices in RCTs of pharmacotherapy for alcohol misuse.
Descriptive analyses could not determine whether there is any relationship between ITT and effect sizes.
In pharmacotherapy trials, participants typically are randomly assigned to a pharmacotherapy or a placebo (control) condition. With a sufficient sample size, randomisation usually produces groups without systematic differences by balancing across groups the factors that may be associated with outcome (eg, motivation, age and gender). Under ideal circumstances, the randomisation process allows valid causal inferences to be made about the impact of the pharmacotherapy compared to the control condition. That is, one can be highly confident that any post-treatment differences in outcome are attributable to the impact of the medication itself and not to pre-existing differences in the characteristics of the pharmacotherapy and placebo samples. However, when the randomisation process is disrupted through treatment dropout, missing data on outcomes or both, or when the sample analysed is not the same as the sample originally randomised (analysed N < randomised N), bias may be introduced that compromises the internal validity of results.1–4
The intention-to-treat (ITT) analytic strategy is one solution for eliminating or reducing bias in treatment effects arising from missing outcome data in randomised controlled trials (RCTs).1,2 Although no universally accepted definition of ITT currently exists, the procedure nevertheless is endorsed in the Consolidated Standards of Reporting Trials (CONSORT).5–7 One particularly succinct definition of a ‘true ITT’8 analysis is “once randomized, always analyzed.”9 Under this definition, ITT involves the analysis of all trial participants who were randomised, regardless of adherence to treatment protocol (eg, dropout/withdrawal or protocol deviations). In other words, defined this way, ITT requires either no attrition or a strategy to handle missing data.
ITT has several strengths, including (1) helping to preserve the integrity of the randomisation process (ie, groups are expected to be similar except for random variation and receipt of treatment/control condition) and (2) providing a more realistic estimate of average treatment effects in the ‘real world’, as it is the norm for some patients to drop out or not adhere to treatment.1 Both points address the issue of patient dropout, as analyses on only adherent patients are likely to lead to inflated estimates of treatment effects. Research has shown that adherent patients generally do better than non-adherent patients, regardless of treatment.10,11 The more realistic estimates of treatment effects under conditions of routine care that are derived from ITT analyses have particular relevance for policymakers and those interested in hypotheses of pragmatic (‘real world’) importance.
A variant of the ITT approach, which Polit and Gillespie (2010) term a ‘modified ITT’ analysis, maintains the conditions to which people were randomly assigned and attempts to follow up all participants, regardless of their participation in the intervention. However, only those successfully followed are included in the analyses. As a result, the balance in pre-existing characteristics across conditions sought through random assignment is less likely to hold.
An alternative to ITT analysis, the per protocol analytic procedure (ie, analyses based on only ‘adherent’ participants in randomised samples), has strengths as well and is of particular importance for hypotheses of an explanatory nature.12 The per protocol approach can range from analyses in which only those research participants who began treatment are included, to those in which only participants who received what was deemed a ‘sufficient dose’ of treatment are used, to those in which only participants who fully completed treatment are included (also referred to as a ‘complete cases’ approach2). Advocates of the per protocol approach assert that the analysis tests the true efficacy of the intervention when used as directed (ie, efficacy among those who are adherent and able to tolerate the treatment).
Since both the ITT and per protocol approaches to RCT analyses have their strengths, a possible strategy is to conduct an ITT analysis, with a per protocol sensitivity analysis to ‘bracket’ the likely effects under different conditions. Nevertheless, ITT analyses are considered the ‘gold standard’ and researchers frequently report the use of this procedure in the published literature, even in the absence of a consensus definition. Discrepancies can arise, however, between the type of analysis that researchers state in their reports and the analysis they actually conducted, whether a ‘true’ ITT analysis or some other procedure based on less than the full randomised sample. For example, in clinical trials in the nursing field, Polit and Gillespie (2009) found that for 10.5% of studies, researchers who stated they had used an ITT approach had actually conducted per protocol analyses.
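The ‘bracketing’ idea can be illustrated with a minimal sketch on a made-up toy trial. The records, the field names and the choice to count missing outcomes as failures in the ITT analysis are illustrative assumptions, not data or methods taken from any trial discussed here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    arm: str                   # "drug" or "placebo", as randomised
    adherent: bool             # completed treatment as directed
    abstinent: Optional[bool]  # outcome at follow-up; None means missing


# Hypothetical toy trial: five participants randomised to each arm.
trial = [
    Participant("drug", True, True),
    Participant("drug", True, True),
    Participant("drug", True, False),
    Participant("drug", False, None),
    Participant("drug", False, False),
    Participant("placebo", True, True),
    Participant("placebo", True, False),
    Participant("placebo", True, False),
    Participant("placebo", False, None),
    Participant("placebo", False, False),
]


def success_rate(participants: List[Participant], arm: str, per_protocol: bool) -> float:
    group = [p for p in participants if p.arm == arm]
    if per_protocol:
        # Per protocol: keep only adherent participants with an observed outcome.
        group = [p for p in group if p.adherent and p.abstinent is not None]
    # ITT (assumed here): everyone randomised stays in the denominator,
    # and a missing outcome is counted as a failure.
    successes = sum(1 for p in group if p.abstinent is True)
    return successes / len(group)


def risk_difference(participants: List[Participant], per_protocol: bool) -> float:
    return (success_rate(participants, "drug", per_protocol)
            - success_rate(participants, "placebo", per_protocol))


itt_effect = risk_difference(trial, per_protocol=False)
pp_effect = risk_difference(trial, per_protocol=True)
print(f"ITT risk difference:          {itt_effect:.2f}")
print(f"Per protocol risk difference: {pp_effect:.2f}")
```

In this toy example the per protocol estimate is larger than the ITT estimate, which is the direction of inflation described above; with real data the two estimates simply mark out a plausible range.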
It is unknown to what degree ITT strategies are being employed in pharmacotherapy for alcohol use disorders (AUDs). One aim of this review was to determine if there are discrepancies between the types of analyses that researchers stated they used and those they actually used, based on information in reports of a large pool of RCTs of pharmacotherapy for AUDs published between 1970 and 2009. The second aim was to describe the use of different missing data strategies in studies in which true and modified ITT analyses were and were not conducted. The final aim was to determine whether the use of different data analytic approaches and certain types of missing data approaches (eg, multiple imputation) has increased over time while the use of others has decreased.
As part of a larger project examining the efficacy of pharmacotherapies for AUDs and alcohol misuse,13 we identified relevant RCTs via several searches of PubMed and PsycINFO conducted at different points over the past decade. Study inclusion criteria were (1) a focus on treating alcohol misuse or an AUD; (2) participants 18 years of age or older; (3) publication between 1970 and 2009; (4) a report in the English language; and (5) random assignment of at least five participants each to medication and placebo groups. The details of inclusion/exclusion criteria can be found in Maisel et al.13
Searches were intermittent due to sporadic availability of funds and resources. For example, in one search we used search terms for various medications (eg, ‘naltrexone’), terms for alcohol problems and use disorders and alcohol misuse (eg, ‘alcohol*,’ ‘problem drinking’) and terms for RCTs (eg, ‘randomised controlled,’ ‘clinical trial’). This search yielded 1602 potential research reports. Based on the examination of abstracts and, in some cases, full-text versions of these reports, 1184 were identified as not relevant (eg, qualitative studies, reviews). Of the remaining 418 articles, 215 were rejected because they did not meet our eligibility criteria (eg, open-label trial), 138 met the inclusion criteria and 65 were additional publications for studies already in the dataset (eg, reporting secondary analyses). In addition to the database searches, we perused the reference sections from the reports of the included studies and from previously published reviews of this literature. For the present analysis, a total of 165 studies met our inclusion criteria.
Descriptive and inferential statistics were generated for two categorical variables: (1) sample analysed and (2) missing data strategy. The categories of the ‘sample analysed’ variable were:
1. Full random sample: analyses involved the total randomised N's (with or without imputation or interpolation of missing data).
2. Full random sample (likely): analyses appeared to use the full randomised sample, but N's were not reported.
3. Random sample followed up: attempted to follow up all randomised participants regardless of the amount of medication/treatment completed and conducted analyses on this sample. Note there is no overlap between categories 1 (‘full random sample’) or 2 (‘full random sample (likely)’) and ‘random sample followed up’.
4. Sufficient dose: analyses were conducted for only those participants who completed a specified amount of treatment or who received at least a minimum dose of treatment.
5. Completer sample: analyses were conducted for only those patients who completed the medication/treatment phase.
6. False inclusion: after randomisation, participants were found to not meet the inclusion criteria and were subsequently removed from the analyses.
7. Other: reported N's or degrees of freedom were less than what would be expected for the randomised N, but no explanation of the participants included or excluded from the analysis was provided.
8. Unclear: insufficient information was provided to determine the sample analysed.
Only analyses conducted on the full random sample or full random sample (likely) categories were deemed to be ‘true’ ITT analyses, whereas the others were considered to be something other than ITT analyses.
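The coding rule for the ‘sample analysed’ variable can be sketched as follows. The enum, the function names and the discrepancy labels are hypothetical and introduced only for illustration; the category names and the rule that only the first two categories count as ‘true’ ITT come from the text above.

```python
from enum import Enum


class SampleAnalysed(Enum):
    FULL_RANDOM_SAMPLE = "full random sample"
    FULL_RANDOM_SAMPLE_LIKELY = "full random sample (likely)"
    RANDOM_SAMPLE_FOLLOWED_UP = "random sample followed up"
    SUFFICIENT_DOSE = "sufficient dose"
    COMPLETER_SAMPLE = "completer sample"
    FALSE_INCLUSION = "false inclusion"
    OTHER = "other"
    UNCLEAR = "unclear"


# Only these two categories are treated as 'true' ITT analyses.
TRUE_ITT_CATEGORIES = frozenset({
    SampleAnalysed.FULL_RANDOM_SAMPLE,
    SampleAnalysed.FULL_RANDOM_SAMPLE_LIKELY,
})


def is_true_itt(category: SampleAnalysed) -> bool:
    return category in TRUE_ITT_CATEGORIES


def reporting_discrepancy(reported_itt: bool, category: SampleAnalysed) -> str:
    """Compare what a report claimed with how the analysis was coded."""
    actual = is_true_itt(category)
    if reported_itt and not actual:
        return "reported ITT, not conducted"
    if actual and not reported_itt:
        return "conducted ITT, not reported"
    return "consistent"


print(reporting_discrepancy(True, SampleAnalysed.SUFFICIENT_DOSE))
```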
The categories for the ‘missing data strategy’ variable were as follows:
1. No dropout: no dropout from treatment and 100% reassessed.
2. All followed: there were dropouts from treatment, but all participants, including treatment dropouts, were reassessed.
3. Statistical interpolation: used a statistical analysis that interpolated missing data, for example, mixed effects model interpolation.
4. Failure assumed for missing data (missing=failure): assumed that missing data reflected poor outcome, for example, relapse.
5. Baseline assigned: a participant's baseline score was assigned if outcome data were missing.
6. Last observation carried forward (LOCF): used the imputation strategy of LOCF.
7. Censored: last assessment point was used in survival analyses.
8. Mean: used the mean of the sample followed for missing data.
9. Other: used some other strategy for imputing missing data.
10. Sample followed: conducted analyses with data for the sample of participants that the researchers were able to follow/reassess.
11. Unclear: no or unclear information provided.
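Several of the single-value strategies in this list are simple enough to sketch directly. The function names and the toy values below are hypothetical, and each function is a minimal illustration of the named strategy rather than the procedure used in any particular trial; missing values are represented as `None`.

```python
from typing import List, Optional


def locf(series: List[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward within one participant's assessments.

    A missing value with no earlier observation stays missing.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def baseline_assigned(outcome: Optional[float], baseline: float) -> float:
    """Use the participant's baseline score when the outcome is missing."""
    return baseline if outcome is None else outcome


def mean_of_sample_followed(outcomes: List[Optional[float]]) -> List[float]:
    """Replace missing outcomes with the mean of the participants followed."""
    observed = [v for v in outcomes if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in outcomes]


def missing_equals_failure(abstinent: List[Optional[bool]]) -> List[bool]:
    """Treat a missing binary outcome as a poor outcome (eg, relapse)."""
    return [False if v is None else v for v in abstinent]


def sample_followed(outcomes: List[Optional[float]]) -> List[float]:
    """No imputation: analyse only the participants who were reassessed."""
    return [v for v in outcomes if v is not None]


print(locf([12.0, 8.0, None]))
print(mean_of_sample_followed([4.0, None, 8.0]))
print(missing_equals_failure([True, None, False]))
```

Note that only `sample_followed` shrinks the analysed sample; the other functions keep every participant in the analysis, which is what allows a full randomised sample to be analysed despite attrition.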
Descriptive statistics were generated for data analytic strategies and missing data strategies used in the 165 RCTs of pharmacotherapies for AUD and alcohol misuse. Generalised linear model analyses were conducted to determine changes in data analytic and missing data strategies over time. In those analyses, the response variables (data analytic strategy and missing data strategy) were coded as binary (0=‘No’, 1=‘Yes’), with the year of publication as the predictor of a ‘Yes’ response.
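A model of this kind can be sketched as a logistic regression of a binary indicator on publication year. The sketch assumes a binomial generalised linear model with a logit link, which the text does not state explicitly, and fits it by Newton-Raphson using only the standard library. The years and 0/1 codes are invented for illustration and are not the study data; the function names are hypothetical.

```python
import math


def score_and_information(b0, b1, x, y):
    """Log-likelihood gradient (g0, g1) and information matrix entries."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x; return (b0, b1, standard error of b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Wald standard error of the slope from the inverse information matrix.
    _, _, h00, h01, h11 = score_and_information(b0, b1, x, y)
    se_slope = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_slope


def predicted_probability(b0, b1, xi):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


# Hypothetical data: publication year and whether a strategy was used (1) or not (0).
years = [1972, 1978, 1983, 1987, 1990, 1993, 1996, 1999,
         2001, 2003, 2005, 2006, 2007, 2008, 2009]
used = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

x = [year - 1990 for year in years]  # centre the predictor for numerical stability
intercept, slope, se_slope = fit_logistic(x, used)
print(f"slope per year = {slope:.3f}, Wald z = {slope / se_slope:.2f}")
print(f"P(used) in 1975 = {predicted_probability(intercept, slope, 1975 - 1990):.2f}")
print(f"P(used) in 2005 = {predicted_probability(intercept, slope, 2005 - 1990):.2f}")
```

A positive slope corresponds to a strategy becoming more common over time and a negative slope to one becoming less common. In practice a statistics package would normally be used in place of a hand-written fit.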
As noted in table 1, a substantial discrepancy was evident between reporting an ITT strategy versus actually conducting a ‘true’ ITT analysis (ie, reporting an ITT strategy when something other than ITT was conducted). Of the 165 studies included in this review, 74 reported using an ITT strategy. However, less than half of those studies conducted a true ITT analysis (K=29; 39%) according to information in study reports. Interestingly, 35% (K=32) of the 91 studies whose reports made no claim of using an ITT strategy, in fact, did perform true ITT analyses.
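The percentages in this paragraph can be checked from the counts given. In the short sketch below, the variable names are hypothetical, and the overall proportion is derived by adding the two counts of true ITT studies rather than being quoted from the text.

```python
def percent(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


total_studies = 165
reported_itt = 74
not_reported_itt = total_studies - reported_itt  # 91 studies made no ITT claim

true_itt_among_reported = 29
true_itt_among_not_reported = 32

pct_among_reported = percent(true_itt_among_reported, reported_itt)
pct_among_not_reported = percent(true_itt_among_not_reported, not_reported_itt)

# Derived figure: all studies coded as conducting a true ITT analysis.
true_itt_overall = true_itt_among_reported + true_itt_among_not_reported
pct_overall = percent(true_itt_overall, total_studies)

print(pct_among_reported, pct_among_not_reported, pct_overall)
```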
Regarding the specific data analytic strategy used, the values in each row of table 1 do not sum to the total number of studies in the first column (ie, ‘Reported Using ITT’) because 45 studies used both ITT and non-ITT analyses (eg, conducted an ITT analysis assuming failure for dichotomous outcomes and also used a complete cases approach for continuous outcomes). In such instances, we coded ‘Reported Using ITT’ as ‘Yes’ if the study mentioned using an ITT strategy and coded it as ‘No’ otherwise (ie, no mention of using an ITT strategy).
The most common approach utilised in studies reporting the use of an ITT strategy, other than use of a true ITT (K=29; 39%), involved analyses of data for participants who completed a ‘sufficient dose’ of the medication/treatment (K=40; 39%). All other strategies were utilised <10% of the time. The most common analytic method used in studies not mentioning an ITT strategy was actually a true ITT analysis (K=32; 29%), followed by analyses of data from completer samples (K=31; 28%), analyses for participants who completed a ‘sufficient dose’ of medication/treatment (K=19; 17%) and indeterminable strategies (ie, Unclear; K=16; 14%).
Table 2 reports the descriptive information on the missing data strategies employed in the studies using and not using a true ITT approach. As with table 1, the values in each row of table 2 do not sum to the total number of studies in the first column (ie, ‘Conducted ITT’) because 42 studies used multiple missing data strategies. The most common missing data strategy utilised in studies employing an ITT approach was either unclear (K=24; 23%) or involved censoring data at the end of the follow-up (FU) procedure in survival analyses (K=23; 22%). A study could be categorised as employing an ITT strategy but having an unclear missing data strategy if, for example, the study reported the full randomised N's from analyses, but it was unclear what particular missing data strategy was utilised. The next most frequently used strategies were assuming missing equals relapse or some other poor outcome (‘Failure’; K=14; 13%) and using a statistical interpolation strategy (K=14; 13%), such as a mixed effects model. The LOCF procedure was used 11% of the time (K=12), and all other missing data strategies were utilised ≤10% of the time.
The most common missing data method utilised in studies not conducting a true ITT analysis was analysing the sample followed up (K=38; 27%), followed by censoring at the end of the FU procedure (K=25; 18%), assuming failure (‘Failure’; K=23; 17%), LOCF (K=23; 16%) and an unclear strategy (K=17; 12%). All other missing data strategies were used ≤10% of the time. A study could be categorised as not employing an ITT strategy but still using a missing data strategy of assuming failure or LOCF if, for example, the study assumed failure for missing participants, but something less than the full randomised N's was reported for analyses.
Tables 3 and 4 display changes in ITT analyses and missing data strategies over time. No statistically significant change was found in the use of true ITT analyses over time, although there was a marginally significant trend (table 3). This relationship is depicted graphically with time on the x-axis, probability (of being an ITT) from generalised linear model results on the y-axis, and raw study values (0, not ITT; 1, ITT) displayed as points. The 95% CIs are displayed as a grey line around the probability slope.
Several statistically significant relationships between missing data strategy and time emerged, as displayed in table 4. Specifically, censored at the end of FU (for survival analyses), LOCF, and using a statistical analysis to interpolate missing data (interpolation, eg, mixed effects model interpolation) have become more common over time, whereas analyses conducted on only the samples of participants that the researchers were able to follow-up (sample FU) have become less common. To explore whether increasing use of certain missing data strategies over time was confounded with longitudinal methods being increasingly employed, a proxy dummy control variable (0, only end-of-treatment assessment; 1, post-treatment and follow-up assessment(s)) was added to the analyses; the results were virtually unchanged.
Across the 165 pharmacotherapy trials included in this analysis, fewer than half of the 74 studies that reported using an ITT strategy actually did so. This finding probably is due, at least in part, to the lack of a consensus definition of what constitutes an ITT analysis. In fact, the most common procedure in studies reporting, but not actually using, an ITT strategy involved analyses on participants who completed a sufficient dose of treatment. That is, analyses were conducted on data for only those participants who completed a certain amount of treatment or who received a minimum intervention. This type of analysis is generally considered a ‘per protocol’ approach, in contrast to an ITT approach, which includes outcome data for all participants, regardless of adherence to treatment.2
Among the studies conducting a true ITT analysis, the missing data strategy used was unclear in nearly 25%. Lack of clarity in journal articles about how missing data were handled makes it difficult for readers to critically assess the study findings. A per protocol analysis answers questions of an explanatory nature, for example, ‘how efficacious is this treatment for those adherent to the treatment?’ In contrast, an ITT analysis provides more realistic (and usually less biased) estimates of the average treatment effects in the ‘real world,’ as it accounts for patient dropout and non-adherence to treatment. If the findings from a per protocol analysis are incorrectly perceived as coming from an ITT analysis, the treatment effects under more routine conditions of care will be overestimated. Journal editors and peer reviewers should be attentive to these issues and request that authors provide a clear description of the sample analysed (ie, ITT, modified ITT or per protocol) in their studies, along with details regarding how missing data were handled.
Because missing data strategies are becoming more sophisticated and are facilitated by computing technology that can easily process data using complex algorithms, the diversity of missing data strategies employed is increasing. Indeed, our findings indicate that more complex imputation or interpolation procedures are becoming more prevalent over time. One such imputation procedure is multiple imputation,3 in which each missing value is replaced with probable values based on other available variables in the data, and a Bayesian estimation procedure is used to average outcomes across the resulting multiple imputed datasets. Presumably, the results with this approach more closely approximate the results of an ITT analysis with 100% follow-up than any other method of handling missing data that is currently available.
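The overall shape of multiple imputation (impute several times, analyse each completed dataset, then pool) can be sketched as below. The imputation step is deliberately simplified: it resamples the observed values to form a donor pool and draws missing values from it, and it uses no covariates, whereas applied multiple imputation typically conditions on other variables in the data. The outcome values, the seed and the function names are hypothetical. The pooling step follows Rubin's rules, in which the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import random
import statistics
from typing import List, Optional, Tuple


def impute_once(values: List[Optional[float]], rng: random.Random) -> List[float]:
    """One completed dataset from a simplified, covariate-free imputation."""
    observed = [v for v in values if v is not None]
    # Resample the observed values first so that imputations differ between
    # datasets by more than the final draw alone.
    donors = rng.choices(observed, k=len(observed))
    return [rng.choice(donors) if v is None else v for v in values]


def pool_rubin(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Pool per-dataset estimates and their variances using Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)  # sample variance, m - 1 denominator
    total = within + (1 + 1 / m) * between
    return pooled, total


rng = random.Random(2013)  # fixed seed so the sketch is reproducible
outcomes = [12.0, 7.0, None, 3.0, 9.0, None, 5.0, 14.0, None, 6.0]  # hypothetical

m = 20
estimates, variances = [], []
for _ in range(m):
    completed = impute_once(outcomes, rng)
    estimates.append(statistics.fmean(completed))
    # Squared standard error of the mean in this completed dataset.
    variances.append(statistics.variance(completed) / len(completed))

pooled_mean, total_variance = pool_rubin(estimates, variances)
print(f"pooled mean = {pooled_mean:.2f}, pooled SE = {total_variance ** 0.5:.2f}")
```

Because every randomised participant contributes a value to each completed dataset, the analysed N equals the randomised N, which is the property that links this approach to the ITT definition used here.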
Discrepancies in reporting versus actually conducting true ITT analyses were apparent in this body of alcohol pharmacotherapy trials. Lack of clarity regarding the missing data strategy used was also common. The degree to which these problems are present in reports of trials of pharmacotherapies and psychosocial interventions for other conditions remains to be determined. In addition, consensus on a standard definition of ITT is needed, as are clearer reporting standards for analyses and the handling of missing data in reports of clinical trials.
Contributors ACD was involved in the study design, analysis and interpretation of data, drafting the article and revising it. NCM was involved in the study design and revising the article. JCB was involved in the study design. JWF was involved in the study conception and design, interpretation of data and revising it critically for important intellectual content. All authors gave final approval of the version to be published.
Funding The preparation of this manuscript was supported by the US National Institute on Alcohol Abuse and Alcoholism (Grant No. AA008689). The US Department of Veterans Affairs, Office of Research and Development, Health Services Research and Development Service and Substance Use Disorder Quality Enhancement Research Initiative funded this research. The views expressed are those of the authors and do not necessarily represent the views of the National Institute on Alcohol Abuse and Alcoholism, the Department of Veterans Affairs, or any other U.S. Government entity.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.