Short communication
Clinical trials and the response rate illusion

https://doi.org/10.1016/j.cct.2006.10.012

Abstract

Clinical trial outcome data can be presented and analyzed as mean change scores or as response rates. These two methods of presenting the data can lead to divergent conclusions. This article explores the reasons for the apparently divergent outcomes produced by these methods and considers their implications for the analysis and reporting of clinical trial data. It is shown that relatively small differences in improvement scores can produce relatively large differences in expected response rates. This is because differences in response rates do not indicate differences in the number of people who have improved; they indicate differences in the number of people whose degree of improvement has pushed them over a specified criterion. Patients classified as non-responders may therefore have shown substantial and clinically significant improvement, and these are the patients most likely to become responders when given medication. Response rates based on continuous data add no information, and they can create an illusion of clinical effectiveness.

Introduction

There are various ways of comparing the performance of drug and placebo in clinical trials and in systematic reviews of those trials. Outcome data for conditions that are assessed on continuous scales are typically presented as mean change scores (or sometimes as post-treatment scores adjusted for baseline scores). Categorical results can also be calculated from continuous data, in the form of response or improvement rates (i.e., the percentage of patients deemed to have responded or improved) and the statistics derived from them, including odds ratios, relative risks, and the number needed to treat. These categorical outcomes consist of the proportion of people who meet a (usually) predefined level of improvement or fall below a predefined threshold score on a continuous measure. They do not reflect natural categories, but are simply a way of dividing up a continuous distribution.
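
As a minimal sketch of how these derived statistics follow from response rates, the following Python fragment computes the odds ratio, relative risk, and number needed to treat from hypothetical response counts in a two-arm trial; the counts are illustrative and are not data from any trial discussed here.

# Hypothetical counts: 60/100 responders on drug, 40/100 on placebo.
drug_responders, drug_n = 60, 100
placebo_responders, placebo_n = 40, 100

p_drug = drug_responders / drug_n           # drug response rate
p_placebo = placebo_responders / placebo_n  # placebo response rate

# All three statistics are simple transformations of the two rates.
odds_ratio = (p_drug / (1 - p_drug)) / (p_placebo / (1 - p_placebo))
relative_risk = p_drug / p_placebo
nnt = 1 / (p_drug - p_placebo)  # number needed to treat

print(f"OR = {odds_ratio:.2f}, RR = {relative_risk:.2f}, NNT = {nnt:.1f}")
# -> OR = 2.25, RR = 1.50, NNT = 5.0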

Ideally, these different methods of assessing outcome should produce similar conclusions, since they are derived from the same data. In practice, they can be divergent. This is particularly striking in reviews of antidepressant medication, where mean improvement scores indicate very small drug-placebo differences [1], [2], whereas response rate data suggest that these differences are more substantial [1], [3]. The purpose of this article is to explore the reasons for the apparently divergent outcomes produced by comparisons of mean improvement and response rates, and to consider their implications for the analysis and reporting of clinical trial data. Although our concern extends to all clinical trials in which data derived from continuous scales are reported as response rates, we illustrate the issue with data from the antidepressant literature.

Response rates and continuous distributions

Response rates depend on the criterion used to define a response, as well as on the magnitude of drug and placebo effects. When a response rate of 50% has been found, as has been reported in the published antidepressant trial data [3], it means that the criterion chosen for defining a response has coincidentally turned out to be the median of the distribution of improvement scores. This is true regardless of the shape of that distribution. In normal distributions, it is also the mean.
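
To make this concrete, the following Python sketch assumes normally distributed improvement scores with hypothetical means, a common standard deviation, and a hypothetical response criterion; all of the numbers are illustrative. It shows how a modest difference in mean improvement can appear as a much larger difference in response rates.

from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # P(X <= x) for X ~ Normal(mu, sigma)
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical improvement distributions (points on a symptom scale):
# a 2-point drug-placebo difference in mean improvement, equal spreads.
mu_placebo, mu_drug, sigma = 8.0, 10.0, 8.0
criterion = 9.0  # hypothetical cut-off defining a "response"

rate_placebo = 1 - normal_cdf(criterion, mu_placebo, sigma)
rate_drug = 1 - normal_cdf(criterion, mu_drug, sigma)

print(f"placebo response rate: {rate_placebo:.1%}")  # ~45.0%
print(f"drug response rate:    {rate_drug:.1%}")     # ~55.0%

In this sketch, a 2-point (0.25 standard deviation) difference in mean improvement produces a 10-percentage-point difference in response rates, because the criterion sits near the centre of both distributions, where small shifts in the mean move many patients across the cut-off.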

How the response rate illusion is produced

The response rate illusion is not due to response rate statistics themselves, but rather to our interpretation of them. We think of a responder as a person who improves and a non-responder as a person who does not improve, but this is not necessarily accurate when response to treatment has been derived from continuous scores, rather than defined by natural categories (e.g., survival). A patient who is classified as a non-responder on the basis of a criterion of improvement on a continuous scale may have shown substantial and clinically significant improvement, falling just short of the criterion.
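
A toy Python illustration of this point, using entirely hypothetical scores: two patients whose improvement differs by only two points on a continuous scale fall on opposite sides of a 50%-reduction response criterion.

baseline = 30     # hypothetical baseline severity score
criterion = 0.50  # "response" = at least a 50% reduction from baseline

patients = {"A": 16, "B": 14}  # hypothetical points of improvement

for name, improvement in patients.items():
    fraction = improvement / baseline
    label = "responder" if fraction >= criterion else "non-responder"
    print(f"patient {name}: improved {improvement} points ({fraction:.0%}) -> {label}")

# -> patient A: improved 16 points (53%) -> responder
# -> patient B: improved 14 points (47%) -> non-responder

Both patients have improved substantially; the dichotomy nonetheless records them as categorically different.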

Conclusions

Some outcome data (e.g., death and pregnancy) can only be expressed in terms of response rates. Other outcomes do not fall into natural categories and can be assessed meaningfully with continuous scores. Imposing categories on such data is hazardous. It creates the impression of discrete patterns of response where the data do not suggest any, it obscures the arbitrary nature of the criteria used to form the categories, and, as we have shown, it can spuriously inflate the differences between groups.

References (5)

  • NICE; National Institute for Clinical Excellence, 2004; Vol....
  • I. Kirsch et al., Prev Treat (2002)
