Self-reported versus clinician-rated symptoms of depression as outcome measures in psychotherapy research on depression: A meta-analysis

https://doi.org/10.1016/j.cpr.2010.06.001

Abstract

It is not well known whether self-report measures and clinician-rated instruments for depression yield comparable outcomes in psychotherapy research. We conducted a meta-analysis of randomized controlled trials examining the effects of psychotherapy for adult depression. Only studies that used both a self-report and a clinician-rated instrument were included. We calculated the effect size (Hedges' g) based on the self-report measures, the effect size based on the clinician-rated instruments, and the difference between these two effect sizes (Δg). A total of 48 studies with 2462 participants were included in the meta-analysis. The differential effect size was Δg = 0.20 (95% CI: 0.10–0.30), indicating that clinician-rated instruments resulted in a significantly higher effect size than self-report instruments from the same studies. When we limited the analysis to studies comparing the Hamilton Rating Scale for Depression (HRSD) with the Beck Depression Inventory (BDI), the differential effect was somewhat smaller but still statistically significant (Δg = 0.15; 95% CI: 0.03–0.27). This meta-analysis shows that clinician-rated and self-report measures of improvement following psychotherapy for depression are not equivalent. Different symptoms may be better suited to self-report or to clinician rating, and in clinical trials it is probably best to include both.

Introduction

Both self-report and clinician-rated symptom scales of depression are widely used as outcome measures in psychotherapy research. While the correlation between self-report and clinician-rated instruments is usually strong and statistically significant (Domken, Scott, & Kelly, 1994), agreement in individual cases is often weaker (Rush et al., 2006). These discrepancies could be due to differences in item content between the clinician and self-report scales; different weightings of symptom items; clinician or patient biases in describing symptomatology; limited insight of the patient (and/or clinician); cognitive deficits; clinical factors such as depression subtype, severity, comorbidity, and personality traits; and patient demographic characteristics such as education, ethnicity, and socioeconomic factors (Corruble et al., 1999; Rush et al., 2006).

Several earlier meta-analyses of treatments for depression have examined whether self-reported and clinician-rated scales result in comparable outcomes (Edwards et al., 1984; Greenberg et al., 1992; Lambert et al., 1986). Most found that effect sizes based on self-report measures were smaller than effect sizes based on clinician-rated scales, and this was true for psychological as well as pharmacological treatments. These meta-analyses were, however, based on small numbers of studies and did not use modern meta-analytic techniques to test whether the effect sizes differed significantly from each other. Nor did they examine possible moderators of differential effect sizes, levels of heterogeneity, or possible publication bias. More recent meta-analyses have also compared effect sizes based on self-report and clinician-rated scales (Pinquart et al., 2006; Pinquart et al., 2007). They confirmed that clinician-rated instruments resulted in larger effect sizes than self-report scales, in both pharmacotherapy and psychotherapy. They did not, however, test whether the differences between the two types of outcome measures were significant, nor did they examine possible moderators. In the current paper, we examine this difference in more depth by comparing effect sizes based on clinician-rated measures with those based on self-report measures within the same studies.

Whether self-report and clinician-rated scales result in comparable outcomes is an important question, particularly in relation to publication and policy-making standards. Clinician-rated scales are usually considered the gold standard in depression outcome research. If, however, self-report scales are just as good as clinician-rated scales, this would greatly benefit both clinical practice and clinical trial research (Rush et al., 2006). Furthermore, if self-reported outcomes show lower levels of symptom improvement than clinician-rated outcomes, as earlier research suggests, it may be better to use self-reported outcomes in meta-analytic research, because they are more conservative and less likely to overestimate treatment effects. On the other hand, self-report instruments may be less sensitive to change.

The Hamilton Rating Scale for Depression (HRSD; Hamilton, 1960) is by far the most widely used clinician-rated instrument in depression treatment studies (Bagby et al., 2004; Williams, 2001). The original HRSD has 21 items, although usually only the first 17 are used to assess the severity of depression. Versions with 24 items (Williams, 2001) and 26 items (Butler & Waelde, 2009) are also available. Although interrater reliability is good in most studies (Bagby et al., 2004), internal consistency has been found to be insufficient in some studies. Despite these limitations, the HRSD remains the gold standard among clinician-rated measures of depression. Other important clinician-rated instruments include the Montgomery–Asberg Depression Scale (MADRS; Montgomery & Asberg, 1979), which was designed to be sensitive to treatment effects and does not rely on physical symptoms as heavily as the HRSD, and, in older studies, the Raskin Scale (Raskin, 1988).

The best known and most widely used self-report measure of depression is the Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) and its revised version, the BDI-II (Beck et al., 1996). The BDI and the BDI-II each consist of 21 items with 4 answer options per item. The reliability and validity of the BDI and BDI-II are generally considered to be good (Burt & Ishak, 2002; Richter et al., 1998), although it has been suggested that the BDI is too exclusively based on a cognitive model (Hagen, 2007). Other widely used self-report measures include the Center for Epidemiological Studies Depression Scale (CES-D; Radloff, 1977), which is often used in epidemiological research; the Zung Self-Rating Depression Scale (Zung, 1965); the Geriatric Depression Scale (GDS; Yesavage et al., 1983), which was specifically designed to measure depressive symptoms in older adults; and the Edinburgh Postnatal Depression Scale (EPDS; Cox, Holden, & Sagovsky, 1987), used to measure symptoms of postpartum depression.

We therefore conducted a new meta-analysis comparing the two types of outcomes in studies that included both self-report and clinician-rated scales. This allowed us to compare the two directly and test whether they result in comparable outcomes. We wanted to make this comparison in a homogeneous set of studies. We chose the acute treatment of depression because it is a more or less homogeneous area of research, because it comprises a considerable number of studies, which allows us to examine subsets of studies and moderators, and because it is clinically relevant.
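The within-study comparison can be illustrated with a minimal sketch. It assumes the usual definition of Hedges' g (a standardized mean difference with a small-sample correction) and takes Δg as the clinician-rated effect size minus the self-report effect size for the same study. The function names and all numbers are hypothetical and do not come from the included trials; the pooling of Δg across studies is not shown.

```python
import math


def hedges_g(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Hedges' g for a symptom scale where lower scores mean fewer symptoms.

    Positive values indicate that the treatment group ended with fewer
    symptoms than the control group.
    """
    df = n_treat + n_ctrl - 2
    # Pooled standard deviation of the two groups.
    pooled_sd = math.sqrt(
        ((n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (mean_ctrl - mean_treat) / pooled_sd
    # Small-sample correction factor (common approximation).
    correction = 1 - 3 / (4 * df - 1)
    return d * correction


def delta_g(g_clinician, g_self_report):
    """Differential effect size: clinician-rated minus self-report."""
    return g_clinician - g_self_report


# Hypothetical post-test data for one study that used both instrument types.
g_clin = hedges_g(mean_treat=10.0, mean_ctrl=15.0, sd_treat=6.0, sd_ctrl=6.0,
                  n_treat=30, n_ctrl=30)
g_self = hedges_g(mean_treat=14.0, mean_ctrl=19.0, sd_treat=8.0, sd_ctrl=8.0,
                  n_treat=30, n_ctrl=30)

print(f"g (clinician-rated) = {g_clin:.2f}")
print(f"g (self-report)     = {g_self:.2f}")
print(f"delta g             = {delta_g(g_clin, g_self):.2f}")
```

A positive Δg in this sketch means the clinician-rated instrument showed the larger treatment effect. Because both effect sizes come from the same participants, a formal test of Δg would also need to account for the correlation between the two measures, which this sketch leaves out.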

Identification and selection of studies

We used a database of 1120 papers on the psychological treatment of depression. This database has been described in detail elsewhere (Cuijpers, van Straten, Warmerdam, & Andersson, 2008) and has been used in a series of earlier meta-analyses (www.evidencebasedpsychotherapies.org). It was developed through a comprehensive literature search (from 1966 to January 2010) in which we examined 10,346 abstracts in PubMed (1831 abstracts), PsycINFO (2943), Embase (3087) and the Cochrane Central Register

Included studies

A total of 48 studies with 2462 participants (1459 in the psychotherapy conditions and 1003 in the control conditions) met all criteria and were included in the meta-analysis. In these 48 studies, 70 psychotherapy conditions were compared to a control group. Selected characteristics of the included studies are presented in Table 1.

Patients were recruited through community announcements in 28 studies, 14 studies used clinical samples, and 6 used other recruitment methods

Discussion

The overall aim of this meta-analytic review was to directly investigate whether clinician-rated and self-reported improvements differ in research on psychotherapy for depression. Results showed a small but statistically significant advantage for ratings done by clinicians (Δg = 0.21) when these were directly compared with self-report symptom measures within the same studies. This indicates that either self-report measures are more conservative or that clinician-rated improvement is more

References (82)

  • R.M. Bagby et al. The Hamilton Depression Rating Scale: Has the gold standard become a lead weight? The American Journal of Psychiatry (2004)
  • A.T. Beck et al. Manual for the Beck Depression Inventory-II (1996)
  • A.T. Beck et al. An inventory for measuring depression. Archives of General Psychiatry (1961)
  • W. *Bowers et al. Use of computer-administered cognitive–behavior therapy with depressed inpatients. Depression (1993)
  • D. *Bowman et al. The efficacy of self-examination therapy and cognitive bibliotherapy in the treatment of mild to moderate depression. Psychotherapy Research (1995)
  • T. Burt et al. Outcome measures in mood disorders
  • L.D. Butler et al. Meditation with yoga, group therapy with hypnosis, and psychoeducation for long-term depressed mood: A randomized pilot trial. Journal of Clinical Psychology (2009)
  • L.D. *Butler et al. Meditation with yoga, group therapy with hypnosis, and psychoeducation for long-term depressed mood: A randomized pilot trial. Journal of Clinical Psychology (2008)
  • K.M. *Carpenter et al. Developing therapies for depression in drug dependence: Results of a stage 1 therapy study. The American Journal of Drug and Alcohol Abuse (2008)
  • L.G. *Castonguay et al. Integrative cognitive therapy for depression: A preliminary investigation. Journal of Psychotherapy Integration (2004)
  • J. Cohen. Statistical power analysis for the behavioral sciences (1988)
  • J. Cox et al. Detection of postnatal depression, development of the Edinburgh Postnatal Depression Scale. The British Journal of Psychiatry (1987)
  • P. Cuijpers et al. Are psychological and pharmacological interventions equally effective in the treatment of adult depressive disorders? A meta-analysis of comparative studies. The Journal of Clinical Psychiatry (2008)
  • P. Cuijpers et al. Psychological treatment of depression: A meta-analytic database of randomized studies. BMC Psychiatry (2008)
  • S.B. *Daughters et al. Effectiveness of a brief behavioral treatment for inner-city illicit drug users with elevated depressive symptoms: The Life Enhancement Treatment for Substance Use (LETS Act!). The Journal of Clinical Psychiatry (2009)
  • R. De Jong et al. Effectiveness of two psychological treatments for inpatients with severe and chronic depressions. Cognitive Therapy and Research (1986)
  • R.J. *DeRubeis et al. Medication versus cognitive behavior therapy for severely depressed outpatients: Mega-analysis of four randomized comparisons. The American Journal of Psychiatry (1999)
  • S. *Dimidjian et al. Randomized trial of behavioral activation, cognitive therapy, and antidepressant medication in the acute treatment of adults with major depression. Journal of Consulting and Clinical Psychology (2006)
  • N.J. *Dunn et al. A randomized trial of self-management and psychoeducational group therapies for comorbid chronic posttraumatic stress disorder and depressive disorder. Journal of Traumatic Stress (2007)
  • S. Duval et al. Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics (2000)
  • B.C. Edwards et al. A meta-analytic comparison of the Beck Depression Inventory and the Hamilton Rating Scale for Depression as measures of treatment outcome. The British Journal of Clinical Psychology (1984)
  • I. *Elkin et al. National Institute of Mental Health Treatment of Depression Collaborative Research Program. Archives of General Psychiatry (1989)
  • M. *Floyd et al. Cognitive therapy for depression: A comparison of individual psychotherapy and bibliotherapy for depressed older adults. Behavior Modification (2004)
  • E. Frank et al. Conceptualization and rationale for consensus definitions of terms in major depressive disorder. Remission, recovery, relapse, and recurrence. Archives of General Psychiatry (1991)
  • K.E. *Freedland et al. Treatment of depression after coronary artery bypass surgery: A randomized controlled trial. Archives of General Psychiatry (2009)
  • E.A. Gaffan et al. Researcher allegiance and meta-analysis: The case of cognitive therapy for depression. Journal of Consulting and Clinical Psychology (1995)
  • R.P. Greenberg et al. A meta-analysis of antidepressant outcome under “blinder” conditions. Journal of Consulting and Clinical Psychology (1992)
  • B. Hagen. Measuring melancholy: A critique of the Beck Depression Inventory and its use in mental health nursing. International Journal of Mental Health Nursing (2007)
  • M. Hamilton. A rating scale for depression. Journal of Neurology Neurosurgery & Psychiatry (1960)
  • R. *Harley et al. Adaptation of dialectical behavior therapy skills training group for treatment-resistant depression. The Journal of Nervous and Mental Disease (2008)
  • M. *Hautzinger et al. Kognitive Verhaltenstherapie bei Depressionen im Alter: Ergebnisse einer Kontrollierten Vergleichsstudie unter Ambulanten Bedingungen an Depressionen Mittleren Schweregrads. Zeitschrift für Gerontologie und Geriatrie (2004)

The references marked with an asterisk are included in the meta-analysis.