Elsevier

Addictive Behaviors

Volume 30, Issue 3, March 2005, Pages 403-413

Measuring alcohol consumption: A comparison of graduated frequency, quantity frequency, and weekly recall diary methods in a general population survey

https://doi.org/10.1016/j.addbeh.2004.04.022

Abstract

Objective: The objective of this study is to compare self-reports of alcohol consumption obtained by graduated frequency (GF), quantity frequency (QF), and prospective weekly diary methods in a Swiss general population survey using a within-subject design. Another objective is to examine the consumption means, ranking order, and drinking classifications and to relate differences in consumption means and classification inconsistencies to social and drinking characteristics. Method: The data came from the first wave of a longitudinal study on changes in alcohol consumption. A subsample of 767 respondents who completed the three measures was examined. The sample design was accounted for. Results: Weekly drinking diary (WDD) yielded over 50% more alcohol consumption than the QF and GF measures did. Measures on quantity and graduated frequencies did not differ significantly. There were more light drinkers found on QF and more harmful drinkers on the WDD. Classification inconsistencies between GF and the weekly drinking diary were significantly related to gender and problematic drinking.

Introduction

Self-reports are widely used to assess consumption in alcohol research because, in contrast to aggregate-level sales data, they permit the analysis of different consumer groups (Lemmens et al., 1988; Midanik, 1982; Rehm, 1998b). Sound research results depend on the reliability and validity of measurement. Self-reports of alcohol consumption have been seen to be generally reliable (for an overview, see Harris et al., 1994; Simpura & Poikolainen, 1983; Sobell & Sobell, 1990). As regards validity, the absence of a “gold standard” makes investigations of the criterion validity of self-reports difficult (Midanik, 1982; Rehm, 1998b). According to the literature on survey research, systematic variations in measurements may be related to (a) the construct validity of the assessment instrument, that is, the degree to which the aspects of the construct are addressed by the measurement, and (b) the assessment mode (Aday, 1996; Groves, 1989). The current study examines alcohol consumption measures stemming from a telephone-administered quantity–frequency instrument (QF), a telephone-administered graduated–frequency instrument (GF), and a self-administered weekly drinking diary (WDD). With respect to construct validity, QF does not capture variability of drinking, while GF and WDD do. In relation to assessment modes, both QF and GF are retrospective measures, while WDD is prospective and may therefore be less prone to recall errors. Thus, the comparison of QF and GF should concentrate mainly on the effects of variability of drinking, whereas the comparison of GF and WDD should account for differences in recall errors. Previous studies on the comparison of measurement instruments were mainly conducted in Northern Europe, North America, Australia, and New Zealand (Feunekes et al., 1999; Rehm, 1998b). When compared with these regions, drinking can be said to be more regular in Switzerland (Allamani et al., 2000; Lemmens, 2001). Thus, because variability of drinking may be a minor explanatory factor for the differences in measurement instruments in Switzerland, potential differences between consumption values on WDD and GF should be indicative of memory errors.

Concurrent validity addresses the degree to which different measures vary in their assessment of the same construct (Aday, 1996; Groves, 1989). It has been examined for a variety of measures, including QF and GF, but rarely for prospective WDD. Generally, higher consumption values were found for GF than for QF (Midanik, 1994; Rehm, 1998b; Room, 1990). So far, WDD has been used in clinical settings (Fiellin, 2000; McManus et al., 1999) and nutritional epidemiology (Franson & Rossner, 2000; Lambe et al., 2000; Mathews et al., 2000). Applications of WDD in the field of alcohol research are sparse (Corti et al., 1990; Hilton, 1989; Lemmens et al., 1992), as is literature on its reliability and validity (Rehm, 1998b). Existing research shows that WDD generally yields higher consumption than do other measures (for an overview, see Leigh, 2000). To our knowledge, the concurrent validity of GF and WDD has not yet been investigated in a general population, and comparisons between WDD and GF were limited to a volunteer group of 83 relatively heavy drinkers (Hilton, 1989) and a volunteer group of 52 mainly female drinkers (Poikolainen, Podkletnova, & Alho, 2002).

Assessment of alcohol consumption usually includes the three dimensions of quantity, frequency, and variability (Alanko, 1984; Room, 1990). Differences in consumption may be related to a more or less accurate coverage of these dimensions by the assessment instrument. A consistent finding is that the more detailed the questions on alcohol consumption, the more consumption shows up (Dawson, 1998; Knibbe & Bloomfield, 2001; McCann et al., 1999). For instance, it has been proposed that unequal assessment of variability explains higher drinking volume on GF than on QF (Hilton, 1989; Midanik, 1994). Also, recalls may be plagued with errors due to the failure of retrieval, such as forgetting, interfering, or confusing drinking events in retrospective measures. Thus, prospective measures are generally assumed to be more accurate than retrospective ones, and prospective drinking diaries have been used to validate retrospective measures (Armstrong et al., 1992; Lemmens et al., 1988; Poikolainen et al., 2002).
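
The proposed role of variability can be illustrated with a minimal sketch. It assumes a simplified scoring in which QF multiplies drinking frequency by the usual quantity, while GF sums the days reported at each quantity level. The function names, the band midpoints, and the example drinker are hypothetical and are not taken from the study's instruments.

```python
def qf_weekly_volume(days_per_week, usual_drinks_per_day):
    """QF-style volume: frequency times usual (modal) quantity."""
    return days_per_week * usual_drinks_per_day


def gf_weekly_volume(days_by_band):
    """GF-style volume: sum over quantity bands of band midpoint x days."""
    return sum(midpoint * days for midpoint, days in days_by_band.items())


# Hypothetical drinker: 4 drinking days per week, usually 2 drinks,
# but one of those days is a heavier occasion of about 7 drinks.
qf = qf_weekly_volume(days_per_week=4, usual_drinks_per_day=2)
gf = gf_weekly_volume({2.0: 3, 7.0: 1})  # {band midpoint: days per week}

print(qf)  # 8
print(gf)  # 13.0
```

In this made-up case the heavier occasion is invisible to the QF-style score but counted by the GF-style score. The sketch shows only the hypothesized mechanism; whether it produces a measurable difference in a given population is an empirical question.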

One drawback in comparative studies is that QF and GF often cover longer reference periods, whereas prospective diaries are usually used for short-term investigations because of the response burden on the individuals and the potential lack of compliance for longer periods (Leigh, 2000). Short reference periods may result in insufficient assessment of exceptional drinking occasions (Dawson, 1998) or overestimation of abstention (Rehm, Walsh, Xie, Greenfield, & Single, 1997). In the present study, this drawback is partly overcome by using a weekly reference period for GF and WDD. QF had a reference period of 6 months. However, QF should be more or less stable, independent of the length of the reference period, because it is expected to measure modal consumption (Duffy & Alanko, 1992; Kühlhorn & Leifman, 1993). Despite the lack of empirical evidence in general populations, WDD can be hypothesized to yield the highest consumption, as it avoids averaging processes and because consumption is rapidly recalled.

At a first stage, the present study investigates the differences in overall self-reported alcohol consumption on QF, GF, and WDD. At a second stage, the stability of the rank order of the individuals across measures is addressed because validity issues, especially in epidemiological studies, should not be restricted to an overall consumption mean (Rehm, 1998b). Rank order correlations commonly lie between 0.4 and 0.7 (Feunekes et al., 1999; Lemmens et al., 1992; Rehm et al., 1997). At a third stage, the present study compares the classification of the respondents according to their drinking status. Comparisons between QF and GF showed that GF results in higher proportions of heavy drinkers and QF in higher proportions of light drinkers (Midanik, 1994; Rehm, 1998b). Similar differences in the proportions were obtained when comparing diaries with QF (Redman, Sanson-Fisher, Wilkinson, Fahey, & Gibberd, 1987). Finally, at a fourth stage, differences in consumption means and classification inconsistencies are regressed on the demographic and drinking characteristics of the respondents.
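
The second and third stages can be sketched as follows, assuming a Spearman rank correlation (the Pearson correlation of average ranks) and a simple cross-tabulation of drinking categories. The data, the helper names, and the cut-offs of 7 and 21 drinks per week are invented for illustration; the excerpt does not state the thresholds or the estimator used in the study, and the sketch ignores survey weights.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def classify(drinks_per_week, light_max=7, moderate_max=21):
    """Assign a drinking category; the cut-offs are hypothetical."""
    if drinks_per_week <= light_max:
        return "light"
    if drinks_per_week <= moderate_max:
        return "moderate"
    return "heavy"


# Made-up weekly volumes (drinks per week) for six respondents.
gf = [2, 5, 8, 12, 20, 30]
wdd = [4, 3, 14, 10, 25, 40]

rho = spearman(gf, wdd)
crosstab = Counter((classify(a), classify(b)) for a, b in zip(gf, wdd))
inconsistent = sum(n for (a, b), n in crosstab.items() if a != b)

print(f"{rho:.3f}")   # 0.886
print(inconsistent)   # 1
```

Here one respondent is classified as moderate on the first measure and heavy on the second, which is the kind of classification inconsistency the fourth stage relates to respondent characteristics.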

Methods

Data were obtained from the first wave (in Spring 1999) of a longitudinal study on changes in alcohol consumption in Switzerland's resident population, aged 15 years or more, using a within-subject design (Heeb, Gmel, Zurbrügg, Kuo, & Rehm, 2003). The study used a two-stage random sample, stratified by linguistic region. The final sample consisted of 4007 individuals surveyed by computer-assisted telephone interviewing. The response rate of 74.8% was similar to those found in Swiss health

Results

Weighted frequencies differed only slightly from the unweighted ones, except for the number of persons in the household (Table 1). Differences in consumption values by demographic and drinking characteristics were consistent across measures. Some respondents indicated no alcohol consumption on GF (13.2%) and WDD (3.4%), as these measures had a 1-week and QF a 6-month reference period.

The consumption mean on WDD (13.91; S.E.=0.55) differed significantly from the means on QF (9.22; S.E.=0.49) and

Discussion

This study examined the differences in self-reported alcohol consumption assessed by QF, GF, and WDD measures in the Swiss general population. Differences were supposed to arise mainly because of two sources: inclusion of variability (QF vs. GF and WDD) and decay of memory (QF and GF vs. WDD). Theoretically, the prospective measures should have a lower decay of memory compared with retrospective measures, and QF should incorporate the least variability Duffy & Alanko, 1992, Kühlhorn & Leifman,

Acknowledgements

The study was supported by the Swiss Alcohol Board and by the National Institute on Alcohol Abuse and Alcoholism (R01 AA13346-01A1).

References (54)

  • Bundesamt für Statistik (BFS)

    Swiss health survey—Initial findings

    (1994)
  • Bundesamt für Statistik (BFS)

    Swiss health survey—Initial findings

    (1998)
  • B. Corti et al.

    Comparison of 7-day retrospective and prospective alcohol consumption diaries in a female population in Perth, Western Australia—Methodological issues

    British Journal of Addiction

    (1990)
  • D.A. Dawson

    Volume of ethanol consumption: Effects of different approaches to measurement

    Journal of Studies on Alcohol

    (1998)
  • J.C.A. Duffy et al.

    Self-reported consumption measures in sample surveys: A simulation study of alcohol consumption

    Journal of Official Statistics

    (1992)
  • H. Fahrenkrug et al.

    Alkohol und gesundheit in der Schweiz

    (1989)
  • G.I.J. Feunekes et al.

    Alcohol intake assessment: The sober facts

    American Journal of Epidemiology

    (1999)
  • D. Fiellin

    Evidence for treatment of alcohol problems in primary care?

    Substance Abuse

    (2000)
  • K. Franson et al.

    Fat intake and food choices during weight reduction with diet, behavioural modification and a lipase inhibitor

    Journal of Internal Medicine

    (2000)
  • G. Gmel

    Antwortverhalten bei Fragen zum Alkoholkonsum—Non-response-bias in schriftlichen Nachbefragungen

    Schweizerische Zeitschrift für Soziologie

    (1996)
  • R.M. Groves

    Survey errors and survey costs

    (1989)
  • T.R. Harris et al.

    Reliability of retrospective self-reports of alcohol consumption among women: Data from a U.S. national sample

    Journal of Studies on Alcohol

    (1994)
  • J.-L. Heeb et al.

    Changes in alcohol consumption following a reduction in the price of spirits: A natural experiment in Switzerland

    Addiction

    (2003)
  • M.E. Hilton

    A comparison of a prospective diary and two summary recall techniques for recording alcohol consumption

    British Journal of Addiction

    (1989)
  • L. Kish

    Survey sampling

    (1965)
  • R.A. Knibbe et al.

    Alcohol consumption estimates in survey in Europe: Comparability and sensitivity for gender differences

    Substance Abuse

    (2001)
  • E.L. Korn et al.

    Analysis of health surveys

    (1999)