Original research
Goal attainment scaling as an outcome measure for randomised controlled trials: a scoping review
Benignus Logan1,2, Dev Jegatheesan3,4, Andrea Viecelli3,5, Elaine Pascoe5, Ruth Hubbard2,6

  1. Medicine Service Line, Redcliffe Hospital, Redcliffe, Queensland, Australia
  2. Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Saint Lucia, Queensland, Australia
  3. Department of Nephrology, Princess Alexandra Hospital, Woolloongabba, Queensland, Australia
  4. Centre for Kidney Disease Research, The University of Queensland, Saint Lucia Campus, Saint Lucia, Queensland, Australia
  5. Australasian Kidney Trials Network, Faculty of Medicine, The University of Queensland, Saint Lucia Campus, Saint Lucia, Queensland, Australia
  6. Department of Geriatric Medicine, Princess Alexandra Hospital, Woolloongabba, Queensland, Australia

Correspondence to Dr Benignus Logan; benignus.logan{at}uq.edu.au

Abstract

Objectives (1) Identify the healthcare settings in which goal attainment scaling (GAS) has been used as an outcome measure in randomised controlled trials. (2) Describe how GAS has been implemented by researchers in those trials.

Design Scoping review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews approach.

Data sources PubMed, CENTRAL, EMBASE and PsycINFO were searched through 28 February 2022.

Eligibility criteria English-language publications reporting on research where adults in healthcare settings were recruited to a randomised controlled trial where GAS was an outcome measure.

Data extraction and synthesis Two independent reviewers completed data extraction. Extracted data were summarised using descriptive statistics.

Results Of 1,838 articles screened, 38 studies were included. These studies were most frequently conducted in rehabilitation (58%) and geriatric medicine (24%) disciplines/populations. Sample sizes ranged from 8 to 468, with a median of 51 participants (IQR: 30–96). A number of studies did not report on implementation aspects such as the personnel involved (26%), the training provided (79%) and the calibration and review mechanisms (87%). Not all trials used the same scale, with 24% either varying from the traditional five-point scale or not reporting the scale used. Outcome attainment was scored in various manners (self-report: 21%; observed: 26%; both self-report and observed: 8%; and not reported: 45%), and the calculation of GAS scores differed between trials (raw score: 21%; T score: 47%; other: 21%; and not reported: 11%).

Conclusions GAS has been used as an outcome measure across a wide range of disciplines and trial settings. However, there are inadequacies and inconsistencies in how it has been applied and implemented. Developing a cross-disciplinary practical guide to support a degree of standardisation in its implementation may be beneficial in increasing the reliability and comparability of trial results.

PROSPERO registration number CRD42021237541.

  • statistics & research methods
  • rehabilitation medicine
  • general medicine (see internal medicine)

Data availability statement

No data are available. Data will not be shared.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • Completing a scoping review has allowed for an exploratory analysis of goal attainment scaling as a research methodology.

  • This work benefits from the collection of a comprehensive range of data items.

  • Included articles in this review were limited to randomised controlled trials only.

  • Data for analysis were limited to information published in either the primary article or associated protocol, which often lacked detail.

Introduction

Person-centred care is gaining attention as a way to help orient healthcare towards what matters most to an individual, and is recognised as a pillar of quality healthcare and research.1 2 A key component of person-centred care is goal setting.1 2 One method for setting goals, and scoring the extent to which they are achieved, is the outcome measurement instrument of Goal Attainment Scaling (GAS).3 It has been used in clinical and research settings across various healthcare disciplines including rehabilitation,4–6 geriatric medicine,7 8 community health9 and drug trials.10 The basic steps of GAS include identifying goals; defining the current (baseline) status; identifying potentially better and worse attainment outcomes on a five-point scale, with consideration of patient and environmental factors such as their current status; weighting the goals; and at follow-up scoring the achieved outcome against the stated possible attainment levels.11 The extent to which goals are achieved is standardised into a T score by the formula3 12:

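$$T = 50 + \frac{10 \sum_i w_i x_i}{\sqrt{0.7 \sum_i w_i^2 + 0.3 \left(\sum_i w_i\right)^2}}$$

(where w_i = the weight assigned to the goal area and x_i = the attained score for the goal area; the constant 0.3 is the assumed correlation between goal scores in the standard Kiresuk and Sherman formulation, and 0.7 = 1 − 0.3).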
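The T score calculation can be sketched in Python. This is a minimal illustration, assuming the commonly cited Kiresuk and Sherman form, T = 50 + 10·Σ(w·x) / √((1 − ρ)·Σw² + ρ·(Σw)²), with an assumed inter-goal correlation ρ of 0.3. The function name `gas_t_score` and its parameters are hypothetical and are not taken from any of the included trials.

```python
from math import sqrt


def gas_t_score(scores, weights=None, rho=0.3):
    """Return a GAS T score for one participant.

    scores  -- attained level for each goal (typically -2..+2)
    weights -- weight for each goal; equal weights of 1 are assumed if omitted
    rho     -- assumed correlation between goal scores (0.3 by convention)
    """
    if not scores:
        raise ValueError("at least one goal is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")

    weighted_sum = sum(w * x for w, x in zip(weights, scores))
    denominator = sqrt(
        (1 - rho) * sum(w * w for w in weights) + rho * sum(weights) ** 2
    )
    return 50 + 10 * weighted_sum / denominator


# All goals attained at the expected level (0) gives a T score of 50.
print(gas_t_score([0, 0, 0]))
# Two equally weighted goals, both attained at +1.
print(round(gas_t_score([1, 1]), 2))
```

Under this form, a participant who attains every goal at the expected level scores 50 regardless of how many goals were set or how they were weighted, which is what makes T scores comparable across participants in a way raw scores are not.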

Using GAS has several advantages for researchers, particularly given its ability to be applied in research settings where other outcome measures may not be suitable due to heterogeneity of participants or outcomes.13–15 GAS captures what matters most to a participant,6 16 and is an outcome measure which can truly be tailored to recognise these individual’s priorities with respect to both the goal’s domain and the scaled outcome attainment levels articulated. It also serves as a tool for monitoring progress throughout a trial.17 However, challenges exist in how GAS is practically implemented. Concerns relate to poorly written goals and scales,6 12 the investment of time required,6 14 17 18 suboptimal facilitator knowledge3 19 and over-reliance on self-reported scores.19 20 While some researchers have found GAS to be a valid,21–23 reliable21 22 and responsive5 7 24 outcome measure, others question its psychometric properties.14 19 25 Often studies using GAS do not specifically report on such aspects,14 and arguably proof of validity or reliability in one setting cannot be extrapolated to another.13 19 26 Suggestions to address validity and reliability concerns include having third parties review goals and the outcomes reached,6 7 12 13 19 27 28 confirming goals are related to the intervention being assessed,19 28 ensuring equidistance between outcome levels19 and having adequate facilitator training.12 19

The extent to which researchers have used GAS as an outcome measure in randomised controlled trials is unknown. It is undocumented how GAS has been practically implemented by researchers in trials, and the extent to which the concerns noted above are borne out in practice. A prior systematic review focused on the measurement properties of GAS as an outcome measure,14 hence we have not explored the psychometric properties in this research.

This scoping review has been undertaken to (1) identify the healthcare settings in which GAS has been used as an outcome measure for randomised controlled trials and (2) identify and analyse the gaps as to how the implementation aspects of GAS have been reported when used as an outcome measure in those trials.

Methods

A scoping review was selected as it allowed for an exploratory analysis of GAS as a research methodology.29 The protocol for this review was registered with PROSPERO (CRD42021237541). Findings are reported according to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews.30

The PubMed, CENTRAL, EMBASE and PsycINFO electronic databases were searched for articles published from their respective inceptions through to 28 February 2022. To allow for an all-inclusive result, a broad ‘all fields’ search for ‘goal attainment scaling’ OR ‘goal attainment scal*’ without any limits was undertaken in consultation with a research librarian. A broader search strategy was not undertaken given the specificity of the term ‘goal attainment scaling’, and the review’s focus on this outcome measurement instrument alone rather than other methods of goal setting.

Publications were eligible for inclusion if they were written in English, were published or ‘in press’ at the search date, included only participants aged 18 and over, were conducted in healthcare settings (including outpatient and community health), and had a randomised or quasi-randomised controlled trial design where GAS was an outcome measure.

Articles were excluded if they did not meet the stated inclusion criteria. Specifically, this included studies where a caregiver rather than the patient set goals, studies where the design was not a randomised controlled trial (including published protocols for as yet incomplete or unpublished randomised controlled trials), studies where GAS was an intervention (not an outcome measure), and studies where a modified GAS method was used (eg, GAS-Hem or GAS-Light).

One author (BL) completed the searches. Two reviewers (BL and DJ) used Covidence31 to independently screen titles and abstracts, and complete full-text reviews of potentially relevant articles. Any conflicts were reviewed and resolved by a third reviewer (AV).

A data-charting form was developed and piloted on three studies by two reviewers (BL and DJ). This form was then finalised and loaded into Covidence for data extraction. Two reviewers (BL and DJ) independently completed the data charting for each article, with a third reviewer (AV) adjudicating any conflicts. Data were also collected from any publicly available published protocols or supplementary material. Investigators of the included studies were not contacted to obtain missing data.

Information was extracted in relation to the setting in which GAS was used as an outcome measure. Specifically: location of study, number of study sites, discipline, trial design, population, sample size, age, intervention, comparator and outcome type (ie, primary or secondary outcome).

Information relating to GAS implementation included personnel involved, training provided, calibration and review processes, administration process, number of goals set, goal domains, scale range used, approach to scoring baseline performance, time to complete GAS, support provided to participants, review interval, approach to scoring, calculation used for GAS score, action taken after review and use of existing GAS guidelines.

The data collected were aggregated through the use of descriptive statistics.
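As a rough illustration of the kind of descriptive statistics reported in this review (proportions of studies, and a median with IQR), a minimal Python sketch follows. The helper names `pct` and `summarise` are hypothetical, the inclusive quartile method is an assumption since the review does not state which method was used, and the list of values is a toy example rather than study data.

```python
from statistics import median, quantiles


def pct(n, total):
    """Percentage of studies, rounded to the nearest whole number."""
    return round(100 * n / total)


def summarise(values):
    """Return (median, lower quartile, upper quartile) for a list of values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Proportion of a given discipline among 38 included studies, eg, 22/38.
print(pct(22, 38))

# Toy sample sizes (hypothetical, for illustration only).
print(summarise([10, 20, 30, 40, 50]))
```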

Patient and public involvement

There was no direct patient or public involvement in this review. The review does address an outcome measure with potential to more meaningfully involve patients in research endeavours.

Results

Search results

The primary search yielded 2,993 articles. After removal of duplicates, 1,838 abstracts underwent screening. A total of 121 articles proceeded to full-text review, with 83 of these excluded as they did not meet the inclusion criteria. Ultimately, 38 studies were included. Figure 1 provides an overview of the selection process resulting from the search run on 28 February 2022.
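The selection counts above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text; the numbers of duplicates removed and of records excluded at title and abstract screening are derived here, on the assumption that no records left the flow for any other reason, and are not figures quoted from the review.

```python
# Counts stated in the text.
identified = 2993
screened = 1838
full_text_reviewed = 121
excluded_at_full_text = 83

# Derived values (assumption: no other removals between stages).
duplicates_removed = identified - screened
excluded_at_screening = screened - full_text_reviewed
included = full_text_reviewed - excluded_at_full_text

print(f"Duplicates removed: {duplicates_removed}")
print(f"Excluded at title/abstract screening: {excluded_at_screening}")
print(f"Included studies: {included}")
```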

Figure 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram for study selection. GAS, goal attainment scaling.

Study and participant characteristics

A summary of included study and participant characteristics is provided in table 1.

Table 1

Overview of study and participant characteristics

Over half of the studies were completed in the rehabilitation discipline (58%, n=22/38, where ‘n’ is the number of studies), with a substantial number also completed in geriatric medicine (24%, n=9/38) and neurology (11%, n=4/38). Most studies were at a single centre (61%, n=23/38), and three studies (8%) included participants from two or more countries.32–34

None had a quasi-randomised design. While the studies included date back to 2000, a large proportion of the studies were published in the last 5 years (42%, n=16/38), or 6–10 years ago (34%, n=13/38).

The majority (84%) of studies were conducted in an outpatient setting, which included community-based or home-based delivery of an intervention or assessment of outcome measure. The remainder (16%) were either conducted entirely in the inpatient setting or in a combination of inpatient and outpatient settings.

Sample sizes varied from 8 to 468 participants, with a median of 51 (IQR: 30–96). Eight studies had a pilot or feasibility intent.35–42 Most frequently, studies included participants who were stroke survivors (34%, n=13/38), had a brain injury (18%, n=7/38) or were community-dwelling older people (16%, n=6/38).

A broad range of interventions were reported including medications (eg, botulinum toxin), procedures (eg, electrical stimulator-guided obturator nerve block), psychotherapy (internet-based cognitive behavioural therapy) and goal management training.

Approaches to implementing GAS

Table 2 provides details of how investigators reported on the various implementation aspects of GAS in their trials.

Table 2

Approach by investigators to the implementation of GAS

GAS was a primary outcome in 14 (37%) studies and a secondary outcome in 24 studies (63%). Of the 14 studies in which GAS was a primary outcome, only 6 provided a statistical rationale for their sample size. The staff responsible for administering GAS differed between studies. In 16 studies (42%), a mix of healthcare professionals were involved including psychologists, research nurses and doctors. Physiotherapists or occupational therapists were responsible in 12 studies (32%), and the personnel were not reported in 10 studies (26%).

The nature of the training provided to the personnel administering GAS was not articulated in 30 (79%) of the studies. Of the eight studies which did report on it, there was a variable amount of detail given. Completion of a simulation or mock goal setting session was mentioned in three studies,20 43 44 and nine (24%) studies described using a GAS guide.3 4 45 While these guides were primarily written for rehabilitation medicine, three of the studies referencing them were not conducted in rehabilitation settings.

In five studies (13%), some form of calibration or review of goals was undertaken, each with a different approach. Estival et al46 were the most comprehensive with an external judge assessing each goal on seven criteria, and a process for evaluating whether scores were valid, reliable and meaningful by staff who knew study participants but were not directly involved in the trial. Another study47 reported that goals were finalised at a team conference, and a blinded geriatrician assessed the reliability of the goal setting. In two studies,48 49 therapists worked with the participant to ensure their goals were realistic. A third-party review of the first three GAS administered by each investigator was completed in one of the studies.33

Scoring of goal attainment was often not clearly reported, or not commented on at all. In 8 studies (21%), the attainment score was based on participant self-report, in 10 (26%) it was based on objective observation (such as a blinded assessor attending a patient’s home48), and in 3 (8%) it was a mix of both self-report and observation. The use of a blinded assessor or third-party reviewer was mentioned in several studies, but it is unclear whether they relied on assessor observation or on participants’ self-reporting.

Application of GAS

Table 3 provides an overview of the decisions investigators took in their application of GAS as an outcome measure.

Table 3

Decisions taken by investigators on how GAS was used

Goals set in GAS are typically scaled to five possible levels, from −2 to +2.3 50 51 A five-point scale was used in 76% (n=29/38) of the studies. Three studies used a six-point scale (−3 to +2),32–34 and one a seven-point scale (−2 to +4).52 Five studies (13%) did not report their approach,39 41 53–55 so it is unclear whether they used the typical five-point scale or not. How baseline performance was scored on the GAS varied between studies. Most studies (66%, n=25/38) did not report it. Where it was reported, −1 was the most frequent score (16%, n=6/38). There was heterogeneity in the calculation and reporting of GAS outcomes. Most commonly, a T score was derived (47%, n=18/38). Eight studies (21%) used raw scores, and eight (21%) used other approaches. Four (11%) did not specify how their calculation was undertaken.34 38 49 56
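The tallies in this paragraph can be checked against the 38 included studies. The sketch below is an illustrative consistency check using only the study counts stated in the text; the dictionary keys and the `pct` helper are hypothetical labels introduced for the example.

```python
TOTAL_STUDIES = 38

# Study counts as stated in the text.
scale_used = {"five-point": 29, "six-point": 3, "seven-point": 1, "not reported": 5}
score_calculation = {"T score": 18, "raw score": 8, "other": 8, "not specified": 4}


def pct(n, total=TOTAL_STUDIES):
    """Percentage of studies, rounded to the nearest whole number."""
    return round(100 * n / total)


for label, counts in (("Scale", scale_used), ("Calculation", score_calculation)):
    # Each breakdown should account for every included study exactly once.
    assert sum(counts.values()) == TOTAL_STUDIES
    for category, n in counts.items():
        print(f"{label}: {category} = {n}/{TOTAL_STUDIES} ({pct(n)}%)")
```

Because each study falls into exactly one category per breakdown, the counts should sum to 38 and the rounded percentages to approximately 100%.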

Other implementation aspects

A summary of the characteristics of goals set by patients is provided in table 4.

Table 4

Characteristics of goals set by participants for goal attainment scaling

Predetermined goal domains were offered in 20 (53%) studies. The type and number of domains varied. In one study,57 participants who had burns were asked to set goals in four domains: mental health, physical health, vocational and social. In another,58 community-dwelling older adults were asked to set a functional mobility goal. Fourteen studies (37%) did not comment on the number of goals that participants were required to set. Only one study20 reported the time allocated to set goals (30 min).

Discussion

This scoping review provides insights into the way GAS has been used and implemented in research settings. Importantly, it shows that GAS as an outcome measure has been used across a range of populations, disciplines, healthcare settings and interventions. The variety of settings in which GAS has been used illustrates its adaptability, and its potential feasibility for use by a range of investigators. However, implementation aspects are inadequately reported and the manner in which the GAS scale was used and scored was sometimes inconsistent. This may threaten its robustness as an outcome measure, and diminish interest from triallists to use it as a means to facilitate measurement of patient-important outcomes.

A large number of studies did not report implementation aspects such as which personnel administered GAS (26%), what training was provided to facilitators (79%) and whether a calibration or review process was undertaken (87%). The absence of these considerations may therefore threaten the validity and reliability of GAS. The way GAS attainment was scored was often not reported (45%) or relied solely on patient self-reporting (21%). While self-reporting without the involvement of a blinded assessor may be pragmatic, it is vulnerable to imprecision given a reliance on a participant’s insight, awareness, recall and denial.19 20 Even scoring by a participant’s clinician is not without issues: agreement between clinicians’ ratings and those of an external independent assessor has been found to be low.27 The practicalities of scoring may be a matter underappreciated by investigators.

This review shows variability in how GAS has been used. First, a five-point scale (ie, −2 to +2) was not always used despite it being recognised as the preferred approach, given that was how Kiresuk and Sherman first designed it,3 50 51 and statistical analyses support a five-point scale.15 Second, the differences in how baseline performance was handled are consistent with prior commentary,13 and are unsurprising given Kiresuk and Sherman did not provide specific guidance when initially describing GAS.50 51 While such heterogeneity may reflect specific participant populations, or an intent of researchers to allow it to be tailored to each participant, it is notable that most studies (66%) did not report their approach and thus their rationale cannot be understood. Finally, there were differences in how GAS scores were analysed. As has been reported previously,12 not all researchers report scores as a T score, with some instead reporting raw scores or a change from baseline. In this review, over half the studies (53%) did not report use of a T score. This is problematic given the T score is central to GAS and how it was first designed by Kiresuk and Sherman.50 Where investigators diverge from the five-point (−2 to +2) scale, or from the T score, it raises concerns they have moved too far away from the validated process to refer to it as GAS. It also impairs the comparison of scores across trials. Addressing this is important to ensuring the fidelity of GAS, and warrants the consideration of researchers who use it as an outcome measure.

The heterogeneity and incomplete reporting of GAS measurement and implementation make the interpretation and comparison of trial results challenging. One potential implication of inconsistent GAS implementation is a risk of bias if delivery is too leading. Further, if scales are poorly constructed they may be open to selective interpretation with assessment erring more favourably. There is a growing recognition that detailed information on how GAS is practically implemented should be provided in publications.12 13 19 In the absence of guidelines for GAS development and scoring, researchers should be detailing their implementation strategy to facilitate reproducibility.59 Our review shows that this is not occurring frequently, which threatens the robustness of GAS as an outcome measure in the trials it has been used in.

Practical guidelines3 4 have been published which may help address some of the implementation issues, particularly to highlight the importance of the five-point scale (−2 to +2) and the standardisation of outcomes into a T score, as well as providing a resource for facilitators to be appropriately educated on GAS. Only 24% of the trials specifically noted that they had referred to guidelines such as those from Turner-Stokes,3 Bovend’Eerdt4 or Krasny-Pacini,19 all of which were written with a focus on rehabilitation medicine.3 4 Guidance that is more interdisciplinary in nature may be beneficial given 42% of the studies in this review occurred in disciplines outside of rehabilitation medicine. This would be timely given the increased frequency with which GAS has been used in recent years. Caution should be exercised, however, in seeking to standardise and operationalise the GAS process too stringently, lest it risk losing its adaptability to be personalised to each unique participant.

The limitations of this review include appraisal of GAS as an outcome measure being constrained by a lack of granularity in the methodology sections and published protocols. Actions may have been taken that were not documented in the published manuscripts. Only those studies with a randomised controlled trial design and adult participants were included in this scoping review. This may have limited insights into the scope of findings and transferability. Further, this scoping review does not consider or explore the possible therapeutic qualities which the act of setting goals in GAS may have independent of the intervention being assessed.20 39

Conclusion

GAS is a valuable tool for researchers to assess participant-important priorities, due to its demonstrated ability to be deployed as an outcome measure in such diverse trial populations and settings. It holds potential for more widespread use to support person-centred care. However, inconsistencies identified in how GAS is applied, and variations in implementation and reporting, do raise the need for greater standardisation to address threats to its validity and reliability. Further work is needed to better establish the credentials of GAS’ psychometric properties. This may extend to the development of an implementation guideline applicable to all disciplines and populations.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

Acknowledgments

Search strategy generated in collaboration with Christine Dalais, Liaison Librarian at The University of Queensland.

References

Footnotes

  • Twitter @benignuslogan, @dev_jeg, @a_viecelli

  • Contributors BL wrote the protocol, which DJ, AV, EP and RH reviewed. BL and DJ completed all screening and data extraction, with AV adjudicating any conflicts. BL prepared the figures and tables. BL wrote the first draft of the main manuscript, which DJ, AV, EP and RH reviewed, edited and endorsed. BL is the guarantor for this work, and accepts full responsibility for it.

  • Funding Article processing charge will be paid by research funding support provided by Redcliffe Hospital (Queensland Health), employer of BL.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.