Validation of the American Orthopaedic Foot and Ankle Society Ankle-Hindfoot Scale Dutch language version in patients with hindfoot fractures
  1. A Siebe De Boer1,
  2. Duncan E Meuffels2,
  3. Cornelis H Van der Vlies3,
  4. P Ted Den Hoed4,
  5. Wim E Tuinebreijer1,
  6. Michael H J Verhofstad1,
  7. Esther M M Van Lieshout1
  8. AOFAS Study Group
    1. 1Trauma Research Unit, Department of Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
    2. 2Department of Orthopedic Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
    3. 3Department of Surgery, Maasstad Hospital, Rotterdam, The Netherlands
    4. 4Department of Surgery, Ikazia Hospital, Rotterdam, The Netherlands
    1. Correspondence to Professor Esther M M Van Lieshout; e.vanlieshout{at}erasmusmc.nl

    Abstract

    Objectives The American Orthopaedic Foot and Ankle Society (AOFAS) Ankle-Hindfoot Scale is among the most used questionnaires for measuring functional recovery after a hindfoot injury. Recently, this instrument was translated and culturally adapted into a Dutch version. In this study, the measurement properties of the Dutch language version (DLV) were investigated in patients with a unilateral hindfoot fracture.

    Design Multicentre, prospective observational study.

    Setting This multicentre study was conducted in three Dutch hospitals.

    Participants In total, 118 patients with a unilateral hindfoot fracture were included. Three patients were lost to follow-up.

    Primary and secondary outcome measures Patients were asked to complete the AOFAS-DLV, the Foot Function Index and the Short Form-36 on three occasions. Descriptive statistics (including floor and ceiling effects), reliability (ie, internal consistency), construct validity, reproducibility (ie, test–retest reliability, agreement and smallest detectable change (SDC)) and responsiveness were determined.

    Results Internal consistency was inadequate for the AOFAS-DLV total scale (α=0.585), but adequate for the function subscale (α=0.863). The questionnaire had adequate construct validity (82.4% of predefined hypotheses were confirmed), but inadequate longitudinal validity (70.6%). No floor effects were found, but ceiling effects were present in all AOFAS-DLV (sub)scales, most pronounced from 6 to 24 months after trauma onwards. Responsiveness was only adequate for the pain and alignment subscales, with an SDC of 1.7 points.

    Conclusions The AOFAS Ankle-Hindfoot Scale DLV has adequate construct validity and is reliable, making it a suitable instrument for cross-sectional studies investigating functional outcome in patients with a hindfoot fracture. The inadequate longitudinal validity and responsiveness, however, hamper the use of the questionnaire in longitudinal studies and for assessing long-term functional outcome.

    Trial registration number NTR5613; Post-results.

    • hindfoot
    • fracture
    • reliability
    • responsiveness
    • validity

    This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

    Strengths and limitations of this study

    • This prospective, multicentre, observational study provides substantial, previously unavailable information about the performance of the American Orthopaedic Foot and Ankle Society (AOFAS) Ankle-Hindfoot Scale.

    • The topic of the clinical study is relevant for orthopaedic trauma surgeons, since there is growing need for translated and validated patient-reported outcome measures that can be used for determining functional outcome over time.

    • The methodological design of the study is strong, and statistical analyses complied with the COnsensus-based Standards for the selection of health Measurement INstruments guidelines.

    • Although the study is mostly relevant for the Dutch-speaking regions, it is also informative for other regions.

    • Implementation of the AOFAS Ankle-Hindfoot Scale is limited by the fact that a clinician is required to complete the physician-reported part of the questionnaire. This hampers its use in, for example, large-scale registers.

    Background

    Hindfoot fractures are rare but disabling injuries. Because most patients are of working age and the resulting disability is long-lasting, these injuries have a high socioeconomic impact.1 2 The incidence rate of calcaneal fractures is 11.5 per 100 000 person-years, and these fractures occur 2.4 times more frequently in men than in women.3 Fractures of the talus are even rarer, with a reported annual incidence of 3.2 per 100 000, and occur 4.5 times more often in men.4 Although these fractures are relatively rare, they have received considerable attention in recent literature, presumably because of the long recovery and the associated socioeconomic burden.

    To monitor functional outcome, quality of life and recovery after treatment, patient-reported outcome measures and other instruments are increasingly used in clinical practice and clinical research. The American Orthopaedic Foot and Ankle Society (AOFAS) Ankle-Hindfoot Scale is one of the most used assessment tools in foot surgery.5 This clinical rating system combines a patient-reported part and a physician-reported part. In its original language version, the AOFAS Ankle-Hindfoot Scale as a complete scale has been shown to be responsive and valid.6–9 The study populations involved non-traumatic diagnoses, such as general ankle-hindfoot complaints,8 pending ankle or foot surgery10 and end-stage ankle osteoarthritis.7

    Recently, a Dutch version of the AOFAS Ankle-Hindfoot Scale became available.11 It was translated and culturally adapted to the Dutch population according to the guideline for Cross-Cultural Adaptation of Self-Report Measures.12 13 The AOFAS Ankle-Hindfoot Scale was shown to be valid, reliable and responsive in patients with an ankle fracture.11 This study aimed to determine the measurement properties of the AOFAS-Dutch language version (DLV) in patients who sustained a hindfoot fracture.

    Methods

    Study design and ethics statement

    This multicentre, prospective, observational study was performed at three hospitals. The study is registered at the Netherlands Trial Register (NTR5613). A detailed study protocol is published elsewhere.13 The Medical Research Ethics Committees or Local Ethics Boards of all participating centres approved the study.

    Patient recruitment

    Patients were recruited from 1 May 2014 to 1 November 2016. Patients were identified from hospital records, based on their International Classification of Diseases, 10th revision (ICD-10) code or Diagnosis-Related Group code. Inclusion criteria were: (1) unilateral hindfoot fracture; (2) age 18 years or older and (3) provision of informed consent. Exclusion criteria were: (1) multiple trauma affecting the outcome scores; (2) pathological fracture; (3) severe physical comorbidity (ie, American Society of Anesthesiologists (ASA) class ≥3); (4) non-ambulatory status prior to the injury; (5) insufficient comprehension of the Dutch language and (6) expected problems with maintaining follow-up.

    A total of 118 individual patients were included; 78 completed t=1 and t=2, and 113 completed t=2 and t=3 (figure 1). Three patients were lost to follow-up during the course of the study.

    Figure 1

    Flowchart. The number of patients in each particular group is shown between square brackets. aPatients who participated in both groups. ASA, American Society of Anesthesiologists.

    The median age was 51 years (P25–P75 36–58) and the majority of patients (n=69; 61.1%) were men (table 1). The most common injuries were calcaneal fractures (n=82; 72.6%) and talar fractures (n=36; 31.9%). Fractures were mostly treated non-operatively (n=72; 73.6%).

    Table 1

    Demographic data for the study population

    Questionnaires and data collection

    Demographic, injury and treatment data were collected from the patient’s medical files. To complete the physician-reported part of the AOFAS Ankle-Hindfoot Scale-DLV, a research physician or research assistant performed the physical examination using a standardised protocol. Patients were asked to complete the AOFAS-DLV patient-reported part, Foot Function Index (FFI-DLV) and the Short Form Health Survey (SF-36-DLV) questionnaires on three occasions: between 3 and 6 months after trauma (t=1), 5–6 months later (t=2) and 2–3 weeks later (t=3). Patients were allowed to participate in both the responsiveness and test–retest part. A physician completed the physician-reported part of the AOFAS-DLV.

    The AOFAS Ankle-Hindfoot Scale consists of three subscales (pain, function and alignment) and includes a total of nine items. The minimum score is 0 points (indicating severe pain and impairment) and the maximum score is 100 points (no impairment).

    The FFI is a questionnaire that focuses on disabilities and measures the impact of foot disorders. The FFI includes three subscales (pain, disability and activity limitations), which are spread over a total of 23 items. In this scoring system, a score of 0 points means ‘no disability’ and a score of 100 points implies the highest level of disability.14

    The SF-36 Health Survey is a generic measure of health status.15–22 It consists of 36 items, representing eight domains that are grouped into a physical component summary (PCS) and a mental component summary (MCS). All (sub)scales are normalised to a mean of 50 points with a SD of 10 points.

    Statistical analysis

    Statistical Package for Social Sciences (SPSS V.21) was used for analysis. Data are reported following the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines.23 Missing data were not imputed. Patient characteristics and questionnaire scores were analysed using descriptive statistics. Measurement properties of the AOFAS Ankle-Hindfoot Scale were determined in compliance with the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines.24 The already validated FFI and SF-36 (sub)scales served as comparators for the AOFAS-DLV. A summary of the measurement properties and statistical analysis is given in table 2. A more detailed description is published in the study protocol.13

    Table 2

    Overview of measurement properties and definitions used

    Supplementary file 1

    Results

    The changes over time in AOFAS-total, FFI-total, SF-36 PCS and SF-36 MCS are shown in figure 2. In the period from t=1 to t=2, the AOFAS, SF-36 PCS and (less pronounced) SF-36 MCS scores increased. The FFI score decreased as expected, since this questionnaire measures disability and lower scores indicate less disability. Scores at t=2 and t=3 were similar for all instruments.

    Figure 2

    AOFAS Ankle-Hindfoot (A), Foot Function Index (B), SF-36 PCS (C) and SF-36 MCS (D) scores at each follow-up visit in patients with a hindfoot fracture. AOFAS, American Orthopaedic Foot and Ankle Society; FFI, Foot Function Index; MCS, mental component summary; PCS, physical component summary; SF-36, Short Form-36.

    Floor and ceiling effects

    A floor effect was present only in the SF-36 RP and RE subscales, at all follow-up moments. The percentage of patients reporting the minimum score varied between 52.6% (t=1) and 32.4% (t=3) for the SF-36 RP subscale and between 25.6% (t=1) and 19.0% (t=3) for the SF-36 RE subscale (figure 3A).

    Figure 3

    Floor effects (A) and ceiling effects (B) of the instruments used in patients with a hindfoot fracture. Out of a maximum of 78 at t=1, n=77 for AOFAS function and total, n=78 for AOFAS pain and alignment and for all (sub)scales of FFI and SF-36. Out of a maximum of 113 at t=2, n=109 for SF-36 GH, PCS and MCS, and n=110 for all (sub)scales of the AOFAS and FFI, and for all other subscales of the SF-36. Out of a maximum of 113 at t=3, n=105 for all (sub)scales of the AOFAS, FFI and SF-36. The dotted line represents the acceptable 15% of patients with the maximum score. Since for the SF-36 PCS and MCS none of the patients reported the worst or best possible score, they are not shown. AOFAS, American Orthopaedic Foot and Ankle Society; BP, bodily pain; FFI, Foot Function Index; GH, general health perceptions; MCS, mental component summary; MH, general mental health; PCS, physical component summary; PF, physical functioning; RE, role limitations due to emotional problems; RP, role limitations due to physical health; SF, social functioning; SF-36, Short Form-36; VT, vitality, energy or fatigue.

    Ceiling effects were seen in several (sub)scales, especially at longer follow-up (figure 3B). The AOFAS as a total scale only showed a ceiling effect at t=3; 16.2% of patients reported the maximum score. The AOFAS pain and alignment subscales had a ceiling effect from t=1 onwards (12.8% and 62.8%, respectively). The AOFAS function subscale showed ceiling effects from t=2 onwards (22.7%). The FFI pain and disability subscales showed ceiling effects from t=2 onwards. The FFI limitation, SF-36 RP, SF and RE subscales showed ceiling effects at all follow-up moments.
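
    As an illustration of how floor and ceiling percentages of this kind can be derived, the following minimal Python sketch counts the share of respondents at the lowest and highest possible score and compares it with a 15% cut-off. The function name, the strict "more than 15%" rule and the example scores are assumptions for illustration only; they are not taken from the study data or its analysis scripts.

```python
def floor_ceiling(scores, min_score, max_score, threshold=15.0):
    """Return the percentage of respondents at the minimum and maximum
    possible score and flag each as an effect if it exceeds `threshold`."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    pct_floor = 100.0 * sum(1 for s in scores if s == min_score) / n
    pct_ceiling = 100.0 * sum(1 for s in scores if s == max_score) / n
    return {
        "pct_floor": pct_floor,
        "pct_ceiling": pct_ceiling,
        # Assumption: an effect is present when MORE than 15% hit the bound.
        "floor_effect": pct_floor > threshold,
        "ceiling_effect": pct_ceiling > threshold,
    }


# Hypothetical total scores on a 0-100 scale for 10 patients (not study data).
example_scores = [100, 100, 87, 92, 75, 100, 68, 90, 81, 95]
result = floor_ceiling(example_scores, min_score=0, max_score=100)
print(result)
```

    In this made-up sample, three of ten patients reach the maximum score, so the sketch would flag a ceiling effect but no floor effect.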

    Reliability

    Internal consistency

    For the AOFAS total scale, the Cronbach’s α was 0.585 (table 3). This may suggest inadequate internal consistency, but because the total scale comprises three subscales, this value should be interpreted with caution. The Cronbach’s α for the AOFAS function subscale was 0.863, representing adequate internal consistency. Because the AOFAS pain and alignment subscales are single-item domains, Cronbach’s α could not be determined for them.

    Table 3

    Internal consistency of the instruments used in patients with a hindfoot fracture

    The FFI scale only showed adequate internal consistency for the subscale activity limitation (α=0.841). The internal consistency was not adequate for the FFI scale as a total (α=0.599) and for the subscales pain (α=0.653) and disability (α=0.558). For the total scale, this may be due to the fact that it is not unidimensional. Except for the subscale GH (α=0.627), all SF-36 (sub)scales showed adequate internal consistency.
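
    For readers less familiar with Cronbach’s α, the sketch below computes it from item-level scores with the standard formula α = k/(k−1) × (1 − Σ item variances / variance of the summed score). It is a minimal illustration in Python; the function name and the example item scores are hypothetical and unrelated to the study data, which were analysed in SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list per item; each inner list contains the scores
    of all respondents on that item, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    # Summed score per respondent across all items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of five respondents on three items (not study data).
example_items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

    A single-item subscale has no between-item covariance to summarise, which is why the function refuses fewer than two items.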

    Construct validity

    Spearman’s rank correlations regarding construct validity are shown in table 4. Construct validity was adequate only for the AOFAS total scale and the function subscale; for both, 82.4% of the predefined hypotheses were confirmed. For the pain subscale, only 8 out of 17 correlations (47.1%) were in accordance with the predefined hypotheses. For the alignment subscale, this was 12 out of 17 (70.6%). Both percentages were below the 75% threshold.
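
    The hypothesis-testing approach above rests on Spearman’s rank correlation between two (sub)scales, checked against an expected range. The Python sketch below shows one way this could be done with average ranks for ties. The helper names, the expected range of −1.0 to −0.6 and the example scores are assumptions for illustration; the study’s actual hypotheses are defined in its table 2 and protocol.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data.
    Raises ZeroDivisionError if either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def hypothesis_confirmed(rho, lower, upper):
    """True if the observed correlation lies in the predefined range."""
    return lower <= rho <= upper


# Hypothetical scores (not study data): higher AOFAS = better function,
# higher FFI = more disability, so a strong negative correlation is expected.
aofas_total = [60, 75, 82, 90, 100, 55]
ffi_total = [40, 30, 22, 10, 0, 35]
rho = spearman_rho(aofas_total, ffi_total)
print(round(rho, 3), hypothesis_confirmed(rho, -1.0, -0.6))
```

    Repeating the check for every predefined comparison and dividing the number of confirmed hypotheses by the total gives percentages of the kind reported above.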

    Table 4

    Construct validity of the instruments in patients with a hindfoot fracture

    Reproducibility

    Test–retest reliability

    The intraclass correlation coefficient (ICC), indicating the reliability, of each (sub)scale is shown in table 5. The ICC for all AOFAS (sub)scales ranged from 0.89 to 0.97, indicating adequate test–retest reliability. For all FFI and SF-36 (sub)scales, the ICC was also adequate (>0.70).
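
    The text does not state which ICC form was used (the details are in the study protocol). Purely as an illustration, the Python sketch below assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1), computed from the mean squares of a two-way ANOVA. The function name and the example scores are hypothetical.

```python
def icc_2_1(data):
    """ICC(2,1) for `data`, a list of rows (one per patient), each row
    holding the k repeated scores of that patient (e.g. test and retest)."""
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >=2 patients, each with the same >=2 scores")
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Sums of squares for patients (rows), occasions (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test (t=2) and retest (t=3) scores of six patients.
test_retest = [[80, 82], [65, 63], [90, 91], [72, 75], [100, 100], [58, 60]]
icc = icc_2_1(test_retest)
print(round(icc, 3), icc > 0.70)
```

    Other ICC forms (for example consistency rather than absolute agreement) would give different values on the same data, so the choice of model should be taken from the protocol rather than from this sketch.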

    Table 5

    Intraclass correlation coefficient (ICC) and Bland-Altman analysis of the instruments in patients with a hindfoot fracture

    Agreement and smallest detectable change

    The level of agreement is indicated by the smallest detectable change (SDC) and the corresponding Reliable Change Index (RCI) (table 5). The SDC was 1.7 (RCI: 1.7%) for the AOFAS total scale, −2.9 (RCI: −2.9%) for the FFI total scale, 0.16 (RCI: 0.2%) for the SF-36 PCS subscale and −0.29 (RCI: −0.4%) for the SF-36 MCS subscale.

    The Bland and Altman analysis shows that, for each (sub)scale, the 95% limits of agreement for the mean change in scores contain zero; this confirms that there is no systematic bias in the measurements (figure 4 and table 5).
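
    To make the agreement statistics more concrete, the Python sketch below computes the mean difference and 95% limits of agreement of a Bland-Altman analysis, together with a smallest detectable change based on one widely used convention (SEM = SD of the differences / √2, SDC = 1.96 × √2 × SEM). This convention is an assumption here; the exact definitions used for the values reported above are given in the study’s table 2 and protocol and may differ. Names and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(test, retest):
    """Mean difference, 95% limits of agreement and SDC for paired scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equally long lists with >=2 scores")
    diffs = [b - a for a, b in zip(test, retest)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    sem = sd_diff / sqrt(2)      # assumed convention: SEM from SD of differences
    sdc = 1.96 * sqrt(2) * sem   # numerically equal to 1.96 * sd_diff
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "limits_of_agreement": (lower, upper),
        "sdc": sdc,
    }


# Hypothetical scores at t=2 and t=3 for five patients (not study data).
scores_t2 = [80, 85, 90, 70, 95]
scores_t3 = [82, 84, 91, 73, 93]
agreement = bland_altman(scores_t2, scores_t3)
low, high = agreement["limits_of_agreement"]
print(agreement, low < 0 < high)
```

    Under this convention the SDC equals the half-width of the limits of agreement, so a change smaller than the SDC cannot be distinguished from measurement error in an individual patient.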

    Figure 4

    Bland-Altman plots for AOFAS Ankle-Hindfoot (A), Foot Function Index (B), SF-36 PCS (C) and SF-36 MCS (D) scores in patients with a hindfoot fracture. Change scores were calculated from t=2 to t=3. Each dot represents a single patient. The black line indicates the mean difference. The upper and lower edges of the grey box are the 95% limits of agreement. AOFAS, American Orthopaedic Foot and Ankle Society; FFI, Foot Function Index; MCS, mental component summary; PCS, physical component summary; SF-36, Short Form-36.

    Responsiveness

    Spearman’s rank correlation coefficients for longitudinal validity are shown in table 6. Longitudinal validity was adequate for the AOFAS pain and alignment subscales; out of 17 correlations, 15 (88.2%) were in line with the predefined hypotheses for the pain subscale and 17 (100.0%) for the alignment subscale. Longitudinal validity was not sufficient for the function subscale (10/17; 58.8%) or for the total scale (12/17; 70.6%).
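
    The percentages above follow directly from the counts of confirmed hypotheses. The short Python sketch below reproduces that arithmetic and applies the 75% criterion; the variable names are illustrative, and the use of a strict "greater than 75%" comparison is an assumption.

```python
N_HYPOTHESES = 17
THRESHOLD = 75.0  # percent; assumed to be a strict ">" criterion

# Number of confirmed hypotheses per AOFAS (sub)scale, as reported in the text.
confirmed = {"pain": 15, "alignment": 17, "function": 10, "total": 12}


def percent_confirmed(n_confirmed, n_total):
    """Share of predefined hypotheses that were confirmed, in percent."""
    return 100.0 * n_confirmed / n_total


percentages = {
    name: round(percent_confirmed(count, N_HYPOTHESES), 1)
    for name, count in confirmed.items()
}
adequate = {
    name: percent_confirmed(count, N_HYPOTHESES) > THRESHOLD
    for name, count in confirmed.items()
}
print(percentages)
print(adequate)
```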

    Table 6

    Longitudinal validity of the instruments in patients with a hindfoot fracture

    The standardised response mean (SRM) and the effect size (ES) of the instruments are shown in table 7. The magnitude of change was large for the AOFAS total scale (SRM 0.79, ES 0.63) and moderate for the function subscale (SRM 0.94, ES 0.61). The magnitude of change was small for the one-item subscales pain (SRM 0.26) and alignment (SRM 0.06).
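
    The two indices differ only in their denominator: the SRM divides the mean change by the SD of the change scores, whereas the ES divides it by the SD of the baseline scores. The Python sketch below illustrates these commonly used definitions; the function name and the example scores are hypothetical, and the study’s own definitions are those in its table 2.

```python
from statistics import mean, stdev


def srm_and_es(baseline, follow_up):
    """Return (SRM, ES) for paired baseline and follow-up scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need two equally long lists with >=2 scores")
    changes = [b - a for a, b in zip(baseline, follow_up)]
    mean_change = mean(changes)
    srm = mean_change / stdev(changes)   # standardised response mean
    es = mean_change / stdev(baseline)   # effect size
    return srm, es


# Hypothetical scores at t=1 and t=2 for five patients (not study data).
scores_t1 = [60, 70, 65, 80, 75]
scores_t2 = [70, 78, 80, 85, 90]
srm, es = srm_and_es(scores_t1, scores_t2)
print(round(srm, 2), round(es, 2))
```

    Because the denominators differ, the SRM and ES of the same instrument can fall into different magnitude categories, which is one reason both are usually reported side by side.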

    Table 7

    Responsiveness: standardised response mean (SRM) and effect size (ES) of the instruments in patients with a hindfoot fracture

    Discussion

    The results of this study showed that the AOFAS Ankle-Hindfoot Scale (AOFAS-DLV) has adequate construct validity and is reliable for measuring functional outcome in patients with a hindfoot fracture. However, longitudinal validity and responsiveness were inadequate in the study population.

    Floor effects were not present for the AOFAS-DLV, but all (sub)scales showed an increasing ceiling effect over time. This suggests that an increasing number of patients achieved full recovery over time, which is in line with previous findings.11 17 The single-item subscales pain and alignment showed a ceiling effect from t=1 onwards. This could be because (minor) extra-articular fractures may not affect alignment. The high rate of operative treatment may also have improved alignment, especially for the intra-articular fractures. Alternatively, the limited number of answer options for the pain and alignment subscales and the choice to administer the AOFAS-DLV for the first time at 3–6 months after trauma may also have contributed to the ceiling effects.

    Adequate construct validity of the AOFAS total scale and function subscale is in line with previous research.10 11 The AOFAS subscales pain and alignment did not show adequate construct validity, in contrast with earlier data in ankle fractures.11 The AOFAS pain and alignment subscales each consist of only one item. In the hindfoot series, the correlations with other (sub)scales were generally overestimated for the pain subscale and underestimated for the alignment subscale. This difference is unlikely to be due to (heterogeneity in) (sub)scale scores between the ankle and hindfoot fracture cohorts. There is also no clear pathophysiological explanation for this difference, other than the fact that hindfoot and ankle fractures are different injuries. Another possible explanation may be a difference in the follow-up moments used for hindfoot and ankle fractures.

    With a Cronbach’s α above 0.7, internal consistency of the AOFAS-DLV function subscale was adequate. For the total scale, this remains inconclusive; the Cronbach’s α of 0.585 should be interpreted carefully as the total scale is not unidimensional. In ankle fractures,11 ankle sprains25 and ankle arthroplasty and arthrodesis,26 the Cronbach’s α for the total scale ranged from 0.92 to 0.95. To our knowledge, no recent literature on this topic is available for hindfoot fractures. Deleting the pain question increases Cronbach’s α to 0.843 (data not shown). This may suggest that the pain question is difficult to answer for patients. This could be due to the fact that three out of four answers combine pain severity and frequency. Such linguistic issues have been noted before.26 27

    The ICC values between 0.89 and 0.97 confirm adequate test–retest reliability of the AOFAS-DLV total scale and all subscales. Similar ICCs (ranging from 0.89 to 0.95) were found for the Turkish and Portuguese versions of the AOFAS Ankle-Hindfoot Scale in patients with foot and ankle disorders.11 28 29

    Responsiveness is a product of magnitude of change and longitudinal validity. The longitudinal validity of the AOFAS subscales pain and alignment was adequate (ie, >75% of the hypothesised correlations predicted correctly). However, the AOFAS function subscale and the total scale were not shown to be adequate, as only 58.8% and 70.6% of the predefined hypotheses were confirmed, respectively. The inadequate longitudinal validity makes the AOFAS-DLV less useful for longitudinal studies measuring recovery over time in patients with a hindfoot fracture. Longitudinal validity was adequate for all (sub)scales of the AOFAS-DLV in patients with ankle fractures in previous research.11 In the hindfoot series, the correlations of the difference in score between t=1 and t=2 with other (sub)scales were generally overestimated for the AOFAS function subscale and total scale. As with construct validity, there is no clear pathophysiological explanation for this difference, other than the difference in (severity of) the injuries and the follow-up moments used.

    The magnitude of change was moderate for the AOFAS Ankle-Hindfoot scale DLV as a total, with a SRM of 0.79 and an ES of 0.63. This is comparable to the magnitude of change for the total FFI (SRM 0.89, ES 0.60) and the SF-36 subscales PCS, PF and RP as in our recent study on ankle fractures.11 Previous data for hindfoot injuries are not available.

    The Bland and Altman analysis confirmed absence of systematic bias for repeated recordings of the AOFAS Ankle-Hindfoot Scale-DLV. With an SDC of 1.7 points, the measurement error is very small. This measurement error was lower than reported for a variety of foot and ankle disorders in the Turkish population (SDC 13.3) and for ankle fractures in the Dutch population (SDC 12.0).11 28

    Conclusion

    The AOFAS Ankle-Hindfoot Scale DLV has adequate construct validity and is reliable, making it a suitable instrument for cross-sectional studies investigating functional outcome in patients with a hindfoot fracture. The inadequate longitudinal validity and responsiveness, however, hamper the use of the questionnaire in longitudinal studies and for assessing long-term functional outcome.

    Footnotes

    • Contributors EMMVL, ASDB, DEM, CHVdV, PTDH, WET and MHJV developed the study. ASDB and EMMVL drafted the manuscript. EMMVL acted as trial principal investigator. ASDB, CHVdV, PTDH, DEM and MHJV participated in patient inclusion and outcome assessment. ASDB, WET and EMMVL performed statistical analysis of the study data. All authors have read and approved the final manuscript.

    • Competing interests None declared.

    • Patient consent Obtained.

    • Ethics approval This study has been exempted by the medical research ethics committee (MREC) Erasmus MC (Rotterdam, The Netherlands). Each participant provided written consent to participate and remained anonymised during the study. The study is registered at the Netherlands Trial Register (NTR5613; 05 Jan 2016).

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Data sharing statement All data are processed in this manuscript. There are no further unpublished data from this study available.

    • Collaborators D A Newhall, J Romeo, R J C Tjioe, F Van der Sijde, E N Van der Velden–Macauley, L Vellekoop