Objective To predict functional outcomes 6 months after ankle fracture in people aged ≥60 years using post-treatment and 6-week follow-up data to inform anticipated recovery, and identify people who may benefit from additional monitoring or rehabilitation.
Design Prognostic model development and internal validation.
Setting 24 National Health Service hospitals, UK.
Methods Participants were the Ankle Injury Management clinical trial cohort (n=618) (ISRCTN04180738), aged 60–96 years, 459/618 (74%) female, treated surgically or conservatively for unstable ankle fracture. Predictors were injury and sociodemographic variables collected at baseline (acute hospital setting) and 6-week follow-up (clinic). Outcome measures were 6-month postinjury (primary) self-reported ankle function, using the Olerud and Molander Ankle Score (OMAS), and (secondary) Timed Up and Go (TUG) test by blinded assessor. Missing data were managed with single imputation. Multivariable linear regression models were built to predict OMAS or TUG, using baseline variables or baseline and 6-week follow-up variables. Models were internally validated using bootstrapping.
Results The OMAS baseline data model included: alcohol per week (units), postinjury EQ-5D-3L visual analogue scale (VAS), sex, preinjury walking distance and walking aid use, smoking status and perceived health status. The baseline/6-week data model included the same baseline variables, minus EQ-5D-3L VAS, plus five 6-week predictors: radiological malalignment, injured ankle dorsiflexion and plantarflexion range of motion, and 6-week OMAS and EQ-5D-3L. The models explained approximately 23% and 26% of the outcome variation, respectively. Similar baseline and baseline/6 week data models to predict TUG explained around 30% and 32% of the outcome variation, respectively.
Conclusions Predictive accuracy of the prognostic models using commonly recorded clinical data to predict self-reported or objectively measured ankle function was relatively low and therefore unlikely to be beneficial for clinical practice and counselling of patients. Other potential predictors (eg, psychological factors such as catastrophising and fear avoidance) should be investigated to improve predictive accuracy.
Trial registration number ISRCTN04180738; Post-results.
- ankle fractures
- ankle injuries
- osteoporotic fractures
This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
Statistics from Altmetric.com
Strengths and limitations of this study
Developed and internally validated prognostic models using robust methods.
The Ankle Injury Management trial was a pragmatic study to enhance generalisability and had very low levels of missing data and is reflective of data that are or could be routinely collected during acute hospital admission and clinic follow-up.
Use of an existing clinical trial dataset for developing the prognostic model restricted the choice of potential prognostic factors to those collected in the trial.
Ankle fractures in older people are increasing in number as the population ages.1 Older people have a worse prognosis than younger adults for recovering function after fractures.2 However, there is limited evidence on which injury, treatment and sociodemographic factors predict functional outcomes after ankle fracture in older adults. Previous studies have used small cohorts and few predictive factors.3–5
A multicentre randomised clinical trial of close contact casting versus surgery included 620 people aged 60 years and over with an unstable ankle fracture and found equivalence in outcomes between the treatment groups.6 7 Irrespective of the initial fracture treatment, considerable outcome variation was evident in the trial cohort, highlighting the value of investigating which combination of socio-demographic and clinical prognostic factors could predict functional outcomes in this patient group. A prognostic model could inform patient counselling about prognosis and identify people who might benefit from additional monitoring or rehabilitation.
Develop and internally validate a prognostic model to predict (1) patient-reported and (2) objectively measured functional outcome 6 months after ankle fracture in people aged 60 years or over using sociodemographic and clinical data collected in the acute phase and at 6-week follow-up.
Study design and setting
Data from the Ankle Injury Management (AIM) trial cohort (registration: ISRCTN04180738) were used to develop and internally validate prognostic models. AIM was a pragmatic, multicentre, randomised equivalence trial and economic evaluation comparing close contact casting with open surgical reduction and internal fixation in the treatment of unstable ankle fractures in people aged over 60 years. The trial methods and results have been published elsewhere.6 7 Participants were recruited from May 2004 in a pilot centre, then from 24 centres (general hospitals and major trauma centres) in the UK from July 2010 to November 2013. The primary endpoint was 6 month follow-up. All participants gave written informed consent for data to be used.
Eligible participants were aged 60 years or over and had an acute, overtly unstable ankle fracture (displaced or clinically unstable). Participants with insulin-dependent diabetes mellitus, active leg ulceration, critical limb ischaemia, open fracture, serious concomitant disease, substantial cognitive impairment or ankle arthritis, or who were not fit for anaesthesia, were excluded. Participants were allocated to usual care (open reduction and internal fixation surgery) or a minimally padded close contact cast. Both treatments were conducted under anaesthesia by an orthopaedic surgeon.
The primary outcome to be predicted was ankle function, measured using the Olerud and Molander Ankle Score (OMAS)8 6 months after fracture. The OMAS is a widely used questionnaire to assess outcomes after ankle fracture, covering a range of symptoms and mobility limitations. It is measured on a 0–100 scale, with lower scores representing worse ankle function. Participants reported their OMAS at follow-up clinics or, if unable to attend, over the telephone. If the participant did not have sufficient dexterity to complete the questionnaire independently the researcher acted as scribe.
The secondary outcome to be predicted was the Timed Up and Go (TUG) test,9 an objective measurement of mobility conducted by a blinded outcome assessor. Participants were timed standing up from a chair, walking 8.6 m, turning and returning to sit in the start position. Participants’ ankles had a dressing applied to obscure the presence of lack of surgical incision scars. The TUG test is responsive, valid and reliable assessment of mobility in older adults that has been shown to be predictive of falls risk and functional decline.10–12
Predictor variables were selected based on clinical rationale and availability in the clinical trial dataset (table 1). Two models were produced for each outcome to investigate whether predictor variables collected at 6-week follow-up clinics improved the baseline prediction. Predictor measurements were obtained from participant questionnaires or clinical assessments. Two experienced orthopaedic surgeons with no access to the clinical data assessed fracture misalignment at 6 weeks on anteroposterior or mortise and lateral radiographs.
Sample size was constrained to the size of the AIM trial cohort as this was a pre-existing dataset. The number of variables was limited to those plausibly related the outcome. The OMAS and TUG multivariable models were built using data from 620 participants. We initially assessed the covariates by plotting scatter graphs of outcomes and each continuous covariate and comparing each covariate with another. If highly correlated predictors were found, only one was included in the multivariable modelling. Any predictors or participants with >90% missing data were excluded. Remaining missing data were dealt with using single imputation before building the model.
As the outcomes were continuous, the prognostic models were developed under a linear regression modelling framework. The predictors (table 1) were included as independent variables in the model. Backwards elimination was performed to derive the models using the Akaike information criterion (AIC).
Categorical variables were collapsed into binary variables to ensure sufficient data points in each subcategory. The use of an assistive device for walking before injury was collapsed into whether the patient used assistance (yes: one stick, two sticks, frame/rollator, or wheelchair) or not (no). Walking distance before injury was split into <0.5 miles and >0.5 miles, reflecting the person’s walking endurance. Home support was split into lives alone/with someone or lives with carer/has external care support. Continuous variables were investigated for non-linearity with the outcome using multivariable fractional polynomials.
Clinically plausible interactions were examined for inclusion in the model. Model performance was assessed by calculating the adjusted R2.
The prognostic models were internally validated through bootstrapping using 200 bootstrap samples (replaying all variable selection procedures) to correct for optimism and quantify and adjust the model for overfitting.13 All analyses were conducted using Stata V.14.2. We followed the TRIPOD statement when reporting this study.14
Patient and public involvement
Patient and public involvement (PPI) representatives provided feedback on the research proposal in the planning stages of this study. A PPI representative was an independent member of the AIM trial steering committee.
The dataset contained 620 participants. Participants were aged median 70 (IQR 65–76) years and 459/618 (74%) were female. Tables 2 and 3 list the participant characteristics considered for building the models, the median (IQR) values for continuous variables, and the median OMAS and TUG values for categorical variables.
Two participants had >90% missing data and were omitted from the analysis. Most predictors did not exceed 16% missing data. Most missing data occurred in the baseline EQ-5D-3L postinjury score. There were complete baseline, 6-week and outcome (OMAS and TUG) data for 434/620 (70%) participants.
Predicting patient-reported functional outcome 6 months after ankle fracture
OMAS baseline model
Table 4 shows the eight baseline variables associated with 6 month OMAS and selected for the OMAS baseline model: preinjury OMAS, alcohol consumed per week (units), postinjury EQ-5D-3L visual analogue scale (VAS), sex, walking distance, walking aid, smoking status and health status. For instance, if all variables were kept constant, the 6-month OMAS for women was on average approximately 7 points lower than for equivalent men (p<0.001, 6-month OMAS median (IQR), 70 (50–80)). A 10-point difference in the preinjury OMAS for two identical individuals led to an average difference of approximately 2 points in the 6-month OMAS. The units of alcohol consumed in a week and the postinjury EQ-5D-3L VAS had marginal (p=0.043) and non-influential (p=0.102) effects on the 6-month OMAS, respectively. People who could walk further than 0.5 miles before injury had an approximately 8-point higher OMAS than those who could not, and people who required a walking aid before injury were more likely to have an approximately 8-point lower OMAS.
The initial model containing all candidate predictors had an R2 value of 0.25. The final model had an R2 value of 0.238 and an adjusted R2 of 0.223: the model’s predictors explained approximately 22% of the outcome variation.
OMAS baseline model internal validation
After bootstrapping, the optimism-corrected R2 performance estimate was 0.228. The model’s performance on the internal validation was similar to that achieved on the original dataset, indicating no evidence for any overfitting.
OMAS baseline/6-week model
Adding the 6-week follow-up variables produced a model similar to the baseline model, minus postinjury EQ-5D-3L VAS, plus five 6-week follow-up variables: presence of radiological malalignment, range of motion in the injured ankle for dorsiflexion and plantar flexion, 6-week OMAS and 6-week EQ-5D-3L (table 4). A patient with radiological malalignment at 6 weeks would, on average, have a 6-month OMAS approximately 4 points lower than a patient without malalignment. Keeping all other variables constant, 20° more ankle motion in both dorsiflexion and plantar flexion at 6 weeks would result in an approximately 4-point higher 6-month OMAS. The higher the 6-week OMAS and EQ-5D-3L, the greater the 6-month OMAS (p<0.001 and p=0.002, respectively).
The baseline/6-week model had an adjusted R2 value of 0.261 which was higher than the baseline model’s adjusted R2. Including the 6-week follow-up variables explained around 4% more variation than the baseline model.
OMAS baseline/6-week model internal validation
After bootstrapping, the optimism-corrected R2 performance estimate was 0.264, similar to that on the original dataset. Approximately 26% of the variation in the 6 month OMAS was explained by the model, an increase of 3% from the baseline linear multivariable model.
In an exploratory analysis, a model using only the 6-week variables was developed. The adjusted R2 value was 0.123, which was significantly lower than the model performance for the baseline model.
Predicting objectively measured functional outcome 6 months after ankle fracture
TUG baseline model
Table 5 shows the baseline variables associated with the TUG time at 6 months and included in the baseline TUG model. On average women had around 6 s (p<0.001) longer TUG times than men (6-month TUG median (IQR), 18 s (14 to23)). Holding all other variables constant, a 10-point increase in the mini-mental state examination score led to a 2 s decrease in the predicted TUG time (p<0.001). A 10-year age difference between two otherwise identical individuals resulted in an 8 s longer predicted TUG time for the older individual (p<0.001). Those with higher postinjury EQ-5D-3L VAS were more likely to have a quicker TUG time at 6 months (p<0.001). People able to walk further than 0.5 mile before injury had a TUG time approximately 20 s quicker than an equivalent person who could only walk less than 0.5 mile (p<0.001). People who required a walking aid before injury were 19 s slower, on average, than those who did not. Fracture pattern and preinjury OMAS predicted the TUG score well: a Weber C fracture pattern and low preinjury OMAS predicted a longer TUG time.
The initial model containing all possible variables had an R2 value of 0.321, indicating that all of the variables chosen for consideration in model building together explained 32% of the variation in the TUG outcome. The final baseline model had an R2 value of 0.308 and an adjusted R2 of 0.298, indicating that the model’s predictors explained approximately 30% of the outcome variation.
TUG baseline model internal validation
After bootstrapping, the optimism-corrected R2 performance estimate was 0.293. The model’s performance on the internal validation was very similar to that achieved on the original dataset, explaining approximately 30% of the variation in the 6 month TUG score.
TUG baseline/6-week model
When the 6-week variables were included in building a model to predict the 6-month TUG time, four baseline variables were removed. The model-building process selected another baseline variable, home support, for inclusion in their place. Its inclusion led to spurious results as there were little data in one of the subcategories (lives with carer/has external care support). The variable was therefore removed from the final model. The final model also included three 6-week variables: readmission, 6-week EQ-5D-3L and 6-week EQ-5D-3L VAS (table 5).
If a person was readmitted in the first 6 weeks after injury, they were predicted to have a 4 s faster TUG time at 6 months (p=0.04), if all other variables were held constant. Those with a better EQ-5D-3L score at 6 weeks were more likely to have a quicker TUG (p<0.001). If all other variables were kept constant, an increase of 20 points in the 6-week VAS resulted in a 1.5 s quicker time (p=0.04).
The baseline/6-week data model had a greater adjusted R2 (0.321) than the baseline data model (0.308). The model now explained 32% of the variability in TUG at 6 months, 2% more than that explained by the baseline data model.
TUG baseline/6-week model internal validation
After bootstrapping, optimism-corrected R2 performance estimate was 0.314. The model’s performance on the internal validation dataset was very similar to that achieved on the original dataset, explaining approximately 31% of the variation in the 6 month TUG time.
In an exploratory analysis, a model using only the 6 week variables was developed. The adjusted R2 value was 0.109, which was lower than the model performance for the baseline model.
By using single imputation for the missing values in the 6-month OMAS and TUG outcomes, we assumed that the missing values occurred at random, so that variables in the dataset could predict the occurrence of missing values. To check how sensitive the four models were to this assumption, we conducted a complete-case analysis, which assumes that missing values occur completely at random.
Twenty-six participants were omitted from the OMAS models, leaving 592 participants. The same variables were selected for the baseline data model. The same variables were selected for the baseline/6 week data model, plus postinjury EQ5D-3L VAS and ‘started partial weight bearing’. Both models had marginally lower adjusted R2 values: 0.205 for the baseline data model, 2% lower than the original model, and 0.256 for the baseline/6-week data model, 0.5% lower than the original model. After bootstrapping, the optimism-corrected R2 performance estimates of 0.215 for the baseline data model and 0.268 for the baseline/6-week data model.
Sixty-eight participants were omitted from the TUG models, leaving 550 participants. The baseline data model included the same variables as before, but replaced baseline postinjury EQ-5D-3L VAS with baseline recall preinjury EQ-5D-3L VAS and omitted preinjury OMAS. The baseline/6-week data model included the same variables as before, plus fracture pattern and walking aid. Both models had lower adjusted R2 values: 0.222 for the baseline data model, 8% lower than the original model, and 0.264 for the baseline/6-week data model, 6% lower than the original model. After bootstrapping, the optimism-corrected R2 performance estimates were 0.215 for the baseline data model and 0.253 for the baseline/6-week data model.
A prognostic model has the potential to inform anticipated recovery, and identify people who may benefit from additional monitoring or rehabilitation. We developed and internally validated four prognostic models for 6-month outcomes after ankle fracture using recommended methods13 and a wide range of regularly collected data on plausible prognostic factors from hospital admission and at 6 week follow-up clinics. The OMAS baseline data model included: alcohol per week (units), postinjury EQ-5D-3L VAS, sex, preinjury walking distance and walking aid use, smoking status and perceived health status. The baseline/6-week data model included the same baseline variables, minus EQ-5D-3L VAS, plus five 6-week predictors: radiological malalignment, injured ankle dorsiflexion and plantarflexion range of motion and 6-week OMAS and EQ-5D-3L. The models explained approximately 23% and 26% of the outcome variation, respectively. Similar baseline and baseline/6-week data models to predict TUG explained around 30% and 32% of the outcome variation, respectively. Adding 6-week follow-up variables to baseline variables provided only a modest benefit. This benefit arguably does not outweigh the logistical issues of obtaining patient information at 6 weeks to improve prediction accuracy. The models performed similarly on the development dataset and bootstrapped internal validation datasets.
Predictive accuracy of the four models using commonly recorded clinical data to predict self-reported or objectively measured ankle function 6 months after unstable ankle fracture in adults aged over 60 years was relatively low. As there are limitations in predictive performance, we do not recommend using these prognostic models in clinical practice as decision-making tools in isolation.
The variables of most predictive value across the prognostic models were sex, with females having worse outcomes than males. A cohort of 584 severe ankle sprain participants also found that females had worse outcomes.15 Self-reported preinjury walking distance and preinjury walking aid use was also predictive of outcome, with those able to walk less than 0.5 miles doing and using a walking aid faring worse than those able to walk further and not needing aids. These factors are features of declining locomotor function and are likely to be related to a greater levels of frailty preinjury.16
The inferences presented here cannot be extended to propose causal relationships between the predictors and outcomes, as we did not consider issues such as confounding. As the aim and design of this study prioritised optimising prognostic accuracy, predictors were considered without considering whether they could form a causal pathway or not.17
Comparisons with other studies that have developed prognostic models for people after ankle fracture are challenging as, to the best of our knowledge, this is the first study to focus on older adults, use a larger cohort, and internally validate the developed models to adjust for optimism. A two-centre observational study of 60 adults (mean age 49) reported that ankle dorsiflexion and fracture classification (number of malleoli injured) predicted OMAS 6 months after cast removal (R2=0.4).3 In a larger cohort of 150 adults (mean age 46), Lin and colleagues7 developed a model to predict patient-reported lower limb function (Lower Extremity Functional Scale)18 at 4 and 12 weeks after cast removal. The models were not internally validated, but were externally validated in a separate cohort of 94 participants. Eight predictors were examined. Pain and dorsiflexion after cast removal were included in the final model, which explained about 15% of the outcome variability in the development stage and even less in the external validation.
The lack of predictive power of both our models and those in the literature indicate that functional outcomes after ankle fracture may be difficult to predict, or other prognostic factors may need to be considered to improve predictions of functional outcome. Our models may be missing informative predictors that were not captured in the AIM trial dataset. More detailed injury characteristics, such as whether fractures were uni-malleolar, bi-malleolar or tri-malleolar, were not explored. The extent of articular damage assessed by specialist imaging could provide useful prognostic information. However, any techniques using specialist equipment or personnel would hamper clinical utility for routine use and would be difficult to implement if there were resource implications. Further research is recommended to investigate whether a wider range of psychosocial and environmental factors19 can enhance predictive accuracy. There is preliminary evidence that catastrophising and fear avoidance are potential psychological prognostic factors that warrant further investigation in people after ankle fracture.20
Similar results were found for the main and sensitivity analyses of both the OMAS and TUG models. The assumption that data were missing at random was therefore plausible and conducting single imputation before model building was probably appropriate.
This study was limited by the use of an existing clinical trial dataset for developing the prognostic model, as the choice of potential prognostic factors were restricted to those collected in the trial. Clinical trial cohorts are usually more selective than the wider clinical population. However, the AIM trial was a pragmatic study to enhance external validity. The use of a clinical trial cohort to develop prognostic models is also not uncommon, and has resulted in robust prognostic models for other ankle injury populations that have been successfully externally validated.21 Strengths of using this clinical trial dataset were that it had very low levels of missing data and that it reflected data that are or could be routinely collected during acute hospital admission and clinic follow-up.
We developed and internally validated prognostic models to predict functional outcomes 6 months after unstable ankle fracture in older adults using commonly recorded clinical data. These prognostic models had relatively limited accuracy in predicting self-reported or objectively measured ankle function. Other potential predictors (eg, psychological factors such as catastrophising and fear avoidance) should be investigated.
We thank the AIM trial participants and collaborators. We acknowledge Jennifer A. de Beyer of the Centre for Statistics in Medicine, University of Oxford, for English language editing and William Soanes for contributing to the statistical analysis plan.
Contributors DJK devised the study, obtained and interpreted the data and led writing the manuscript. KV analysed and interpreted the data. KW obtained the data and contributed to the study design and interpretation. DM contributed to data management and analysis. MLC contributed to the study design and data interpretation. GSC oversaw the analysis plan development and the conduct of the analysis. SEL obtained the data and contributed to the study design and interpretation. All authors critically reviewed the manuscript.
Funding This report is independent research supported by the National Institute for Health Research (NIHR Post Doctoral Fellowship, Dr David Keene, PDF-2016-09-056). The report was supported by the NIHR Biomedical Research Centre, Oxford. Professor Lamb receives funding from the NIHR Collaboration for Leadership in Applied Health Research and Care Oxford at Oxford Health NHS Foundation Trust.
Disclaimer The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health and Social Care.
Competing interests None declared.
Ethics approval The National Research Ethics Service Oxfordshire Committee approved use of data. All participants gave written informed consent for data to be used.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All data requests should be submitted to the corresponding author for consideration. Access to anonymised data may be granted following review. Exclusive use will be retained until the publication of major outputs.
Patient consent for publication Not required.
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.