Objectives Hospital-acquired acute kidney injury (HA-AKI) is associated with a high risk of mortality. Prediction models or rules may identify those most at risk of HA-AKI. This study externally validated one of the few clinical prediction rules (CPRs) derived in a general medicine cohort using clinical information and data from an acute hospital's electronic system on admission: the acute kidney injury prediction score (APS).
Design, setting and participants External validation in a single UK non-specialist acute hospital (2013–2015, 12 554 episodes); four cohorts: adult medical and general surgical populations, with and without a known preadmission baseline serum creatinine (SCr).
Methods Performance was assessed by discrimination, using the area under the receiver operating characteristic curve (AUROC), and by calibration.
Results HA-AKI incidence within 7 days (Kidney Disease: Improving Global Outcomes (KDIGO) change in SCr) was 8.1% (n=409) in medical patients with known baseline SCr, 6.6% (n=141) in those without a baseline, 4.9% (n=204) in surgical patients with baseline and 4% (n=49) in those without. Across the four cohorts AUROCs were: medical with known baseline 0.65 (95% CIs 0.62 to 0.67) and no baseline 0.71 (0.67 to 0.75), surgical with baseline 0.66 (0.62 to 0.70) and no baseline 0.68 (0.58 to 0.75). For calibration, in the medical and surgical cohorts with baseline SCr, Hosmer-Lemeshow p values were non-significant, suggesting acceptable calibration. In the medical cohort, at a cut-off of five points on the APS to predict HA-AKI, positive predictive value was 16% (13–18%) and negative predictive value 94% (93–94%). Of medical patients with HA-AKI, those with an APS ≥5 had a significantly increased risk of death (28% vs 18%, OR 1.8 (95% CI 1.1 to 2.9), p=0.015).
Conclusions On external validation the APS on admission shows moderate discrimination and acceptable calibration to predict HA-AKI and may be useful as a severity marker when HA-AKI occurs. Harnessing linked data from primary care may be one way to achieve more accurate risk prediction.
Strengths and limitations of this study
This large study is one of the few external validations of an acute kidney injury (AKI) prediction rule in general medical and surgical hospital populations, including patients without a baseline serum creatinine (SCr).
Stringent inclusion criteria (stay at least one night, first admission, repeat SCr performed) excluded a large proportion of ‘low-risk’ patients, recognised as such by the clinical team.
The single-centre cohort, in the same geographical area as the derivation site, cautions against generalisability.
Medical histories relied on previously coded events on the hospital database and are thus likely to underestimate disease prevalence.
Reliance on a baseline SCr that could have last been measured a number of months ago means that a proportion of the patients considered to have hospital-acquired AKI may have already fulfilled the criteria for a change in SCr prior to admission, that is, have community-acquired AKI.
Acute kidney injury (AKI) is a clinically diagnosed syndrome defined as an acute increase in serum creatinine (SCr) and/or a reduction in urine volume.1 One in five adults worldwide experiences AKI during a hospital admission and, despite significant developments in hospital care, mortality associated with AKI remains high.2–7 Deficits in the recognition and subsequent management of patients who have developed AKI have been shown,8 ,9 and recent studies suggest that around 60–79% of AKI cases arise in the community and can be flagged at admission.10–15 However, a significant proportion of AKI develops in hospital (hospital-acquired AKI, HA-AKI) and systematically highlighting patients at the highest risk of HA-AKI is of great interest.16 ,17 Given that a continuum of injury exists before loss of excretory kidney function can be measured with standard laboratory tests (ie, SCr), one method that could be used to highlight patients at an earlier stage before injury is the use of prediction models.18–21 Most prediction research involves derivation and internal validation; however, external validation is crucial to address overfitting and generalisability.22–26 The majority of HA-AKI prediction models focus on high-risk specialist groups such as cardiac surgery, with few studies in acute medicine or surgery.15 ,27–29 One such model is the acute kidney injury prediction score (APS), derived in acute medicine patients in a single UK centre, which uses a combination of comorbidities and acute physiological variables derived from a hospital electronic data system (see online supplementary appendix table A1 for variables). Discrimination assessed by the area under the receiver operating characteristic curve (AUROC) to predict HA-AKI within 7 days was 0.72 in derivation and 0.76 in an internal validation cohort (without a baseline SCr).15 The primary aim of this study was to externally validate the APS in a general medical population with a known baseline SCr. 
The presence of a known baseline accurately accounts for patients with chronic kidney disease (CKD) and therefore allows more confident conclusions to be drawn as to whether acute deterioration in renal function has already occurred at admission (community-acquired AKI, CA-AKI). Secondary aims were validations of the APS in (1) a general medical population without a known baseline SCr and (2) general surgical populations, again with and without a known baseline SCr. Published guidance for reporting was followed.22
A retrospective observational cohort external validation study of the APS (see online supplementary appendix table A1 for variables and weightings) was performed on the adult medical and surgical units of the St Richard's Hospital site of Western Sussex Hospitals NHS Foundation Trust (WSHFT), for the period March 2013 to February 2015. WSHFT is an 870-bed Trust on the South Coast of England, with a combined annual emergency department attendance of over 150 000 and 60–80 acute medical or surgical admissions per 24-hour period. St Richard's is a separate hospital site from Worthing hospital, where in 2011 the APS was derived on general medical patients. There was no cross-site contamination of staff, though catchment populations for the two sites, 20 miles apart, are similar. Surgery at the Trust includes general surgery, urology and trauma and orthopaedics, but does not include major trauma, neurosurgery, cardiac, major vascular or transplantation surgery (see online supplementary appendix table A2 for clinical and demographic values for the derivation and presented external validation cohorts).15
At admission, all inpatients (acute medicine and elderly care, emergency and elective surgery) routinely had physiological observations measured and entered via handheld systems into the clinical data software system (Patientrack Sydney, NSW, Australia). Previous International Statistical Classification of Diseases 10th revision (ICD-10) electronically coded histories (heart failure, liver disease and diabetes mellitus) were retrieved, and CKD was defined as an estimated glomerular filtration rate (eGFR) <60 mL/min prior to admission (in those with an available baseline SCr) using a national algorithm that has been shown to perform well.30 Kidney Disease: Improving Global Outcomes (KDIGO) criteria for AKI were employed (SCr increase to ≥1.5 times the admission value or by ≥26.5 μmol/L within a rolling 48 hours during the first 7 days of the patient's first admission in the study period). Patients were included if they were ≥18 years of age, stayed at least one night in the medical or surgical division over the 24-month period (2013–2015) and had at least one SCr repeated. Exclusion criteria were:
patients with AKI on admission (defined using KDIGO change from baseline SCr or absolute SCr value ≥354 μmol/L);
patients moved directly from the accident and emergency (A&E) department to the intensive care unit (ICU) (as neither area uses the Patientrack data system);
obstetrics and gynaecology admissions;
patients discharged without spending a night in hospital.
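The SCr-change criterion for HA-AKI described above can be illustrated with a minimal sketch. It is not the study's implementation: the function name, the input layout and the choice to treat the earliest measurement as the admission value are assumptions made for illustration, as is the reading of "rolling 48 hours" as a comparison with any earlier value taken no more than 48 hours before.

```python
from typing import List, Tuple

RATIO_THRESHOLD = 1.5       # rise to >=1.5 times the admission SCr
ABSOLUTE_RISE_UMOL = 26.5   # rise of >=26.5 micromol/L
ROLLING_WINDOW_H = 48.0     # window for the absolute-rise criterion
FOLLOW_UP_H = 7 * 24.0      # first 7 days of the admission


def has_ha_aki(measurements: List[Tuple[float, float]]) -> bool:
    """Return True if an SCr series meets the SCr-change criterion.

    measurements: (hours_since_admission, scr_umol_per_l) pairs. The
    earliest pair is treated as the admission value (an assumption).
    """
    series = sorted(m for m in measurements if 0 <= m[0] <= FOLLOW_UP_H)
    if len(series) < 2:
        return False  # no repeat SCr, so a change cannot be assessed
    admission_scr = series[0][1]
    for i, (t_now, scr_now) in enumerate(series[1:], start=1):
        # Relative criterion against the admission value.
        if scr_now >= RATIO_THRESHOLD * admission_scr:
            return True
        # Absolute criterion against any earlier value within 48 hours.
        for t_prev, scr_prev in series[:i]:
            within_window = t_now - t_prev <= ROLLING_WINDOW_H
            if within_window and scr_now - scr_prev >= ABSOLUTE_RISE_UMOL:
                return True
    return False
```

For example, a hypothetical patient with an SCr of 80 μmol/L at admission and 125 μmol/L at 24 hours would be flagged by the relative criterion, whereas a rise from 100 to 127 μmol/L spread over 100 hours would meet neither criterion.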
Patients were followed up until discharge from hospital, or death, during the inpatient spell. The analysis was performed for the patients' first admission where more than one occurred in the study period. As all patients admitted had an APS calculated automatically by the Patientrack© system before any outcome had occurred, there were no missing demographic or physiological data at admission. As history relied on previously coded events (from attending hospital), such data could potentially have been missing. None of the researchers involved in analysis of the data were involved in the management of the patients. The research team members responsible for data analysis had access only to the fully anonymised individual-level data and were blinded to other patient data, as well as to the components of the calculated scores in the hospital information system. Since there is no consensus on how to determine what counts as an adequate sample size in such studies, all available (19 276) hospital cases for the period 2013–2015 were included in the analysis.22
Following logistic regression analysis, discrimination was assessed by the area under the receiver operating characteristic curve (AUROC),22 and calibration by the Hosmer-Lemeshow (H-L) test, where a significant result (p<0.05) may indicate poor fit of a rule or model, though this has limitations in larger datasets.31 ,32 Calibration was additionally assessed graphically by plotting predicted probabilities (x-axis) against the observed event rate (y-axis) of the outcome and deriving a linear function (an intercept of zero and a slope of one indicating perfect calibration).22 Predictive values and likelihood ratios were calculated to further inform on the way model performance could impact on clinical workload. Following extraction, all data were fully anonymised in Microsoft Excel and analyses performed in SPSS V.22.
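The two performance measures described above can be sketched in a few lines. This is an illustration only: the study's analyses were run in SPSS, and the function names and toy data here are hypothetical. The AUROC is computed as the probability that a randomly chosen patient with the outcome scores higher than one without (ties counting half), and the calibration line is an ordinary least-squares fit of observed event rate on predicted probability.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case outscores a random non-case."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:                 # pairwise comparison, O(n * m)
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5         # ties count as half a win
    return wins / (len(cases) * len(controls))


def calibration_line(predicted: Sequence[float],
                     observed: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (intercept, slope) of observed rate on predicted risk."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predicted, observed))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical admission scores and outcomes (1 = HA-AKI), not study data.
toy_scores = [0, 1, 2, 3, 5, 7]
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_auroc = auroc(toy_scores, toy_outcomes)
```

With these toy data, eight of the nine case/non-case pairs are correctly ordered. A calibration line with an intercept near zero and a slope near one corresponds to the perfect calibration described in the text.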
From an initial 19 276 patient episodes in medicine (including elderly care) and surgery (see figure 1) over a 2-year period, 12 554 episodes were analysed (n=7170 in medicine and n=5384 in surgery) after excluding patients with AKI on admission (n=782) or with no repeat SCr while in hospital (n=5940). Over a quarter of patients (n=3329) had no baseline SCr. Table 1 summarises the four groups in terms of clinical and demographic data. Incidence of HA-AKI was:
medical patients with a known SCr baseline 8.1%;
medical patients without a known baseline SCr 6.6%;
surgical patients with a known baseline SCr 4.9%;
surgical patients without a known baseline SCr 4%.
The medical cohort was significantly older than the surgical cohort (78 years (65–86) vs 67 (51–77), respectively, p≤0.001). There was also a higher frequency of morbidity, particularly heart failure, in the medical population, who also had longer hospital stays. Inpatient mortality was increased in those with HA-AKI across all four groups, but in absolute terms this was greatest in medical patients: in those with a baseline SCr, the observed mortality was 21.5% in those who developed HA-AKI versus 4.5% in those without HA-AKI (p≤0.001).
For the primary analysis in medical patients with a baseline SCr the AUROC was 0.65 (95% CI 0.62 to 0.67) and the H-L p=0.064. In medical patients without a baseline SCr the AUROC was 0.71 (0.67 to 0.75) and the H-L p=0.014. In surgical patients with a baseline SCr the AUROC was 0.66 (95% CI 0.62 to 0.69) and the H-L p=0.093; in surgical patients without a baseline SCr the AUROC was 0.67 (0.58 to 0.75) and the H-L p=0.664 (see figure 2).
Calibration plots (figure 3) for the association between predicted probabilities and observed event rates in medical and surgical cohorts (with a baseline SCr) demonstrated agreement at low probability rates, while at higher rates calibration deviated in the medical cohort, though the number of events was small. Table 2 compares the predicted rates of HA-AKI (from the original derivation study cohort) with observed rates in the validation medical cohort (with baseline SCr), grouped according to APS admission score.15 The table shows that in the validation cohort 4% of patients scoring 0–2 APS points developed HA-AKI, compared with 28% of those scoring ≥7 points, with an OR of 4.7 (95% CI 3.1 to 7.2).
At a cut-off of five points on the APS for medical patients with baseline SCr, sensitivity was 34% (30–39%), specificity 82% (81–83%), positive predictive value (PPV) 16% (13–18%), negative predictive value (NPV) 94% (93–95%), positive likelihood ratio (PLR) 2.1 (1.8–2.4) and negative likelihood ratio (NLR) 0.79 (0.73–0.84). Lowering the cut-off to a score of three points increased sensitivity to 82% (78–86%), with a reduction in specificity to 37% (36–39%), with PPV 10% (9–12%) and NPV 96% (95–97%), PLR 1.3 (1.25–1.38) and NLR 0.48 (0.39–0.59).
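For readers less familiar with these measures, the sketch below shows how each is obtained from a 2×2 table of score threshold against outcome. The function name and the counts are hypothetical and chosen only for round numbers; they are not the study's data, and CIs are omitted.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy measures for a score threshold from a 2x2 table.

    tp/fn: patients with the outcome at or above / below the cut-off.
    fp/tn: patients without the outcome at or above / below the cut-off.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "plr": sensitivity / (1 - specificity),   # positive likelihood ratio
        "nlr": (1 - sensitivity) / specificity,   # negative likelihood ratio
    }


# Hypothetical counts: 100 patients with the outcome, 900 without.
example = diagnostic_metrics(tp=30, fp=180, fn=70, tn=720)
```

Because PPV and NPV depend on how common the outcome is, a rule applied where the outcome affects fewer than 1 in 10 patients will tend to show a modest PPV and a high NPV even with a reasonable likelihood ratio.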
In the medical cohort with a known baseline SCr, among patients who developed HA-AKI, an APS at or above the cut-off of five points was associated with significantly higher inpatient mortality: 28% of those with a score ≥5 points died versus 18% of those with a score <5 points (OR 1.8 (95% CI 1.1 to 2.9), p=0.015). This was also found in patients without a baseline SCr (mortality 42% vs 19%, p=0.029).
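The relationship between the two mortality proportions and the OR can be sketched as follows. The function is illustrative: it uses the Woolf (log-based) interval, which is an assumption, since the text does not state how the CI was derived, and the counts per 100 patients are hypothetical stand-ins for the reported percentages.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf (log-based) confidence interval.

    a, b: deaths and survivors in the higher-score group.
    c, d: deaths and survivors in the lower-score group.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts per 100 patients mirroring 28% vs 18% mortality.
or_est, or_low, or_high = odds_ratio_ci(a=28, b=72, c=18, d=82)
```

The point estimate from these proportions is approximately 1.8, in line with the reported OR. The interval from the hypothetical counts is not expected to match the reported CI, which depends on the actual group sizes.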
This study represents one of the few external validations of a prediction model for HA-AKI, the APS, for acute general medical and surgical cohorts. Moderate discrimination and satisfactory calibration were found in cohorts with and without a known baseline SCr. External validation addresses optimism and generalisability.22–24 ,33 ,34 Discrimination, assessed by AUROC analysis, was inferior to that in the derivation study (0.65 in the comparable medical cohort with baseline SCr vs 0.72, though 95% CIs crossed), a common finding reflecting overfitting from derivation design or modelling (sample size and selection of variables) and clinical factors such as case mix.23 ,33 ,35 However, the observed rates of HA-AKI (8.1% vs 7.2% in the derivation) and associated inpatient mortality were similar in those with the outcome (20.5% vs 21.5%) and those without the outcome (3.5% vs 4.5%) in derivation and validation cohorts, respectively. The validation cohort was older and had higher rates of diabetes and heart failure, whereas rates of CKD and liver disease were lower. The two physiological observations included in the APS (reduced level of consciousness and tachypnoea) were also less commonly abnormal in the validation cohort (see online supplementary appendix table A2 for summary).
Different methods were used in the two studies, for example, to define CKD (eGFR >1 and <6 months prior to admission in derivation vs national baseline in this external validation). It is unclear to what extent this would have affected the results, though patients could conceivably have already developed AKI in the community. For example, if the last available SCr was measured a number of months ago, HA-AKI may be overestimated and CA-AKI underestimated. The validation uses the pragmatic national baseline definition that will be applicable in clinical practice and allows automation in an electronic hospital system. The original APS study had relatively few events per variable, as well as high weighting for two infrequent variables (liver disease and conscious level), which may account for reduced model performance in validation. In the larger validation population these variables, though still significant, were rarely present. Furthermore, the derivation cohort included patients who had more than one admission, while the external validation cohort included only the first admission during the study period. Patients with recurrent admissions may have been prone to recurrent AKI episodes, and in particular a prior episode of AKI could be incorporated to improve model prediction. This is an area of future interest.
However, currently employed statistical markers of performance also have limitations. Correctly predicting a future event is more complex than diagnostic prediction, partly because of the time elapsed between prediction and outcome, which introduces a significant stochastic element; accurate risk stratification is therefore often the best that can be achieved. AKI, defined by a nominal change in SCr, may be the manifestation of a number of acute precipitants (eg, sepsis, vomiting) affecting a susceptible host (eg, a patient with heart failure and reduced physiological reserve), making accurate prediction with high discrimination at a single time point (hospital admission) impossible.36 ,37 Given there is no consensus on what constitutes acceptable discrimination, descriptors such as ‘poor’ or ‘good’ are often ascribed to a particular AUROC, but the AUROC has shortcomings, including a narrow focus on accuracy that may not incorporate information on consequences.38–40 For example, if a false-negative result is more harmful than a false-positive result, a model with much greater specificity but slightly lower sensitivity than another would have a higher AUROC, but may be a poor clinical choice.41 With the assumption of uniform distribution of risk, the maximum AUROC is 0.83.42 In a perfectly calibrated model in a population with an average 10-year risk of 10% with relatively little spread, risk is centred around 10%, and the maximum AUROC is 0.63.40 Though calibration is often overlooked, it is crucial in risk prediction,40 ,43 and in this validation the calibration plots suggest adequate fit between predicted probabilities and observed frequencies of clinical value. More nuanced alternatives are yet to be widely used.41 ,44 ,45
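The ceiling of 0.83 under uniformly distributed risk can be checked numerically. The sketch below assumes that "uniform distribution of risk" means each patient's true risk p is drawn from Uniform(0, 1) and that the model is perfectly calibrated, so that the model reports p itself. Under those assumptions the risk density is 2p among patients with the outcome and 2(1 − p) among those without, and the AUROC is the integral of 2p(2p − p²) over [0, 1], which equals 5/6. The function name is hypothetical.

```python
def max_auroc_uniform_risk(steps: int = 100_000) -> float:
    """AUROC of a perfectly calibrated model when true risk ~ Uniform(0, 1).

    Cases have risk density 2p; non-cases have density 2(1 - p), so the
    non-case cumulative distribution is 2p - p**2. The AUROC is the
    probability that a case's risk exceeds a non-case's risk.
    """
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * width              # midpoint rule
        case_density = 2 * p
        noncase_cdf = 2 * p - p * p
        total += case_density * noncase_cdf * width
    return total


ceiling = max_auroc_uniform_risk()
```

The numerical result agrees with the analytical value of 5/6, that is, about 0.83. A narrower spread of true risk lowers this ceiling, which is the point made in the text about populations whose risk clusters around a single value.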
Potential clinical use of the APS
First, using risk ranges, an APS score of 0–2 points (encompassing 36% of all medical patients with a baseline SCr) was associated with a low risk of HA-AKI (4%), while a score ≥7 points carried a 28% risk.
Second, at five APS points (18% of patients) PPV was relatively low (16% (13–18%)); however, the high NPV of 94% (93–95%) suggests an ability to identify those at low risk (as a rule-out) and, importantly, even if HA-AKI developed, patients with an APS <5 points were significantly less likely to die than those with a higher APS.
Third, with only seven variables the APS is relatively simple to calculate. Fourth, all variables can potentially be automatically calculated in (the increasing number of) hospitals with electronic records containing clinical coding and physiological observation systems.
Fifth, the score in this validation performed with similar discrimination in two other cohorts—general surgery and in patients without a baseline SCr.
Finally, the score has potential as an aide-mémoire to variables associated with HA-AKI, consistently found on systematic review of other AKI prediction models.46
Appraising other models can inform whether a model is worth further investigating or implementing, or highlight well-developed, validated alternatives. In the field of HA-AKI prediction only one other model for general emergency medical and surgical admissions, from a UK study (to predict HA-AKI at 72 hours), has external validation evidence, with an AUROC of 0.67 in derivation and 0.71 in external validation.29 The variables included (age, previous admissions, admission diagnostic category, seven laboratory parameters, Charlson comorbidity index and proteinuria) may be difficult to automate, and inclusion of admission diagnoses is a shortcoming if use is required at the earliest opportunity. Two older studies, one a retrospective cohort and the other including 27 variables, have not been externally validated.27 ,28 Importantly, all three models may be difficult to calculate at the bedside (see supplementary appendix table A3 for a summary of these models with a corresponding Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidance reporting score).22
A large number of postsurgical models have been derived, principally in cardiac and transplantation surgery, although the outcome is often the use of renal replacement therapy, with its inherent flaws; when KDIGO AKI definitions have been used in external validations, AUROCs range from 0.61 to 0.74.47–51 In contrast-induced AKI several models have been developed; however, most use specialty-specific variables, limiting extrapolation to a general population.46 In patients with acute heart failure the single externally validated model produced an AUROC of 0.65 in both external studies.52 ,53 The only externally validated general surgical model, derived in the USA, had an AUROC of 0.66 when transported to a Chinese population.54 Finally, the UK study in Trauma and Orthopaedics performed external validation in the same publication, reporting an AUROC of 0.70.55 Thus, in a number of fields externally validated prediction models for HA-AKI are relatively rare and have moderate discriminatory performance.
This large study is one of the few external validations in general populations.
In England, for example, general medicine, surgery and orthopaedics are the three most common divisions under which patients are admitted.56 Though few studies break down HA-AKI by specialty, two UK teaching hospital studies suggested over half of AKI episodes were in these general specialties.57 ,58 Stringent inclusion criteria (stayed at least one night, repeat SCr performed) would have excluded a large proportion of ‘low-risk’ patients, recognised as such by the clinical team, making the group in question of clinical relevance. Using only the first admission during the study period avoids the effects of recurrent admissions. It is likely that those with multiple attendances would be at higher risk of HA-AKI as well as scoring higher on the APS, which takes into account age and (coded) comorbidities, associated with an increased risk of readmissions.59–62 Model updating could assess the effects of a prior episode of AKI, for example. While discrimination of a single admission APS score was only moderate, this represents a single time point and an outcome that could occur up to 7 days later. If, for example, a physiological deterioration occurred following admission (with tachypnoea or reduced conscious level) this would be reflected in a higher APS. Patients without a known baseline SCr (a quarter of the medical cohort) pose particular problems: first, CKD contributes to the APS and second, assessment of whether the AKI was community or hospital acquired is more difficult. Such patients in general would be at lower risk than those with recent available results, as the latter would be more likely to have been having tests to monitor chronic disease or prior illnesses. The study demonstrated that in this cohort the APS had similar performance to that in those with a known baseline SCr.
The single-centre nature of the cohort cautions against generalisability. Also, although data were collected prospectively by the electronic hospital record, past medical histories relied on previously coded events on the hospital database, and are thus likely to underestimate disease prevalence. This could be addressed by (planned) improved linkage with primary care databases that could also incorporate information on medications. Moreover, the validation site is in the same geographical area as the derivation site and, though overall a more affluent population, bears similarities (elderly population on the South East coast of England), so the results may not be widely generalisable. Furthermore, relying on a baseline SCr that could have last been measured a number of months ago means that a proportion of the patients considered to have HA-AKI may well have already fulfilled the criteria for a change in SCr prior to admission, that is, have CA-AKI, which could also be true of those without a baseline SCr.
Prediction of AKI remains of key importance given its association with a high risk of mortality and, in survivors, high morbidity.16 Further external validation of prediction models, with updating where necessary, is desirable.22 Unfortunately, as AKI is a syndrome reflecting diverse underlying pathophysiological states (usually) imposed on a host with chronic disease, any model at a single time point is unlikely to predict with high discrimination. Improving models will require comprehensive records (medical and medications through primary care linkage) with fluctuating physiological trends (and potentially laboratory parameters), mirroring the traditional clinical approach to prognosis, with adherence to enablers of clinical uptake.63 ,64 Complex models may enable personalised risk stratification; however, manual inputting or use of a calculator without automation are barriers to bedside use. A balance between comprehensive risk factor inclusion, bedside usability and automated functionality must thus be struck. Drawbacks of using urine output and SCr, known to be unpredictably affected by diverse inputs, suggest that the employment of more refined biomarkers as predictors and markers of significant AKI is required.65 Finally, no AKI prediction models have undergone impact analysis to assess whether implementation can improve outcomes, and this is an area of interest. As no single intervention has been found to improve outcome in AKI, a model would most likely be used as part of a systematic alert to risk and initiation of enhanced monitoring in the appropriate clinical location, with avoidance of iatrogenic harm.
In an external validation study of general medical and surgical patients, the APS predicted HA-AKI with moderate discrimination and acceptable calibration. The prediction rule could help identify at admission those patients at higher risk of developing HA-AKI, in order to prevent its occurrence and avoid, or at least mitigate, associated significant complications.
Contributors All authors contributed to the paper in accordance with ICMJE recommendations and agree to be accountable for all aspects of the work. LEH contributed to study design, data collection, data analysis, writing up of the paper and final approval of the version submitted. BDD contributed to study design, data analysis, writing up of the paper and final approval of the version submitted. PJR contributed to study design, writing up of the paper and final approval of the version submitted. RV contributed to study design, data analysis, writing up of the paper and final approval of the version submitted. LGF contributed to study design, data analysis, writing up of the paper and final approval of the version submitted.
Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests All authors declare that the results presented in this paper have not been published previously in whole or part. All authors have completed the Unified Competing Interest form at http://www.icmje.org/coi_disclosure.pdf and declare that all have no relationships with companies that might have an interest in the submitted work in the previous 3 years; their spouses, partners, or children have no financial relationships that may be relevant to the submitted work; and have no non-financial interests that may be relevant to the submitted work.
All authors had full access to all of the data (including statistical reports and tables) in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. The lead author affirms that the manuscript is an honest, accurate, and transparent account of the study being reported and no important aspects of the study have been omitted.
Ethics approval Ethical approval was given by NHS Research Ethics Committee London—South East (REC reference 13/LO/0884).
Provenance and peer review Not commissioned; internally peer reviewed.
Data sharing statement No additional data are available.