Original research
Systematic review of diagnostic and prognostic models of chronic kidney disease in low-income and middle-income countries
  1. Diego J Aparcana-Granda (1,2)
  2. Edson J Ascencio (1,3,4)
  3. Rodrigo M Carrillo Larco (2,5)

  1. School of Medicine ‘Alberto Hurtado’, Universidad Peruana Cayetano Heredia, Lima, Peru
  2. CRONICAS Centre of Excellence in Chronic Diseases, Universidad Peruana Cayetano Heredia, Lima, Peru
  3. Health Innovation Laboratory, Institute of Tropical Medicine ‘Alexander von Humboldt’, Universidad Peruana Cayetano Heredia, Lima, Peru
  4. Emerge, Emerging Diseases and Climate Change Research Unit, School of Public Health and Administration, Universidad Peruana Cayetano Heredia, Lima, Peru
  5. Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK

Correspondence to Dr Rodrigo M Carrillo Larco; rcarrill{at}ic.ac.uk

Abstract

Objective To summarise available chronic kidney disease (CKD) diagnostic and prognostic models in low-income and middle-income countries (LMICs).

Method Systematic review (Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines). We searched Medline, EMBASE and Global Health (these three through OVID), Scopus, and Web of Science from inception to 9 April 2021, 17 April 2021 and 18 April 2021, respectively. We first screened titles and abstracts and then studied the selected reports in detail; two reviewers conducted both phases independently. We followed the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies recommendations and used the Prediction model Risk Of Bias ASsessment Tool for risk of bias assessment.

Results The search retrieved 14 845 results; 11 reports were selected for detailed study and 9 (n=61 134) were included in the qualitative analysis. The proportion of women in the study population varied between 24.5% and 76.6%, and the mean age ranged between 41.8 and 57.7 years. Prevalence of undiagnosed CKD ranged between 1.1% and 29.7%. Age, diabetes mellitus and sex were the most common predictors in the diagnostic and prognostic models. Outcome definition varied greatly, mostly consisting of urinary albumin-to-creatinine ratio and estimated glomerular filtration rate. The highest performance metric was the negative predictive value. All studies exhibited high risk of bias, and some had methodological limitations.

Conclusion There is no strong evidence to support the use of a CKD diagnostic or prognostic model throughout LMIC. The development, validation and implementation of risk scores must be a research and public health priority in LMIC to enhance CKD screening and improve timely diagnosis.

  • chronic renal failure
  • epidemiology
  • public health
  • nephrology

Data availability statement

Data sharing not applicable as no datasets generated and/or analysed for this study.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

Strengths and limitations of this study

  • An extensive search was conducted, involving five major databases (Medline, Embase, Global Health, Scopus and Web of Science).

  • A comprehensive list of available chronic kidney disease diagnostic and prognostic models and their limitations is provided; these had not previously been compiled for populations in low-income and middle-income countries.

  • This study adhered to Preferred Reporting Items for Systematic Reviews and Meta-Analyses, CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies and Prediction model Risk Of Bias ASsessment Tool guidelines.

  • Meta-analysis was not possible due to the heterogeneity in the measurement of outcomes.

  • Additional data sources such as grey literature were not retrieved.

Introduction

Chronic kidney disease (CKD) imposes a large burden globally. Between 1990 and 2017, the health metrics of CKD showed a bleak profile: mortality, incidence and kidney transplantation rates increased by 3%, 29% and 34%, respectively.1 CKD led to 1.2 million deaths in 2017 and, in the best-case scenario, CKD mortality will increase to 2.2 million deaths and become the fifth cause of years of life lost by 2040.2 CKD reveals disparities between low-income and middle-income countries (LMICs) and high-income countries (HICs). In the period 1990–2016, age-standardised disability-adjusted life-years due to CKD were highest in LMIC,3 where early diagnosis of CKD needs to be optimised.

Risk scores are a cost-effective alternative for CKD screening and early diagnosis.4 These equations require fewer resources and contribute to decision making,5 and allow screening of large populations.4 Many of the available CKD risk scores have been developed in HIC,6–8 and they should not be used in LMIC without recalibration to ensure accurate predictions. How many CKD risk scores there are for LMIC, and what their strengths and limitations are, remains largely unknown.9 10 This limits our knowledge of what tools there are to enhance CKD screening in LMIC. Similarly, this lack of evidence prevents planning research to overcome the limitations of available models. To fill these gaps and to inform CKD screening strategies in LMIC, we summarised available CKD diagnostic and prognostic models in LMIC.

Methods

Protocol and registration

This systematic review and critical appraisal of the scientific literature was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement11 (online supplemental table S1). The protocol is available elsewhere12 and in online supplemental text S1. We followed the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) guidelines.13 14

Information sources

We searched Medline, EMBASE and Global Health (these three through OVID), Scopus, and Web of Science from inception to 9 April 2021, 17 April 2021 and 18 April 2021, respectively. The search strategy is available in online supplemental table S2. We also screened the references of relevant systematic reviews10 and of the selected studies.

Eligibility criteria

We sought models which assessed the current CKD status (ie, diagnostic) or future CKD risk (ie, prognostic), aiming to inform physicians, researchers and the general population (table 1). Reports could include model derivation, external validation or both. The target population was adults (≥18 years) in LMIC according to The World Bank.15

Table 1

CHARMS criteria to define research question and strategy

Study selection

Reports were selected if the study population included people who were from and currently living in LMIC. Cross-sectional (diagnostic models) and longitudinal studies (prognostic models) with a random sample of the general population were included. The outcome was CKD based on a laboratory or imaging test (isolated or in combination with self-reported diagnosis): urine albumin-creatinine ratio, urine protein-creatinine ratio, albumin excretion ratio, urine sediment, kidney images, kidney biopsy or the estimated glomerular filtration rate (eGFR).12

Reports had to present the development and/or validation of a multivariable model. Reports of LMIC populations living outside LMIC, or those including foreigners living in LMIC, were excluded. Reports that only studied people with underlying conditions (eg, patients with diabetes), people with a specific risk factor (eg, alcohol consumption) or a hospital-based population were also excluded. We also excluded models that were developed using machine learning techniques because their performance metrics are usually poorly reported, as noted in previous reviews.16 17 To overcome this limitation, the CHARMS and Prediction model Risk Of Bias ASsessment Tool (PROBAST) tools are currently being adapted to machine learning methodology but are yet to be published.18

Data collation

We used EndNote 20 and Rayyan19 to remove duplicates from the search results. Two reviewers (DJA-G and EJA) independently screened titles and abstracts in Rayyan19; discrepancies were resolved by consensus. The same two reviewers independently studied the full text of the reports selected in the screening phase; discrepancies were resolved by consensus and, if consensus was not reached, a third party was consulted (RMCL). A data extraction form based on the CHARMS guidelines14 was developed and was not modified during data collation. Data were extracted as presented in the original reports by two reviewers independently (DJA-G and EJA); discrepancies were resolved by consensus.

Risk of bias of individual studies

We used PROBAST to assess the risk of bias of diagnostic and prognostic models.20 21 Two reviewers (EJA and DJA-G) independently ascertained the risk of bias of individual reports; discrepancies were resolved by consensus or by a third party (RMCL).

Synthesis of results

A qualitative synthesis was conducted whereby the characteristics of the selected models were comprehensively described.12 Quantitative analysis (meta-analysis) was not conducted because the selected models used different predictors and had different outcome definitions.

Patient and public involvement

No patients were involved.

Results

Reports selection

The search yielded 14 845 reports. After removing duplicates (1462 articles), we screened 13 383 titles and abstracts. Then, 11 reports were selected, 1 of them was not available as full text,22 and the rest (10 articles) were studied in detail. We excluded one report because the study population was not randomly selected,23 and another report because it was conducted in an HIC.24 Additionally, one report was identified by reference searching.25 Finally, nine reports (n=61 134) were included in the qualitative synthesis (figure 1).
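The selection counts above can be checked with simple arithmetic. The following is a minimal sketch that only restates the numbers reported in this paragraph; the variable names are illustrative and are not taken from the review.

```python
# Counts as reported in the report-selection paragraph.
records_retrieved = 14_845
duplicates_removed = 1_462
selected_after_screening = 11
full_text_unavailable = 1
excluded_after_full_text = 2  # one non-random sample, one conducted in an HIC
added_from_references = 1

# Each stage of the flow is derived from the previous one.
screened = records_retrieved - duplicates_removed
studied_in_detail = selected_after_screening - full_text_unavailable
included = studied_in_detail - excluded_after_full_text + added_from_references

print(f"Titles and abstracts screened: {screened}")
print(f"Reports studied in detail: {studied_in_detail}")
print(f"Reports included in the synthesis: {included}")
```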

Figure 1

PRISMA 2020 flow diagram. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

General characteristics of the selected reports

Original reports were from Iran,26 India,27 Peru,28 South Africa,25 two from China29 30 and three from Thailand31–33 (online supplemental figure S1). All studies were developed on community-based populations with random sampling (online supplemental table S3).

Overall, Wu et al studied the largest sample (n=14 374), a population of workers who underwent health checks30; conversely, the smallest sample was studied by Mogueo et al (n=902).25 The oldest data were collected in 1999,26 whereas the most recent study was published in 2018.26

The sample size analysed to derive the diagnostic models ranged from 2368 people28 to 14 374 people,30 and from 902 people25 to 4940 people27 for the validation models. The mean age of participants in the derivation models varied from 44.9 to 57.7 years, and the proportion of male subjects ranged from 46.8% to 70.5%.27–30 32 33 The mean age of participants in the validation models varied from 41.8 to 57.1 years, and the proportion of male subjects ranged from 23.4% to 75.5%25–28 30–32 (table 2; online supplemental table S3).

Table 2

General characteristics

The number of CKD cases varied greatly in the derivation models, from 81 cases28 to 947 cases27; the corresponding numbers in the validation models were 27 cases32 and 1359 cases.26 Of note, the number of CKD cases could not be extracted from the validation work by Bradshaw et al.27 The ratio of outcome events per number of candidate predictors in the derivation models ranged from 2.3 in one report28 to 135.3 in another.27 This ratio could not be calculated for the derivation models by Wen et al29 and Wu et al.30 Where reported, missing data were handled by conducting a complete-case analysis25–32; this information was not available in the study by Thakkinstian et al33 (table 2; online supplemental table S3).
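The ratio of outcome events per candidate predictor mentioned above is a plain division, sketched below. The function name is illustrative. The event count of 947 is taken from this paragraph, but the number of candidate predictors (7) is an assumption chosen only because it is consistent with the reported ratio; it is not stated in the text.

```python
def events_per_variable(n_events: int, n_candidate_predictors: int) -> float:
    """Return the number of outcome events per candidate predictor."""
    if n_candidate_predictors <= 0:
        raise ValueError("n_candidate_predictors must be positive")
    return n_events / n_candidate_predictors


# 947 events is reported above; 7 candidate predictors is an assumed value.
epv = events_per_variable(n_events=947, n_candidate_predictors=7)
print(f"Events per candidate predictor: {epv:.1f}")
```

A low ratio means that few outcome events support each candidate predictor, which is one reason a small number of events is treated as a methodological limitation in model derivation.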

What has been done?

In 2011, Thakkinstian et al derived one model using cross-sectional data.33 In 2015, Mogueo et al used cross-sectional data to validate two models that were previously developed in South Korea and Thailand using two different outcome definitions for each model, that is, they provided estimates for four model validations.25 In 2016, Wu et al used cross-sectional data to derive and validate one model, that is, they provided estimates for two models (one derivation and one validation).30 In 2017, Carrillo-Larco et al used cross-sectional data to derive and validate two models, that is, they provided estimates for four models (two derivations and two validations).28 Saranburut et al prospectively validated the Framingham Heart Study risk score on a cohort using two different outcome definitions, that is, they provided estimates for two model validations.31 Saranburut et al prospectively developed four models and validated two of them using cohort data, that is, they provided estimates for six models (four derivations and two validations).32 In 2019, Bradshaw et al used cross-sectional data to derive four models, one of which was validated on two populations (rural and urban), that is, they provided estimates for six models (four derivations and two validations).27 In 2020, Asgari et al prospectively validated a model from the Netherlands for 6-year and 9-year CKD prediction, that is, they provided estimates for two model validations.26 Wen et al prospectively derived two models.29 Overall, 14 models were derived and 15 underwent validation (hence the 29 rows in table 4).
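The totals in the paragraph above can be reproduced by tallying derivations and validations per report. This is a minimal sketch that only restates the counts given in the text; the dictionary layout and labels are illustrative.

```python
# (derivations, validations) per report, as described in the paragraph above.
models_by_report = {
    "Thakkinstian et al (ref 33)": (1, 0),
    "Mogueo et al (ref 25)": (0, 4),
    "Wu et al (ref 30)": (1, 1),
    "Carrillo-Larco et al (ref 28)": (2, 2),
    "Saranburut et al (ref 31)": (0, 2),
    "Saranburut et al (ref 32)": (4, 2),
    "Bradshaw et al (ref 27)": (4, 2),
    "Asgari et al (ref 26)": (0, 2),
    "Wen et al (ref 29)": (2, 0),
}

derived = sum(d for d, _ in models_by_report.values())
validated = sum(v for _, v in models_by_report.values())

print(f"Derived: {derived}, validated: {validated}, total: {derived + validated}")
```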

Outcome ascertainment

Across all reports,25–33 CKD was defined as eGFR <60 mL/min/1.73 m², assessed by either the Modification of Diet in Renal Disease (MDRD) formula25 26 28 29 31 33 or the CKD Epidemiology Collaboration (CKD-EPI) formula.27 30–32 In addition to the eGFR assessment, Bradshaw et al27 and Wen et al29 defined CKD as a urinary albumin-to-creatinine ratio (UACR) ≥30 mg/g. The validations by Mogueo et al also considered CKD as any nephropathy, including stages I–V of the ‘Kidney Disease: Improving Global Outcomes’ classification.25 Thakkinstian et al also classified people with eGFR ≥60 mL/min/1.73 m² as having CKD if they had haematuria or UACR ≥30 mg/g33 (table 2).
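To illustrate why the choice of eGFR equation matters for the outcome definition, the sketch below compares the two formulas for one hypothetical person. It assumes the commonly published coefficients of the four-variable IDMS-traceable MDRD equation and the 2009 CKD-EPI creatinine equation, with serum creatinine in mg/dL and without race coefficients; the reviewed reports do not state which versions they used, so these choices and all function names are assumptions.

```python
def egfr_mdrd(scr_mg_dl: float, age: float, female: bool) -> float:
    """Four-variable MDRD equation (assumed IDMS-traceable form, no race term)."""
    egfr = 175.0 * scr_mg_dl ** -1.154 * age ** -0.203
    return egfr * 0.742 if female else egfr


def egfr_ckd_epi_2009(scr_mg_dl: float, age: float, female: bool) -> float:
    """CKD-EPI creatinine equation (assumed 2009 form, no race term)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0 * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209 * 0.993 ** age)
    return egfr * 1.018 if female else egfr


def has_reduced_egfr(egfr: float) -> bool:
    """Apply the eGFR <60 mL/min/1.73 m2 threshold used across the reports."""
    return egfr < 60.0


# Hypothetical person: woman aged 45 with serum creatinine 1.05 mg/dL.
mdrd = egfr_mdrd(1.05, 45, female=True)
ckd_epi = egfr_ckd_epi_2009(1.05, 45, female=True)
print(f"MDRD: {mdrd:.1f}, reduced eGFR: {has_reduced_egfr(mdrd)}")
print(f"CKD-EPI: {ckd_epi:.1f}, reduced eGFR: {has_reduced_egfr(ckd_epi)}")
```

Under these assumed coefficients, the two equations place this hypothetical person on opposite sides of the 60 mL/min/1.73 m² threshold, so the same individual could count as a case in one study and a non-case in another.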

Predictors and modelling

Logistic regression analysis was conducted in all derivation models.27–30 32 33 Selection of the final predictors was based on modelling techniques: backward27 28 and forward selection29 30 32 33 (online supplemental table S3). All studies categorised numerical variables. The most frequent predictors included in the models were: age, diabetes mellitus and sex (online supplemental figure S2).

Model performance

All studies reported calibration and discrimination metrics, except for the validations by Bradshaw et al27 and Carrillo-Larco et al28 (online supplemental table S3). Regarding discrimination metrics, the area under the receiver operating characteristic curve and C-statistic were over 63%31 and 70%,27 respectively. Among all studies, sensitivity ranged from 56.8%29 to 84.0%,25 specificity ranged from 65.1%29 to 86.3%,30 positive predictive value (PPV) ranged from 8.8%28 to 33.8%,29 and negative predictive value (NPV) ranged from 89.4%29 to 99.1%.28 The NPV was the best metric, consistently above 89.4% (table 3).
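The pattern of low PPV and high NPV is expected when the outcome is uncommon, because both values depend on prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' theorem to hypothetical inputs: a sensitivity of 80% and a specificity of 70% are assumed values that fall within the ranges reported above, and the prevalences are illustrative, with 1.1% and 29.7% being the extremes reported in the abstract.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Assumed sensitivity and specificity; prevalences chosen for illustration.
for prevalence in (0.011, 0.05, 0.297):
    ppv, npv = predictive_values(0.80, 0.70, prevalence)
    print(f"prevalence={prevalence:.1%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

With the same sensitivity and specificity, the NPV falls and the PPV rises as prevalence increases, so a high NPV on its own says little about how well a score separates cases from non-cases.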

Table 3

Performance metrics

Risk of bias

All studies showed a high risk of bias due to insufficient or inadequate analytical reporting. The high risk of bias in the analysis domain stems from how the original reports handled missing data and categorised predictors. The participants and predictors domains had low risk of bias in most of the reports. Most of the individual reports evaluated performance metrics inappropriately.26 28–33 Low applicability concern was noted (table 4; online supplemental table S4).

Table 4

Risk of bias (RoB) assessment of individual diagnostic/prediction models

Discussion

Main findings

This systematic review summarised all available risk scores for CKD in LMIC. In so doing, we provided the most comprehensive list of CKD risk scores to enhance primary prevention and early diagnosis of CKD in LMIC. Although the available models had acceptable discrimination metrics and, when available, acceptable calibration metrics, these models had serious methodological limitations such as a reduced number of outcome events. The best performance metric across risk scores was the NPV. Overall, CKD risk prediction tools in LMIC need rigorous development and validation so that they can be incorporated into clinical practice and interventions. The available evidence would not support using any of the available CKD risk scores across LMIC.

Limitations of the review

We did not search grey literature. We argue that this limitation would not substantially change our results because these sources are most likely not to have included a random sample of the general population and are likely to have included a small sample size with few outcome events. That is, we would not expect to find a report in the grey literature with a much better methodology than that of the studies herein summarised.

Limitations of the selected reports

Several LMIC do not have a CKD risk score, particularly countries in Central America and Oceania. This should encourage public health officers and researchers to develop CKD prediction models. They could conduct new epidemiological studies or leverage available health surveys with kidney biomarkers. These models could have pragmatic and direct applications in clinical medicine, by providing a tool for early identification of CKD cases. Similarly, these models could inform public health interventions and planning, by providing a tool to quantify the size of the population likely to have or to develop CKD.

Clinical guidelines define CKD as sustained structural or functional kidney damage for ≥3 months.34 In the studies herein summarised, CKD was defined at one point in time. Future work could expand the definition of CKD to also incorporate the period during which the patient had kidney damage. In addition, different procedures were used to define CKD, including eGFR, proteinuria and UACR. Even the studies that defined CKD with eGFR used different equations to compute it. Researchers and practitioners in LMIC could agree on the best, most pragmatic and most cost-effective definition of CKD, so that future models could use this definition. This would improve the comparability and generalisability of the models.

All reports in which a new CKD risk score was developed selected the predictors through univariate analyses,27–30 32 33 which is not the best approach to choose predictors.35–37 Ideally, predictors should be selected based on expert knowledge, or among those with the strongest evidence of association with CKD. In a similar vein, predictor selection should be guided by the target population. For example, CKD prediction models for populations in LMIC should prioritise simple biomarkers or inexpensive clinical evaluations (eg, blood pressure). In this way, the risk score is more likely to be used in clinical practice in resource-limited settings. Another relevant methodological limitation was how the original reports handled missing data. To the extent possible, multiple imputation should be implemented to maximise available data and to avoid potential bias from studying only observations with complete information.

Calibration assesses the degree of agreement between actual outcomes and model predictions, whereas discrimination is the ability of the model to differentiate people with and without the outcome. Calibration metrics need to be consistently reported and should indicate the direction of any miscalibration. Most of the studies used the Hosmer-Lemeshow χ2 test as the calibration metric. Unfortunately, this test does not show whether the model is overestimating or underestimating the observed risk; calibration plots are a useful alternative. Therefore, it was not always possible to reach strong conclusions about the performance of the available models. Prognostic models should be updated before they can be applied in a new target population. This process is known as recalibration. Because we found only a handful of prognostic models, from a few countries, it is debatable whether these can be successfully used in other populations. Available prognostic models for CKD would need to be recalibrated and independently validated in new target populations.
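The distinction between discrimination and calibration can be made concrete with a small example. The sketch below computes a C-statistic by pairwise comparison and an observed-to-expected (O/E) ratio as a simple summary of calibration that, unlike a significance test, shows the direction of miscalibration. The data are invented for illustration and the function names are hypothetical; none of this reproduces an analysis from the reviewed reports.

```python
def c_statistic(outcomes: list[int], predictions: list[float]) -> float:
    """Share of case/non-case pairs in which the case has the higher
    predicted risk; ties count as half."""
    cases = [p for y, p in zip(outcomes, predictions) if y == 1]
    non_cases = [p for y, p in zip(outcomes, predictions) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_non_case in non_cases:
            if p_case > p_non_case:
                concordant += 1.0
            elif p_case == p_non_case:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def observed_expected_ratio(outcomes: list[int],
                            predictions: list[float]) -> float:
    """Observed events divided by the sum of predicted risks.
    Below 1: risk is overestimated; above 1: risk is underestimated."""
    return sum(outcomes) / sum(predictions)


# Invented data: two models rank people identically but differ in scale.
outcomes = [0, 0, 1, 0, 1, 1]
model_a = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]
model_b = [0.2, 0.5, 0.6, 0.7, 0.9, 1.0]  # same ranking, inflated risks

for name, predictions in (("A", model_a), ("B", model_b)):
    c = c_statistic(outcomes, predictions)
    oe = observed_expected_ratio(outcomes, predictions)
    print(f"Model {name}: C-statistic={c:.3f}, O/E={oe:.3f}")
```

In this toy example the two models discriminate equally well, yet model B overestimates risk. This is the situation in which recalibration, rather than a new model, may be enough for a new target population.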

Clinical and public health relevance

The Latin American Society of Nephrology and Hypertension (Sociedad Latinoamericana de Nefrología e Hipertensión) recommends annual screening for CKD with several markers: blood pressure, serum creatinine, proteinuria and urinalysis.38 The South African Renal Society guidelines also recommend annual CKD screening, yet they focus on high-risk populations: people with diabetes, hypertension or HIV.39 This recommendation is endorsed by the Asian Forum for Chronic Kidney Disease Initiatives, which extends it to individuals ≥65 years, people consuming nephrotoxic substances, and those with a family history of CKD and a past history of acute kidney injury.40 Although it seems reasonable to screen people with risk factors such as hypertension and diabetes, this approach may miss a large proportion of the high-risk population because they could be unaware of their condition.41 42 In this case, risk scores could be useful because they can be applied to large populations regardless of whether people are aware of their hypertension or diabetes status. Unfortunately, our work neither supports nor encourages the inclusion of available risk scores for CKD in clinical guidelines in LMIC. Instead, our results urgently call for better risk prediction research in LMIC. With such improvements, CKD risk scores could be incorporated into clinical practice to identify high-risk individuals and to inform the patient’s management plan, as is the case in other fields such as cardiovascular primary prevention.

Conclusions

This systematic review of diagnostic and prognostic models of CKD did not find conclusive evidence to recommend the use of a single CKD score across LMIC. Nonetheless, we identified relevant efforts in Iran, India, Peru, South Africa, China and Thailand; these models would require further external validation before they can be applied in other LMIC. We encourage researchers and practitioners to develop and validate CKD risk scores, which are cost-efficient tools for the early identification of prevalent and incident CKD cases so that they can receive timely treatment.

Ethics statements

Patient consent for publication

Ethics approval

This review was deemed low risk because human subjects were not directly involved.

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors RMCL, DJA-G and EJA conceived the idea. RMCL, DJA-G and EJA conducted the search. DJA-G and EJA wrote the manuscript. All authors approved the submitted version and are responsible for its content. All authors act as guarantors.

  • Funding RMCL is supported by a Wellcome Trust International Training Fellowship (214185/Z/18/Z).

  • Map disclaimer The inclusion of any map (including the depiction of any boundaries therein), or of any geographic or locational reference, does not imply the expression of any opinion whatsoever on the part of BMJ concerning the legal status of any country, territory, jurisdiction or area or of its authorities. Any such expression remains solely that of the relevant source and is not endorsed by BMJ. Maps are provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.