Objectives This study is aimed to develop and validate a prediction model for multistate transitions across different stages of chronic kidney disease (CKD) in patients with type 2 diabetes mellitus under primary care.
Setting We retrieved the anonymised electronic health records of a population-based retrospective cohort in Hong Kong.
Participants A total of 26 197 patients were included in the analysis.
Primary and secondary outcome measures The new-onset, progression and regression of CKD were defined by the transitions of four stages that were classified by combining glomerular filtration rate and urine albumin-to-creatinine ratio. We applied a multiscale multistate Poisson regression model to estimate the rates of the stage transitions by integrating the baseline demographic characteristics, routine laboratory test results and clinical data from electronic health records.
Results During the mean follow-up time of 1.8 years, there were 2632 patients newly diagnosed with CKD, 1746 progressed to the next stage and 1971 regressed into an earlier stage. The models achieved the best performance in predicting the new-onset and progression with the predictors of sex, age, body mass index, systolic blood pressure, diastolic blood pressure, serum creatinine, haemoglobin A1c, total cholesterol, low-density lipoprotein, high-density lipoprotein, triglycerides and drug prescriptions.
Conclusions This study demonstrated that individual risks of new-onset and progression of CKD can be predicted from the routine physical and laboratory test results. The individualised prediction curves developed from this study could potentially be applied to routine clinical practices, to facilitate clinical decision making, risk communications with patients and early interventions.
- diabetic nephropathy & vascular disease
- primary care
- statistics & research methods
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Statistics from Altmetric.com
Strengths and limitations of this study
Early predictions for chronic kidney disease (CKD) progression and timely intervention are critical for clinical management of patients with diabetes.
We successfully developed a multiscale multistate Poisson regression model that achieved satisfactory performance in predicting the new-onset and progression of CKDs.
The model incorporates the predictors of demographic characteristics, routine laboratory test results and clinical data from electronic health records.
The individualised prediction curves could potentially be applied to facilitate clinical decision making, risk communications with patients and early interventions of CKD progression.
The cohort has a relatively short follow-up period and the retrospective study design might suffer from report bias and selection bias.
Globally, in 2017, there were 425 million adults with diabetes according to the International Diabetes Federation.1 In Hong Kong, it was estimated that 10.9% of adults aged 20–79 years had diabetes.1 Diabetes is associated with a heavy disease burden and tremendous economic costs.2 A local study reported that the annual direct medical costs of patients with diabetes with newly diagnosed complications were much higher than the cost of US$1984 of uncomplicated cases.3 Chronic kidney disease (CKD) is one of the most common complications of type 2 diabetes. Globally, it was estimated that diabetes attributed to 12%–55% of end-stage renal disease (ESRD).1 In Hong Kong, a prospective cohort of the Hong Kong Diabetes Registry reported that 10% of patients with type 2 diabetes developed into CKD, which was defined as an estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m2, after 7 years of follow-up.4
Early diagnosis and treatment of CKD in patients with type 2 diabetes are critical for reducing its associated heavy disease burden. Therefore, developing an effective and efficient prediction model for CKD progression is important in terms of prioritising healthcare resources to high-risk populations. The Kidney Disease Improving Global Outcomes (KDIGO) 2012 Guideline lists several predictors for CKD progression based on previous epidemiological studies, including age, sex, ethnicity, obesity, CKD aetiology, GFR, albuminuria, blood pressure, glycaemic control, dyslipidaemia and complications of cardiovascular diseases (CVDs).5 Specifically for people with type 2 diabetes, several prediction models have been reported in the literature, but most were Cox proportional hazards regression models on all-cause mortality and incidence of CVD.6–9 Previous studies have investigated the risk factors that significantly predicted ESRD in Chinese populations10 11, but none focused on CKD progression at earlier stages nor on CKD regression.
The Cox proportional hazards model is one of the most widely adopted classical modelling approaches to predict diabetes complications. However, it only allows two states (no event and event) and linear predictors, hence, it is unable to assess the progression probability across different disease stages. Some studies adopted more complex Cox models to address these limitations, such as the multistate Markov model and the Cox model with time-dependent covariates.12 13 But these models include only a single time scale, which usually does not allow for the simultaneous estimation on the forward and backward transitions of multiple disease stages (ie, progression and regression).
A multiscale multistate Poisson regression (MSMS) model has recently been applied to estimate the risks of transition between different stages of disease progression or the incidences of multiple intermediate states and endpoints such as different hospitalisation episodes.14 This model has the advantage of allowing multiple events in the follow-up period and thereby is capable of investigating dynamic transition across different stages of disease progression and regression. In other words, forward time scales (eg, time since enrolment into the cohort) and backward time scales (time since entry to the intermediate state) are simultaneously entered into the model as covariates. Another advantage of the MSMS model lies in its flexibility of allowing non-linear dynamic transitions by incorporating spline smooth functions in each time scale with full parametric estimation. Here we conducted a population-based retrospective cohort study, with the aim to develop and validate a prediction model for the new-onset, progression and regression of CKD in patients with type 2 diabetes under primary care in Hong Kong.
We obtained individual data of adults who were aged over 18 years at enrolment into the Risk Assessment and Management Programme for Patients with Diabetes Mellitus (RAMP-DM) from July 2014 to June 2017, in the general and specialty outpatient clinics managed by the New Territory West Cluster of the Hospital Authority in Hong Kong Special Administrative Region, China. This cluster served a population of 1.1 million in 2017, and more details about the data source can be found in Yang et al.15 The RAMP-DM Programme aims to structurally screen for diabetic complications among patients diagnosed with diabetes type 1 or type 2 under primary care. All patients are eligible to join this programme without extra costs and on a voluntary basis. Doctors and nurses from different hospitals and clinics were regularly trained and followed the same protocol of RAMP-DM. In Hong Kong, more than 90% of people with diabetes have been enrolled in the RAMP-DM since 2014.16 A local study also demonstrated that the patients who were not enrolled in the RAMP-DM were not significantly different from those who enrolled, in terms of sociodemographic and clinical characteristics.16
We also retrieved the longitudinal data of physical examinations and laboratory test results during scheduled clinical visits, including incidence of diabetic complications, blood pressure, eGFR (calculated from urine creatinine), urine albumin-to-creatinine ratio (UACR), total cholesterol, serum levels of creatinine, high-density lipoprotein (HDL), low-density lipoprotein (LDL), triglycerides and serum haemoglobin A1c (HbA1c), together with annual prescriptions of angiotensin-converting-enzyme inhibitors (ACEI) and angiotensin II receptor blockers (ARB), from the Clinical Data Analysis and Reporting System from 1 January 2014 to 31 December 2017, by matching their unique patient reference numbers. All adults aged above 18 years with clinical diagnosed type 2 diabetes with or without complications, according to the International Classification of Disease Ninth Revision codes (250.00, 250.02, 250.10, 250.12, 250.20, 250.22, 250.30, 250.32, 250.40, 250.42, 250.50, 250.52, 250.60, 250.62, 250.70, 250.72, 250.80, 250.82, 250.90, 250.92), were included in the analysis. The participants who already had ESRD at enrolment, without HbA1c tests taken during the follow-up, and/or having incomplete baseline data were excluded from data analysis.
Patient and public involvement statement
Patients were not involved in the recruitment to and conduct of the study, since all data were retrospectively retrieved from electronic medical records.
Definition of CKD new-onset, progression and regression
According to the KDIGO 2012 Clinical Practice Guideline for Evaluation and Management of Chronic Kidney Diseases,5 CKD can be classified into four risk stages for CKD outcomes based on eGFR and UACR levels: stage 0, low risk; stage 1, moderately increased risk; stage 2, high risk; and stage 3, very high risk. Here we defined three categories of stage transitions as the event: (1) new-onset of CKD, a transition from stage 0 to 1; (2) CKD progression, a forward transition to the next advanced stage (ie, stage 1 to 2 and 2 to 3); and (3) CKD regression, a backward transition to the next stage (ie, stage 1 to 0, 2 to 1 and 3 to 2). Cases with the transition over two stages with the intermediate stage recorded were counted as two separate events, and those without were excluded from the study. For example, two events were included in the model if the subject was classified into stage 0 at the beginning and progressed into stage 1 and then stage 2 in the follow-up period. If a subject jumped from stage 0 to 2 without a record of stage 1, then his/her data were not included in the model.
The eGFR and UACR test results taken on the same date of CKD outcome events (new-onset, progression and regression) were matched by the unique patient numbers. If these tests were conducted on different dates, then those taken closest to the event dates and within 180 days were matched to the CKD event dates with the unique patient numbers. Baseline sociodemographic data were collected at the initial visits to the clinics, including age, sex, education level, body mass index (BMI) and the presence of diabetic complications. Longitudinal data of blood pressure, lipid profile, serum creatinine, eGFR, UACR and HbA1c levels tested during the clinical visits, together with any ACEI/ARB prescriptions within the year (yes or no), were also matched for each participant by the dates of follow-up period between recruitment dates and event dates, or between recruitment dates and censoring dates if no events occurred by the end of the follow-up period.
The technical details of MSMS models can be found in Iacobelli and Carstensen,14 Carstensen and Plummer.17 We used calendar time as the basic time scale, and the risk of transition was calculated for the period from the enrolment time to the laboratory test dates with stage change. The incidence rate of the new-onset of CKD (or progression and regression) was estimated from the following formula:
where t is the time since enrolment, E[μ(t)|α] is the expected crude incidence rate of CKD new-onset (or progression and regression) from the diagnosis of the previous stage to time t, which is assumed to follow a Poisson distribution; τ is the time when last laboratory tests are conducted for CKD (t > τ≥0); α is the most recent HbA1c levels prior to the progression events; xi s are the covariates of age, sex, BMI, systolic blood pressure (SBP), diastolic blood pressure (DBP), serum creatinine, total cholesterol, HDL, LDL, triglycerides that were tested at enrolment, and annual drug prescriptions of ACEI and ARB; μ0 ( ) is the natural splines function of the period ( ) during which the transition occurs; h0 (α) denotes the natural splines of HbA1c levels (α). Here we assume non-linear effects of time and HbA1c level, because the incidence rate of CKD new-onset (or progression and regression) could probably increase over time in a non-linear pattern. Unlike the model proposed by Iacobelli and Carstensen,14 we only included one time scale in this model to simplify the model structures because there were very limited numbers of participants who experienced both the regression and progression events in the follow-up period.
We first randomly extracted 90% of the cohort data as a training dataset to fit the MSMS model by ninefold cross-validation.18 The remaining 10% of data were used as the test dataset to evaluate the prediction accuracy of the best-fit model for internal validation. For each outcome, we built four different models: model 1 included sex, age, BMI and HbA1c as predictors; model 2, blood pressure and total cholesterol were added to model 1; model 3, serum creatinine, HDL, LDL and triglycerides were added to model 2; model 4, annual prescription of ACEI and ARB were added to model 3. These models with different combinations of predictors were compared using the area under the receiver operating characteristic curve (AUC) and mean absolute error.19 20 The Nagelkerke’s scaled R 2 was adopted to measure the goodness-of-fit of model.21
Individual incidence rates of CKD new-onset, progression and regression over time since enrolment could be predicted from the final MSMS model by inputting the observed data of each subject. It is of note that the time between laboratory tests and clinical visits was not consistent, as they were determined by clinical indications and hospital manpower. Hence, to adjust for irregularly scheduled clinical visits and tests, we calculated an adjusted incidence rate by adding weights of the probability density function of clinical visits and test frequency, with the following equation , where ν(t) is the adjusted CKD incidence rate; g(t|κ) denotes the Gaussian kernel density function of clinical visits and tests. An example of this adjustment is shown in the online supplementary appendix 1.
All the data analyses were conducted in R (V.3.4.1) and the codes can be found online (https://github.com/yanglinpolyu/DMforecastmodel).
A flowchart for data collection and analysis is shown in figure 1. There were 40 781 people with diabetes who were enrolled in the RAMP-DM from July 2014 to June 2017. Of them, 39 652 people with type 2 diabetes were considered eligible for this study. We excluded 8078 patients with incomplete eGFR and UACR data, 5002 with only one record of eGFR and UACR and 375 with incomplete HbA1c record. In the final sample of 26 197 patients, 48.8% were men, the mean age at enrolment was 61.5 years and 70.8% did not have CKD (stage 0). At the end of the follow-up period, there were 2632 cases of CKD new-onset, 1746 progression and 1971 regression (table 1). Compared with the patients without change, those with the new-onset or progression or regression of CKD were older, more likely to be female individuals, with lower education level, higher BMI and having more complications (table 2). The mean (range) of the follow-up period was 1.83 (0.07–4.04), 1.77 (0.13–3.84), 1.43 (0.08–3.72) and 1.89 (0.07–4.04) years, for all patients, patients with CKD new-onset/progression, with regression and without change, respectively.
Model performance and predictors
The MSMS model achieved better performance in predicting progression than in predicting the regression of chronic kidney disease (AUC 0.72–0.84 vs 0.50–0.71) (online supplementary appendix 2). The significant predictors for CKD new-onset included female individuals, older age, having ACEI and ARB prescriptions, high levels of BMI, SBP, total cholesterol and serum creatinine, low levels of DBP, HDL and LDL (online supplementary appendix 3). Fewer significant predictors were found for those who progressed from stage 1 to 2 and from 2 to 3, and the effect estimates of significant predictors were similar to those of new-onset. The only exception is that lower BMI was associated with a higher risk of progression from stage 2 to 3.
Rates of new-onset and progression
To facilitate the comparison between groups, we plotted the prediction curves of adjusted rates per 1000 person-year by assuming the mean values of each predictor in the MSMS model: BMI=25.7, SBP=130 mm Hg, DBP=75 mm Hg, total cholesterol=4.0 mmol/L, creatinine=75 mmol/L, HDL=1.20 mmol/L, LDL=2.10 mmol/L and triglycerides=1.21 mmol/L. The mean age was 56.4 years for female individuals aged <65 years, 73.7 years for female individuals aged 65+ years, 56.1 years for male individuals aged <65 years and 72.4 years for male individuals aged 65+ years.
The temporal trend of adjusted rates of CKD new-onset and progression from the MSMS model across different levels of HbA1c are shown in figures 2 and 3, for women and men, respectively. The CKD progression rate dramatically increased in women 2 years after diagnosis, but this change was less evident in men. An exponential increasing trend of incidence rates over time was observed for all three types of CKD progression and gradually elevated when the HbA1c level increased from 6% to 9%. The new-onset rate was high within the first year of enrolment (diagnosed as stage 0), particularly among older adults aged 65 years or over. The progression rate from stage 2 to 3 dramatically increased 2 years after the diagnosis of stage 2.
Rates of regression
The significant predictors for regression from stage 1 to 0 include male individuals, younger age, lower levels of SBP and creatinine, without ACEI or ARB prescriptions (online supplementary appendix 3). The stage 2–1 and 3–2 regression has similar predictors, including male individuals, older age, lower levels of SBP and creatinine. Higher levels of DBP and without ACEI are only significant for the stage 3–2 regression.
The regression from earlier stages shows a less evident temporal pattern in women and men (online supplementary appendix 4 and 5). The adjusted incidence rates were high during the first half year and then quickly declined to very low levels after 1 year. The regression rates were higher in the group from stage 2 to 1, followed by 1 to 0 and stage 3 to 2 in both sex groups. The regression rates of older people tend to be slightly higher than the younger group.
Early predictions of complication incidence and disease progression are of great concern in the clinical management of patients with type 2 diabetes. The MSMS model has recently been introduced to evaluate the risk of chronic diseases.22–24 Its advantages over the classical proportional hazards (PH) model have been discussed in the literature.14 17 25 This novel modelling approach allows the simultaneous prediction of multistate transitions, either progression or regression and thereby is capable of assessing both linear and non-linear effects of predictors. Previous studies have demonstrated that the MSMS model outperformed the PH model in both model structure and fitting performance.14 26–28 Our study is among the first to apply this model to predict the risks of CKD progression and regression in a large sample of the Chinese population with type 2 diabetes under primary care. The advantage of the MSMS model also lies in its ability to incorporate non-linear effects. We observed rapidly increasing progression rates between different stages among female patients, but to a lesser extent among male patients (figures 2 and 3). Particularly, a turning point of the progression rates was found at 1 year postdiagnosis of the previous stage, whereas the regression rates peaked at half a year. This suggests that 1 year is a critical control window to reverse the deterioration of nephropathy among patients with type 2 diabetes. Another critical time point is 2 years postdiagnosis, beyond which the regression rates for both genders dramatically decreased, suggesting an irreversible trend of kidney function deterioration.
Based on the data availability and association with disease progression, we considered a series of predictors including demographic characteristics, clinical presentations, drug prescriptions and laboratory tests routinely conducted in these patients. Interestingly, a larger number of significant predictors were successfully identified by the MSMS model for new-onset than for progression. Nearly all these predictors significantly predicted the CKD new-onset, whereas for progression, only age, gender, BMI, SBP and serum creatinine remained significant predictors. These findings highlight the importance of early interventions for patients with type 2 diabetes when their renal functions maintain within the normal range and many risk factors are modifiable.
A few prediction models for progression of CKD have been developed in the literature13 29 30 and some were specifically for patients with diabetes.31–33 This study is among the first to define CKD progression (or regression) as a forward (or backward) transition between stages of proteinuria and deterioration of eGFR. In this study, we found that female individuals, older age, ACEI and ARB prescriptions, higher BMI, SBP, total cholesterol and creatinine, lower DBP, HDL and LDL were associated with a higher risk of CKD new-onset and progression. However, Tangri et al reported that younger age and male individuals had a faster progression to renal failure in their PH model for patients with CKD of various aetiologies.34 This discrepancy might be due to the different outcome variables and study populations, since our cohort included only patients with type 2 diabetes and the outcomes were a progression from earlier stages. The predictors of our model are similar to an early prediction model that was developed using the PH model for CVDs of patients with type 2 diabetes in Hong Kong.35 This CVD model found that older age, longer duration of type 2 diabetes and higher HbA1c predicted a higher risk of CVD incidence.
This study is among the first to investigate the predictors of CKD regression in patients with type 2 diabetes under primary care, to our best knowledge. We found that younger age had a faster regression from stage 1 to 0, but slower from stage 2 to 1 or 3 to 2. This suggests that, once younger adults enter the advanced stages, it is less likely to reverse the progression.
There is ample evidence in the literature to demonstrate that both ARB and ACEI could slow down the progression of CKD. However, we found that patients with type 2 diabetes who were prescribed with ARB and ACEI had a dramatically faster progression rate of CKD and a lower regression rate. In our data, these drugs were significant only in the transition of stages 0 to 2 and only ACEI prescription predicts a lower rate of regression from 3 to 2. We speculate that patients who are prescribed with ACEI and ARB more likely had a high-risk profile than did those without these prescriptions. Future research would be useful to investigate whether the association found in our study was truly due to the drugs themselves or confounding.
Previous studies including a few from Hong Kong have reported that men with type 2 diabetes had a higher risk of CKD incidence than women.4 On the contrary, we found that women had a faster rate of CKD progression and men had a higher rate of regression in our cohort. Interestingly, we also found that female patients with different drug usage showed distinct temporal patterns of CKD progression, whereas fewer discrepancies were observed in the male groups. This gender heterogeneity has been well recognised in the development of diabetic complications, but not in drug efficacy. Further studies are warranted to explore this important gender heterogeneity.
The results of this study have important clinical implications. Few previous studies have considered the prediction model for progression of early stages, however, a timely intervention on these early stages in the primary care level could have a significant impact to prevent or slow down the further progression to the end-stage renal failure. This would potentially reduce the disease burden and economic burden on the individual levels as well as on the secondary care of the healthcare system. Our prediction models were developed from the routine clinical data of electronic health records and thereby it could be automatically integrated into the existing clinical information system and easily adopted by clinicians to visualise the CKD progression risks in primary care settings. The graphic presentation of individualised CKD risk prediction could be a useful tool to facilitate risk communication and health education for disease management of diabetes.
Interestingly, we found a small peak of CKD progression risks within half to 1 year since enrolment, even after we adjusted for check-up frequency (figures 2 and 3). It is of note that these participants had different lengths of years with type 2 diabetes and some already had developed complications when first enrolled into the RAMP-DM Programme. We speculate that these patients progressed faster than others, but it was actually due to their delayed laboratory tests and clinical visits. Furthermore, although we adjusted the intervals of follow-up checks in the model by calculating adjusted incidence rates, it is difficult to completely eliminate the impact of inconsistent intervals of clinical visits and laboratory tests. Therefore, the peaks within 1 year need cautious interpretations. Ideally, the prediction risk curve should be plotted over the time since diagnosis. However, this could be very difficult to achieve, as many patients did not have timely laboratory tests for diagnosis. In fact, many prediction models also used the year since enrolment as a time indicator, including a few local studies.8 35
There are a few limitations to this study. First, although we had a large sample size, the follow-up period was relatively short. CKD stages were defined by one test result due to the relatively short study period. Therefore, the predicted rates beyond 2 years tend to have wide confidence intervals. Second, a large number of patients in RAMP-DM were excluded due to incomplete eGFR, UACR and HbA1c data. It is unclear to us whether some were not scheduled to take tests due to mild conditions or others skipped clinical visits for some reason. Therefore, some of our model estimates might slightly overestimate or underestimate the true risks. Future studies with longer follow-up periods are warranted to obtain more accurate predictions. Last but not least, the study population were Chinese patients under primary care in a highly developed economic region. Future studies are warranted to apply our model to other populations as external validation and calibration, so that the models can eventually become a useful tool in clinical practice.
The MSMS model achieved satisfactory performance in predicting the progression of CKD in patients with type 2 diabetes under primary care. The prediction model developed from this study could be applied to build an online calculator for individual risks of CKD progression. This will greatly facilitate clinical decision making of individualised intervention plan and treatment target to slow the progression of CKD in Chinese adults with type 2 diabetes, based on biomarkers of glycaemic control, cardiovascular and renal function. The online calculator will also improve the risk communications of doctors and nurses with patients.
The authors thank Nick Lau for assistance in data collection and Jay Hebert of The Hong Kong Polytechnic University for proofreading the manuscript.
Contributors LY and TKC designed the study and prepared the manuscript. JLian, CWL and JLiang contributed to data collection and result interpretation. SZ, DH and JQ conducted the data analysis. All authors have reviewed and proved the final version of this manuscript for publication.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Patient consent for publication Not required.
Ethics approval The ethical approval has been obtained from the New Territory West Cluster Clinical and Research Ethics Committee (NTWC/CREC/17085). Consent forms were exempted because all the anonymised data were extracted from the computerised data system of the Hospital Authority, and no personal information was collected for this study.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data may be obtained from a third party and are not publicly available. The data that support the findings of this study are available from the Hospital Authority of Hong Kong but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of the Hospital Authority.
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.