
Original research
Risk assessment for acute kidney injury and death among new COVID-19 positive adult patients without chronic kidney disease: retrospective cohort study among three US hospitals
  1. Daniel Li (1,2),
  2. Hui Ren (3),
  3. Dirk J Varelmann (4),
  4. Pankaj Sarin (4),
  5. Pengcheng Xu (3),
  6. Dufan Wu (3),
  7. Quanzheng Li (3),
  8. Xihong Lin (5,6)
  1. Harvard University T H Chan School of Public Health, Boston, Massachusetts, USA
  2. Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  3. Department of Radiology, Harvard Medical School, Boston, Massachusetts, USA
  4. Department of Anesthesiology, Brigham and Women's Hospital, Boston, Massachusetts, USA
  5. Department of Biostatistics, Harvard University T H Chan School of Public Health, Boston, Massachusetts, USA
  6. Department of Statistics, Harvard University, Cambridge, Massachusetts, USA
  Correspondence to Dr Xihong Lin; xlin{at}


Objective To develop simple but clinically informative risk stratification tools using a few top demographic factors and biomarkers at COVID-19 diagnosis to predict acute kidney injury (AKI) and death.

Design Retrospective cohort analysis, follow-up from 1 February through 28 May 2020.

Setting 3 teaching hospitals, 2 urban and 1 community-based in the Boston area.

Participants Eligible patients were at least 18 years old, tested COVID-19 positive from 1 February through 28 May 2020, and had at least two serum creatinine measurements within 30 days of a new COVID-19 diagnosis. Exclusion criteria were having chronic kidney disease or having a previous AKI within 3 months of a new COVID-19 diagnosis.

Main outcomes and measures Time from new COVID-19 diagnosis until AKI event, time until death event.

Results Among 3716 patients, there were 1855 (49.9%) males and the average age was 58.6 years (SD 19.2 years). Age, sex, white blood cell, haemoglobin, platelet, C reactive protein (CRP) and D-dimer levels were most strongly associated with AKI and/or death. We created risk scores using these variables predicting AKI within 3 days and death within 30 days of a new COVID-19 diagnosis. Area under the curve (AUC) for predicting AKI within 3 days was 0.785 (95% CI 0.758 to 0.813) and AUC for death within 30 days was 0.861 (95% CI 0.843 to 0.878). Haemoglobin was the most predictive component for AKI, and age the most predictive for death. Predictive accuracies using all study variables were similar to using the simplified scores.

Conclusion Simple risk scores using age, sex, a complete blood cell count, CRP and D-dimer were highly predictive of AKI and death and can help simplify and better inform clinical decision making.

  • COVID-19
  • epidemiology
  • statistics & research methods

Data availability statement

Data are available upon reasonable request. Patient data are not available, but requests for surrogate data may be made to the corresponding authors. However, code for all analyses will be available at

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • Various associations between patient variables and COVID-19 acute kidney injury (AKI) and death have been reported, but it is unclear which variables are most predictive and important to focus on.

  • We developed risk scores for predicting AKI and death among new COVID-19 positive patients.

  • Readily obtainable demographic, vital sign and laboratory values were evaluated.

  • Findings are limited to patients without chronic kidney disease.


Although respiratory failure and diffuse inflammatory lung tissue damage are key features of COVID-19, involvement of other organs such as the kidneys has been well documented. Pathological autopsy examinations of COVID-19 kidneys have shown clusters of coronavirus-like particles in the tubular epithelium and podocytes, upregulation of the SARS-CoV-2 receptor ACE 2 and positive immunostaining with SARS-CoV-2 nucleoprotein antibodies.1 2 Haemodynamic instability, systemic hypoxia, abnormal coagulation and inflammation from severe COVID-19 can also directly contribute to acute kidney injury (AKI) and induce acute tubular necrosis.3

Various epidemiological studies from China, Europe and the USA have investigated AKI outcomes among patients with COVID-19. Early studies in China have reported AKI incidences ranging from 0.5% to 15% among hospitalised and outpatient COVID-19 patients.4 5 One UK study found hospitalised COVID-19 patients with AKI had a threefold higher odds of death than those without AKI.6 Large US population studies of hospitalised patients with COVID-19, primarily in the New York City metropolitan area, have reported AKI incidences ranging from 27% to 57%, with in-hospital mortality rates ranging from 35% to 71% among AKI COVID-19 patients.7–10 Some of these studies have also explored variable associations with COVID-19 AKI, but none has investigated which subset of these variables is most predictive of AKI or built risk prediction models using demographic variables and biomarkers.

Risk prediction tools have been investigated for COVID-19 deaths. A small number of a priori determined biomarkers were investigated for their associations with the risk of COVID-19 death.11 However, a more data-driven approach would compare the predictive accuracies of these biomarkers to other biomarkers and variables such as demographic factors and vital signs and build a more powerful risk prediction model using a comprehensive set of biomarkers, demographic variables and vital signs. Different risk factors should also be weighted differently, and understanding the relative importance of different variables in predicting poor outcomes will allow for more accurate holistic patient evaluations.

In this study, we developed and evaluated new risk assessment tools that can be easily implemented at the bedside or during chart reviews to predict AKI and death after a positive COVID-19 test. Our contributions include (1) identifying the top biomarkers and demographic variables that predict AKI events among patients with COVID-19, (2) investigating a greater number of potential biomarkers and risk factors in predicting death, (3) developing clinical risk assessment tools for both AKI and death using a small number of predictors and (4) validating that these tools are nearly as predictive as using all available study variables. By understanding which subset of risk factors is most important to focus on, medical providers can more efficiently work up and risk stratify their newly diagnosed patients with COVID-19.


Study population

The Mass General Brigham (MGB) health system serves a large diverse patient population around Boston and Eastern Massachusetts. Electronic health records (EHR) from three major hospitals in this system (Massachusetts General Hospital in Boston, Brigham and Women’s Hospital in Boston, and Newton-Wellesley Hospital in Newton) were used.

We included all patients who (1) were at least 18 years old, (2) tested COVID-19 positive at one of the three hospitals above between 1 February 2020 and 28 May 2020 and (3) had at least two serum creatinine (SCr) tests within 30 days of their SARS-CoV-2 PCR test. We excluded patients who (1) met the criteria of AKI within 3 months before their SARS-CoV-2 test or (2) had chronic kidney disease (CKD) identified as a pre-existing condition from International Classification of Disease (ICD-9 and ICD-10) codes (see later).

Patient and public involvement

Patients and the public were not involved in the planning of this project.

Data collection

Information in the EHR of patients who met the inclusion criteria was extracted from the enterprise data warehouse and included demographic, comorbidity, clinical, laboratory and outcome (death) data. Demographic and laboratory data closest to the time of the first SARS-CoV-2 PCR test were kept (except for SCr, for which multiple values were kept). SCr laboratory test results and timestamps within 3 months before and 30 days after the SARS-CoV-2 PCR test were extracted. We categorised ethnic groups other than white, black, Hispanic and Asian into a single subgroup. All comorbidity-related medical history documented in the MGB healthcare system enterprise data warehouse before the first SARS-CoV-2 test was extracted. Pre-existing conditions, including hypertension, diabetes, cardiovascular disease and heart failure, were classified using their ICD-9 or ICD-10 codes.

Definitions of outcomes

Per the Kidney Disease Improving Global Outcomes (KDIGO) criteria, AKI was defined as an increase in SCr of at least 0.3 mg/dL over a 48-hour period, an increase of at least 50% from baseline creatinine within 7 days, or urine volume <0.5 mL/kg/hour for 6 hours.12 Due to difficulties obtaining accurate urine volumes from electronic health record data, we used only SCr to define AKI events. If patients had more than two SCr tests in their EHR, we used all available SCr tests to define the earliest AKI time. Death times were directly extracted from the data warehouse.
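The SCr-based part of this definition can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `earliest_aki_time` is hypothetical, and treating every earlier measurement inside the window as a candidate baseline is an assumption, since the text does not say how baseline creatinine was chosen.

```python
from datetime import datetime, timedelta

def earliest_aki_time(scr_series):
    """Return the timestamp of the first SCr-based AKI event, or None.

    scr_series: list of (datetime, scr_mg_dl) tuples in any order.
    Assumption: any earlier value inside the window may act as baseline.
    """
    ordered = sorted(scr_series)
    for i, (t_now, scr_now) in enumerate(ordered):
        for t_prev, scr_prev in ordered[:i]:
            gap = t_now - t_prev
            # Absolute criterion: rise of at least 0.3 mg/dL within 48 hours
            # (small tolerance guards against floating point rounding).
            if gap <= timedelta(hours=48) and scr_now - scr_prev >= 0.3 - 1e-9:
                return t_now
            # Relative criterion: at least 1.5 times an earlier value within 7 days.
            if gap <= timedelta(days=7) and scr_now >= 1.5 * scr_prev:
                return t_now
    return None

t0 = datetime(2020, 4, 1, 8, 0)
series = [(t0, 0.9), (t0 + timedelta(hours=24), 1.0), (t0 + timedelta(hours=40), 1.3)]
print(earliest_aki_time(series))
```

Scanning every earlier value is quadratic in the number of tests per patient, which is acceptable for the handful of SCr values a typical patient has within 30 days.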

Statistical analyses

Continuous variables were transformed into categorical variables to improve interpretability of results and account for non-linear associations. Counts and percentages were presented, and two proportion z-tests were used to compare the proportion of deaths among patients with AKI and non-AKI. For AKI survival analyses, observations without AKI were censored after 30 days, at the time of death or at 28 May 2020, whichever came first. For death survival analyses, observations without death were censored at 28 May 2020. Multiple multivariable Cox proportional hazards models included age, sex, race, diabetes, cardiovascular disease, hypertension, heart failure, body mass index (BMI), temperature, heart rate, systolic blood pressure, white blood cell (WBC) count, haemoglobin, platelets, C reactive protein (CRP), ferritin and D-dimer. Respiratory rate and interleukin 6 (IL-6) variables were not included in primary analyses given missing data. However, we performed exploratory analyses imputing the missing respiratory rate and IL-6 values (additional details are in the Sensitivity analysis section).
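The censoring rule for the AKI analysis can be made concrete with a short sketch. The helper name `aki_time_to_event` and the choice to measure time in whole days are illustrative assumptions; only the three censoring points (30 days, death, 28 May 2020) come from the text.

```python
from datetime import date, timedelta

STUDY_END = date(2020, 5, 28)  # administrative end of follow-up

def aki_time_to_event(diagnosis, aki=None, death=None, window_days=30):
    """Return (days_from_diagnosis, event_flag) for the AKI survival analysis.

    event_flag is 1 if AKI was observed on or before the censoring date,
    otherwise 0 with time set to the earliest censoring date.
    """
    candidates = [diagnosis + timedelta(days=window_days), STUDY_END]
    if death is not None:
        candidates.append(death)
    censor = min(candidates)
    if aki is not None and aki <= censor:
        return (aki - diagnosis).days, 1
    return (censor - diagnosis).days, 0

print(aki_time_to_event(date(2020, 4, 1), aki=date(2020, 4, 3)))
print(aki_time_to_event(date(2020, 5, 20)))
```

The resulting pairs are the usual inputs for a Cox model or Kaplan-Meier estimate.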

We next built a simplified Cox model for clinical use by using a stepwise variable selection procedure for Cox models alternating between ‘forward’ and ‘backwards’ steps to identify the first five variables to be included.13 Simplified Cox models were fit using only the selected five variables and Harrell’s C-Statistics were obtained (survival outcome). Model coefficient (linear prediction) accuracy was evaluated. We evaluated area under receiver operating characteristic (ROC) curves (AUC) for predicting AKI within 3 days and death within 30 days of a new COVID-19 diagnosis (binary outcome). Net reclassification improvement (NRI) of adding all remaining covariates was also calculated.
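For a binary outcome, the AUC equals the probability that a randomly chosen patient with the event has a higher predicted score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it that way in plain Python; the function name `auc` and the toy data are illustrative only, and the study itself used R.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted risk (higher means higher risk).
    labels: 1 if the event occurred, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))

print(auc([1, 2, 3, 4], [0, 1, 0, 1]))  # 0.75
```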

Risk scores were obtained by rounding simplified model coefficients for easier clinical risk assessment use. For suggested risk score cutoffs, Kaplan-Meier event curves were plotted, log rank tests were performed and sensitivities, specificities, positive and negative likelihood ratios were calculated. Approximate pretest to post-test probability changes from likelihood ratios were calculated using the linear approximation proposed by McGee.14 Cutoffs for low risk were chosen so that the negative likelihood ratio would be ≈0.20 with a pretest to post-test probability decrease of ≈30%, while cutoffs for high risk were chosen so that the positive likelihood ratio would be ≈5.0 with a pretest to post-test probability increase of ≈30% and that at least 15% of patients (560) would be identified as high risk.14 We ran 1000 internal cross validation iterations in which 70% of data were randomly assigned to training, the other 30% to testing. For each iteration, simplified Cox models were fit to the training data, coefficients were rounded to obtain risk scores and AUC’s were calculated using the predicted testing data risk scores.
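The likelihood ratio targets above follow from McGee's linear approximation, commonly stated as a change in probability of about 0.19 × ln(LR) for pretest probabilities between roughly 10% and 90%. The sketch below compares it with the exact odds-based calculation; the function names are illustrative.

```python
import math

def exact_post_test(pretest, lr):
    """Exact post-test probability from pretest probability and likelihood ratio."""
    odds = pretest / (1.0 - pretest) * lr
    return odds / (1.0 + odds)

def approx_change(lr):
    """McGee's linear approximation to the change in probability."""
    return 0.19 * math.log(lr)

for lr in (0.2, 5.0):
    exact = exact_post_test(0.5, lr) - 0.5
    print(f"LR={lr}: approx change {approx_change(lr):+.3f}, exact at 50% pretest {exact:+.3f}")
```

A likelihood ratio of 5.0 gives an approximate increase of about 0.31 and a ratio of 0.20 an approximate decrease of the same size, which matches the ≈30% targets used for the cutoffs.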

We performed four sensitivity analyses. First, the multivariable cause-specific and subdistribution hazard to documented AKI events within 30 days accounting for the competing risk of death was modelled.15 Second, we performed a multiple imputation analysis by creating 10 imputation datasets with imputed values for missing respiratory rate and IL-6 and then calculating pooled multivariable Cox HRs.16 Third, we investigated the AKI risk score accuracy in identifying stage 2 or 3 AKI as defined in the KDIGO criteria, and we investigated the death score accuracy among patients with stage 2 or 3 AKI.12 Fourth, we investigated including mechanical ventilation (non-invasive and invasive) and lymphopenia defined as lymphocytes <800 cells/mm3 as covariates for modelling AKI and death events. All analyses were performed with R V.4.0.4 and all code for analyses is available online (to be posted during revisions).


Demographic and clinical characteristics

There were 3716 eligible adult COVID-19 positive patients without CKD, of which 1855 (49.9%) were male. The average age was 58.6 years (SD 19.2 years). There were 696 patients that developed AKI (18.7%) and 249 (35.8%) were within 3 days of a new COVID-19 diagnosis. There were 347 deaths (9.3%) and 322 (92.8%) were within 30 days of a new COVID-19 diagnosis. Among the AKI group there were 192 deaths (27.6%), and among the non-AKI group there were 155 deaths (5.1%, p<0.001). Patient demographics, pre-existing conditions, vital signs and laboratory values stratified by patients with AKI and patients that died are displayed in table 1. Patients with AKI and patients that died were more likely to be older, male, have multiple comorbidities, and have on admission higher temperatures, lower systolic blood pressures, higher respiratory rates, elevated WBC counts, lower haemoglobin and platelets, and elevated CRP, ferritin, D-dimer and IL-6 levels.
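The comparison of death proportions between the AKI and non-AKI groups can be reproduced from the counts in this paragraph. The sketch assumes a pooled-variance two-proportion z-test without continuity correction; the text does not state which variant was used, so the exact statistic may differ slightly from the authors' output.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value

# 192 deaths among 696 patients with AKI; 155 among the remaining 3020.
z, p = two_proportion_z_test(192, 696, 155, 3716 - 696)
print(f"z = {z:.2f}, p < 0.001: {p < 0.001}")
```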

Table 1

Characteristics of patients with AKI, patients that died and all patients

Fully adjusted multivariable regression

Multivariable Cox regression was performed to identify risk factors associated with time to AKI and death. Table 2 displays pooled multivariable adjusted HRs. Adjusting for all other variables, older age, increased medical conditions, increased temperature, decreased systolic blood pressure, increased WBC, decreased platelets and increased CRP and D-dimer were associated with increased hazards for both AKI and death. Elevated BMI, decreased haemoglobin and increased ferritin were associated with increased hazards for AKI but not death. Black and Asian race were associated with decreased hazards and increased heart rate was associated with increased hazards for death but not AKI.

Table 2

Multivariable Cox regression results

Top risk factor/biomarker selection

The top five variables selected for being most associated with AKI events were haemoglobin, D-dimer, CRP, WBC and male sex. The top five variables most associated with death were age, CRP, platelets, WBC and D-dimer. Online supplemental table S1 shows model coefficients and Harrell’s C-statistic (survival concordance) from the simplified model using just these selected variables. Online supplemental table S2 shows similar results for the fully adjusted model. The simplified AKI Cox model had a survival C-statistic of 0.785 (95% CI 0.769 to 0.800), while the fully adjusted AKI Cox model had a C-statistic of 0.813 (95% CI 0.798 to 0.827). The simplified death Cox model had a survival C-statistic of 0.857 (95% CI 0.841 to 0.874), while the fully adjusted death Cox model had a C-statistic of 0.878 (95% CI 0.863 to 0.892).

Cox model coefficients were used to predict AKI events within 3 days and death within 30 days of a new COVID-19 diagnosis (binary outcomes). Online supplemental tables S1 and S2 also display AUC’s for the simplified and fully adjusted coefficients, respectively. For AKI in 3 days, using the simplified coefficients had an AUC of 0.787 (95% CI 0.759 to 0.814), using the fully adjusted coefficients had an AUC of 0.820 (95% CI 0.794 to 0.845) and the NRI was 0.041 (95% CI 0.003 to 0.082). For death in 30 days, using the simplified coefficients had an AUC of 0.872 (95% CI 0.854 to 0.890), the fully adjusted coefficients had an AUC of 0.893 (95% CI 0.878 to 0.909) and the NRI was 0.010 (95% CI −0.007 to 0.029).

Risk score

Model coefficients were rounded to obtain risk score component values for easier clinical use. Table 3 shows the risk score and internal validation results. For AKI in 3 days, the risk score had an AUC of 0.785 (95% CI 0.758 to 0.813) and a cross validation AUC of 0.776 (95% CI 0.732 to 0.816). For death in 30 days, the risk score had an AUC of 0.861 (95% CI 0.843 to 0.878) and a cross validation AUC of 0.860 (95% CI 0.831 to 0.886). Figure 1A plots ROC curves for using fully adjusted coefficients (from online supplemental table S2) versus using risk scores (from table 3) in predicting AKI in 3 days and death in 30 days.
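The rounding step can be illustrated with a toy example. The coefficients below are hypothetical placeholders, not the values in table 3 or the supplemental tables, and rounding to the nearest integer is an assumption about the granularity used.

```python
import math

# Hypothetical log hazard ratios for binary risk indicators (illustration only).
HYPOTHETICAL_COEFS = {
    "hgb_low": 1.8,
    "d_dimer_high": 1.1,
    "crp_high": 0.9,
    "wbc_high": 0.7,
    "male": 0.6,
}

def to_points(coefs):
    """Round each coefficient to the nearest integer (halves round up)."""
    return {name: math.floor(beta + 0.5) for name, beta in coefs.items()}

def risk_score(points, patient):
    """Sum the points for every indicator that is present in the patient."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))

points = to_points(HYPOTHETICAL_COEFS)
print(points)
print(risk_score(points, {"hgb_low": True, "male": True}))  # 3
```

Rounding trades a small amount of discrimination for a score that can be added up without a calculator, which is consistent with the small AUC differences reported between the simplified coefficients and the risk scores.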

Figure 1

Receiver operating characteristics and Kaplan-Meier event curves using selected variables. (A) Receiver operating characteristic (ROC) curves for acute kidney injury (AKI) within 3 days and death within 30 days using fully adjusted model coefficients and developed risk score. Each line represents a different model’s predictions with the given variables. (B) Kaplan-Meier event curves for AKI events and death events stratified by AKI and death scores. Time begins at positive COVID-19 test. CRP, C reactive protein; HGB, haemoglobin; PLT, platelets; WBC, white blood cells.

Table 3

Risk score and internal validation results

Suggested risk stratification cutoffs were obtained. Online supplemental table S3 presents sensitivity, specificity and positive and negative likelihood ratios for all possible risk score cutoffs. Table 4 shows suggested risk stratification cutoffs and stratified observed and estimated event rates. Higher risk scores had higher observed and estimated AKI and death rates. Figure 1B plots Kaplan-Meier event curves of AKI and death events by simplified risk score categories. Event rates differed by risk category for AKI (p<0.001) and death (p<0.001).
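Sensitivity, specificity and likelihood ratios at a given cutoff follow directly from the two-by-two table of score category against outcome. The sketch below treats a score at or above the cutoff as a positive test, which is an assumed convention; the function name and toy data are illustrative.

```python
def cutoff_metrics(scores, events, cutoff):
    """Sensitivity, specificity and likelihood ratios for 'score >= cutoff'."""
    tp = sum(1 for s, e in zip(scores, events) if s >= cutoff and e == 1)
    fn = sum(1 for s, e in zip(scores, events) if s < cutoff and e == 1)
    fp = sum(1 for s, e in zip(scores, events) if s >= cutoff and e == 0)
    tn = sum(1 for s, e in zip(scores, events) if s < cutoff and e == 0)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1.0 - spec) if spec < 1.0 else float("inf")
    lr_neg = (1.0 - sens) / spec if spec > 0.0 else float("inf")
    return {"sens": sens, "spec": spec, "lr_pos": lr_pos, "lr_neg": lr_neg}

scores = [0, 1, 2, 3, 4, 5, 6, 7]
events = [0, 0, 0, 0, 1, 0, 1, 1]
print(cutoff_metrics(scores, events, cutoff=4))
```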

Table 4

Suggested risk stratification cutoffs and observed and estimated event rates

Sensitivity analysis

We performed a competing risk regression analysis for AKI and death within 30 days. Online supplemental table S4 displays the multivariable cause-specific and subdistribution HRs for AKI events. Cause-specific and subdistribution HR estimates and CIs were nearly identical. We also performed a multiple imputation analysis by imputing missing values for respiratory rate and IL-6 to evaluate their associations. Online supplemental table S5 shows that results were similar to non-imputation results, and increased respiratory rate and IL-6 were associated with increased hazards of AKI and death.

Of the 696 patients with an AKI event, 580 had a stage 1 AKI (83.3%), 29 had stage 2 (4.2%) and 87 had stage 3 (12.5%). Of the 116 patients with stage 2 or 3 AKI, there were 39 deaths (33.6%). In predicting stage 2 or 3 AKI as a single composite outcome among all 3716 patients, the AKI risk score in table 3 had an AUC of 0.850 (95% CI 0.819 to 0.881). In predicting death among the 116 patients with stage 2 or 3 AKI, the death risk score in table 3 had an AUC of 0.758 (95% CI 0.671 to 0.846).

Of the 696 patients with an AKI event, 207 (29.7%) had lymphopenia with lymphocytes <800 cells/mm3 and 328 (47.1%) had received mechanical ventilation. Of the 347 patients with a death event, 150 (43.2%) had lymphopenia and 124 (35.7%) had received mechanical ventilation. Of all 3716 patients, 690 (18.6%) had lymphopenia and 449 (12.1%) had received mechanical ventilation. For AKI in 3 days, the fully adjusted coefficients in the primary analyses had an AUC of 0.820 (95% CI 0.794 to 0.845) while additionally adding lymphopenia and mechanical ventilation only increased the AUC to 0.838 (95% CI 0.814 to 0.861). For death in 30 days, the fully adjusted coefficients had an AUC of 0.893 (95% CI 0.878 to 0.909), while additionally adding lymphopenia and mechanical ventilation only increased the AUC to 0.906 (95% CI 0.893 to 0.920).


In this retrospective study of over 3700 adult patients without chronic kidney disease diagnosed with COVID-19 through May 2020 in the Boston area, we identified risk factors and biomarkers associated with AKI and death, and we developed and internally validated risk scores for predicting AKI and death. We found about one in five patients developed AKI and one in ten patients died. Increased age, male sex, increased WBC, CRP and D-dimer and decreased haemoglobin and platelet levels were associated with AKI within 3 days and/or death within 30 days of a new COVID-19 diagnosis. A risk score using just these variables had similar internal accuracy as using all study variables. These results can assist in risk stratification of patients with COVID-19 without CKD.

Many studies have found markedly increased COVID-19 fatality rates among older people. Studies from China, Spain and Italy, and a meta-analysis of studies from 34 different geographical locations have all found increased case or infection fatality rates among people >60 and >65 years old compared with younger populations.17–20 We similarly observed older age had some of the strongest associations with death. Earlier studies have found various physiological changes among elderly patients that may contribute to this age-related risk, such as decreased small airway clearance, decreased number of cilia and ciliated cells and decreased upper airway size.21–23

Other studies have also reported worse COVID-19 outcomes among men. A study of over 3300 patients in Montefiore Medical Center found male sex was associated with AKI in both COVID-19 positive and negative patients.8 This study also provided a more complete discussion of other animal studies and meta-analyses to date that have found associations between male sex and AKI in general. Studies of COVID-19 outcomes from March 2020 in Italy and the USA also reported increased hospitalisation and intensive care unit admission rates among male patients.24 25 We similarly observed that patients with AKI (60.3%) and patients that died (58.2%) were more likely to be male (overall 49.9%). However, after adjusting for other demographics, medical conditions, vital signs and laboratory values, we found male sex was associated with AKI but not death.

Among laboratory values, CRP, haemoglobin, WBC, D-dimer and platelets were significantly associated with AKI and death and were included in risk scores. Although there has been debate about a standard definition for COVID-19 cytokine storm syndrome, patients with elevated CRP may have excessive immune activation, with CRP being produced by hepatocytes in response to IL-6 or ferritin.26 Decreased haemoglobin may be reflective of kidney disease with decreased erythropoietin production or directly lead to decreased oxygenation of the kidneys. A study from Korea also found a higher risk of AKI in critically ill patients with a haemoglobin <10.5 g/dL.27 Elevated WBC counts may suggest sepsis and be associated with life-threatening organ dysfunction.28 Elevated D-dimer levels may be indicative of a prothrombotic state, and a retrospective study from China found that D-dimer >2000 ng/mL was associated with increased mortality.29 However, D-dimer levels have also been reported to be elevated at baseline in CKD patients,30 so it is possible elevated D-dimer may only be prognostic in non-CKD patients. Low platelets may also indicate a systemic coagulopathic process that places patients at an increased risk for death.28

The biomarker IL-6 was found to be a significant risk factor in regression analyses. However, a substantial proportion of patients in our study were missing IL-6 values (78.3%), so IL-6 was not considered for risk score development. IL-6 measurements were obtained at physician discretion and were likely reserved for severe cases, which may have contributed to the missingness profile of IL-6. Previous studies have found IL-6 cutoffs of 80 and 86 pg/mL to have prognostic value for predicting respiratory failure and death, respectively.31 32

We proposed risk scores for identifying AKI within 3 days and death within 30 days of a new COVID-19 diagnosis along with suggested cutoffs. Although risk scores still need to be externally validated, being able to identify a few key biomarkers that are widely accessible can help focus chart reviews of new COVID-19 positive patients. Varying score weights further highlight biomarkers to focus on, such as haemoglobin and male sex for AKI; age and platelets for death; and WBC, CRP and D-dimer for both. Larger scores directly correlate with worse outcomes and can help shape physician gestalt.

We explored death as a competing risk for AKI events because patients who die will not have any more creatinine measurements. Although an AKI does not exclude the possibility of death, competing risk analyses can still be performed investigating which event type occurs first.15 The cause-specific HRs (Cox HRs) describe the rate of AKI events among those still alive and with no previous AKI events, while the subdistribution HRs describe the overall rate of AKI events occurring before death. In our study both cause-specific and subdistribution HRs were similar. Competing risk analyses were not performed for death events as having an AKI does not exclude death.
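The distinction between the two hazards can be written out using the standard textbook definitions. The notation here is introduced only for illustration and does not appear in the article: T is the time to the first event and D is its type, with D = 1 for AKI and D = 2 for death.

```latex
% Cause-specific hazard: risk set contains patients with no event of any type yet
\lambda^{\mathrm{cs}}_{1}(t) = \lim_{\Delta t \to 0}
  \frac{P\left(t \le T < t + \Delta t,\; D = 1 \mid T \ge t\right)}{\Delta t}

% Subdistribution hazard: patients who already died stay in the risk set
\lambda^{\mathrm{sd}}_{1}(t) = \lim_{\Delta t \to 0}
  \frac{P\left(t \le T < t + \Delta t,\; D = 1 \mid T \ge t \ \text{or}\ (T < t,\; D = 2)\right)}{\Delta t}
```

When few deaths precede AKI, the two risk sets are nearly the same, which is one plausible reason the two sets of HRs were similar in this cohort.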

We looked at a subgroup of patients who developed stage 2 or 3 AKI. Our AKI risk score also performed well in identifying patients who developed stage 2 or 3 AKI, suggesting higher risk scores also correlate with developing a higher stage AKI. Among patients who developed stage 2 or 3 AKI, the death risk score AUC had a wider CI, likely because of the smaller sample size and smaller number of death events.

We explored including lymphopenia and mechanical ventilation as variables in analyses. Lymphopenia has been found to be associated with greater COVID-19 disease severity and poorer outcomes,33 and hypoxaemia requiring mechanical ventilation may affect kidney perfusion and also lead to poorer outcomes. We found that adding lymphopenia and mechanical ventilation to our study variables only led to small improvements in AUC in predicting AKI and death events. Nonetheless, we expect that patients who score high on our risk scores but also have lymphopenia and/or require mechanical ventilation will be at even greater risk of AKI and death events. Future work can further investigate including these variables in risk scores.

Limitations to our study include the following. All results are associational and no causal effects should be interpreted. Vital signs and laboratory values were those closest to COVID-19 diagnosis, and time-varying covariates were not incorporated into analyses. As the study was retrospective, selection bias cannot be excluded, and only events within the MGB system were recorded. Our identified risk factors and risk scores are most applicable during a patient’s initial COVID-19 positive test. Results may not be generalisable to more specific subgroups such as those requiring intensive care admission. Patients in the Boston area may not be reflective of those in other healthcare systems, and the study population included only COVID-19 positive patients without CKD. The study population included patients in the first wave of COVID-19, and results should be cautiously applied to subsequent waves of COVID-19 due to differences in COVID-19 variants and treatment protocols. Future work may further stratify AKI events by stage and time of acquisition (relative to hospital admission), investigate outpatient, hospitalised and critically ill patients separately, focus on CKD patients, validate results on a separate cohort, explore hospital specific effects, and include medication use such as renin-angiotensin-aldosterone system inhibitors.

We investigated AKI and death outcomes among adult patients with COVID-19 without CKD in the Boston area. We identified risk factors and developed and evaluated risk assessment tools for identifying patients with COVID-19 at risk of developing AKI and of death. Haemoglobin, D-dimer, CRP, WBC and male sex were the strongest predictors of AKI. Age, CRP, platelets, WBC and D-dimer were most predictive for death. Our study significantly contributes to epidemiological knowledge of COVID-19 outcomes and introduces simple tools to assist with rapid risk assessment.

Ethics statements

Patient consent for publication

Ethics approval

The Mass General Brigham Institutional Review Board approved this study, and the approval number was 2020P001661. Only deidentified patient electronic health record data were used.



  • Contributors DL and HR drafted the manuscript. HR, PX, DW obtained the data. DL performed the analyses. DL, HR, DJV, PS, QL and XL contributed to the design of the study. All authors were involved with interpretation of the data and critical revision and final approval of the article. XL accepts full responsibility for the work and/or the conduct of the study, had access to the data, and controlled the decision to publish. XL acts as a guarantor.

  • Funding This work was supported by the National Institutes of Health grant number T32-GM135117 (DL).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.