Objective To provide an overview of the currently available risk prediction models (RPMs) for cardiovascular diseases (CVDs), diabetes and hypertension, and to compare their effectiveness in proper recognition of patients at risk of developing these diseases.
Design Umbrella systematic review.
Data sources PubMed, Scopus, Cochrane Library.
Eligibility criteria Systematic reviews or meta-analyses examining and comparing the performance of RPMs for CVDs, hypertension or diabetes in a healthy adult (18–65 years old) population, published in English.
Data extraction and synthesis Data were extracted according to the following parameters: number of studies included, intervention (RPMs applied/assessed), comparison, performance, validation and outcomes. A narrative synthesis was performed. Data were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.
Study selection 3612 studies were identified. After title/abstract screening and removal of duplicate articles, 37 studies met the eligibility criteria. After reading the full text, 13 were deemed relevant for inclusion. Three further papers from the reference lists of these articles were then added.
Study appraisal The methodological quality of the included studies was assessed using the AMSTAR tool.
Risk of bias in individual studies Risk of Bias evaluation was carried out using the ROBIS tool.
Results Sixteen studies met the inclusion criteria: six focused on diabetes, two on hypertension and eight on CVDs. Globally, prediction models for diabetes and hypertension showed no significant difference in effectiveness. Conversely, some promising differences among prediction tools were highlighted for CVDs. The Ankle-Brachial Index, in association with the Framingham tool, and QRISK scores provided some evidence of a certain superiority compared with Framingham alone.
Limitations Due to the significant heterogeneity of the studies, it was not possible to perform a meta-analysis. The electronic search was limited to studies in English and to three major international databases (MEDLINE/PubMed, Scopus and Cochrane Library), with additional works derived from the reference list of other studies; grey literature with unpublished documents was not included in the search. Furthermore, no assessment of potential adverse effects of RPMs was carried out.
Conclusions Consistent evidence is available only for CVD prediction: the Framingham score, alone or in combination with the Ankle-Brachial Index, and the QRISK score can be confirmed as the gold standard. Further efforts should not be concentrated on creating new scores, but rather on performing external validation of the existing ones, in particular on high-risk groups. Benefits could be further improved by supplementing existing models with information on lifestyle, personal habits, family and employment history, social network relationships, income and education.
PROSPERO registration number CRD42018088012.
- cardiovascular diseases
- risk prediction models
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the most comprehensive umbrella systematic review on risk prediction models for cardiovascular diseases, hypertension and diabetes to date.
Available studies, although apparently of medium-to-high quality, were based on primary studies of debatable quality, several of which lack discrimination and calibration assessments.
Grey literature was not searched.
Heterogeneity was too high for meta-analysis; results are, therefore, reported narratively.
Cardiovascular diseases (CVDs), hypertension and diabetes represent a major health concern, throughout the world and across income levels, as a silent epidemic responsible for millions of deaths every year.
CVDs, excluding hypertension, are the leading cause of death worldwide; they have a global prevalence rate of 6.6% and account for 17.6 million deaths per year.1
In 2010, almost one of every three adults (31.1%) had hypertension, although there was a significant gap in prevalence rates between high-income and low-income countries: 28.5% (27.3%–29.7%) and 31.5% (30.2%–32.9%), respectively, worldwide.2
Diabetes (types 1 and 2 combined) has a global prevalence rate of 5.4% and is responsible for 1.4 million deaths every year.3
In addition to being a significant cause of mortality worldwide, CVDs, hypertension and diabetes are also a leading cause of disability. Together they account for almost 40% of disability-adjusted life years (DALYs): CVDs alone are responsible for 32.3% of total DALYs, hypertension for 3.7% and diabetes for 2.4%.4
The epidemiology of these three diseases helps explain the substantial economic impact they have on national health services: in the USA, CVD-related direct costs amount to approximately US$444 billion per year,5 whereas the costs of diabetes are estimated as US$327 billion per year; hypertension has an annual estimated cost of US$51 billion,6 most of which (nearly US$48 billion) represents direct medical expenses.7
In recent decades, hypertension and diabetes have shown an increasing trend in both prevalence and mortality rates. It has been estimated that the prevalence of diabetes will continue to grow: one in five to one in three adults will be affected by 2050. The same is true for mortality rate trends: in the last 15 years, diabetes-related deaths increased by 1%, and these data are expected to climb dramatically over the coming decades.8
CVDs, hypertension and diabetes are strongly related to each other: diabetes is associated with an increased risk of CVDs, which is exacerbated by concomitant hypertension. These conditions also share the same pathogenic pathways, at both macroscopic and molecular level: oxidative stress, inflammation and fibrosis cause microvascular and macrovascular complications in diabetes, and also lead to vascular remodelling and dysfunction in hypertension.13
Because of the high prevalence and mortality rates associated with CVDs, hypertension and diabetes, and their related direct and indirect costs, early identification of individuals at high risk for these diseases is crucial: it can yield significant gains in global health outcomes and savings in economic expenditure.
A number of prediction models focused on these three non-communicable diseases (NCDs) are available, but there is no consensus as to the gold-standard tools best used in practice.
The aim of this study is to provide an overview of the currently available risk prediction models (RPMs) for CVDs, diabetes and hypertension and to compare their effectiveness at properly recognising vulnerable people at risk of developing these NCDs.
This umbrella systematic review was performed following a protocol designed a priori, and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.14
The PubMed, Scopus and Cochrane databases were searched electronically on 24 September 2019, using combinations of the relevant medical subject heading terms, key words and word variants for ‘risk prediction scores’ and ‘CVD’, ‘diabetes’ and ‘hypertension’, as shown in table 1. The search and selection criteria were restricted to the following: systematic reviews, with or without meta-analysis, as the type of study; general population aged 18–65 with no major illness; comparison of at least two RPMs; English language. For the PubMed database, two further filters were added: only articles on humans and with abstracts available were included. No restrictions were applied in terms of publication date in any of the databases.
Reference lists of relevant articles and reviews were hand-searched for additional reports.
Two different authors independently screened the article titles and abstracts in each database: MM and AA for Scopus, AV and DCM for PubMed/Medline, AV and AA for Cochrane Library.
Disagreements were discussed by the authors and resolved by consensus or by recourse to a third author (FL). Studies were then labelled for inclusion or exclusion.
Every article meeting the eligibility criteria (systematic reviews/meta-analyses of RPMs for CVDs, diabetes and hypertension, evaluated by comparison with other RPMs, in adults with no relevant illness) was considered for subsequent qualitative synthesis. Duplicate records were removed, as were articles that met the exclusion criteria: any study carried out with the sole purpose of developing a new RPM, validating one, or proposing a diagnostic/prognostic tool, without any comparison with other prediction models.
The selection process described above is summarised by the flow diagram shown in figure 1.
Four authors (AA, DCM, MM and AV) extracted the data, including the following variables: number of included studies; intervention (RPMs); comparison; performance (area under the receiver-operating characteristic curve (AUC), C-statistic, D-statistic); validation (internal, external, both or not provided); and outcomes (incidence, prevalence, mortality).
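As an illustration only, the extraction variables listed above could be held in a simple record such as the one sketched below. The class names, field names and example values are hypothetical placeholders; they do not describe the form actually used by the authors.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class Validation(Enum):
    """Type of validation reported for the risk prediction models (RPMs)."""
    INTERNAL = "internal"
    EXTERNAL = "external"
    BOTH = "both"
    NOT_PROVIDED = "not provided"


@dataclass
class ReviewRecord:
    """One extracted systematic review (hypothetical structure)."""
    citation: str
    n_included_studies: int
    intervention: List[str]   # RPMs applied or assessed
    comparison: List[str]     # RPMs used as comparators
    validation: Validation
    outcomes: List[str]       # incidence, prevalence and/or mortality
    # Performance measure name -> (lowest, highest) reported value
    performance: Dict[str, Tuple[float, float]] = field(default_factory=dict)


# Placeholder values, not data from any included review.
record = ReviewRecord(
    citation="Example et al (placeholder)",
    n_included_studies=12,
    intervention=["Model A"],
    comparison=["Model B"],
    validation=Validation.EXTERNAL,
    outcomes=["incidence"],
    performance={"C-statistic": (0.70, 0.80)},
)
print(record.validation.value)
```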
Assessment of study quality
Quality assessment of individual studies was performed by applying the AMSTAR tool.15 According to their score, articles were classified into three groups: low (AMSTAR score <4), medium (AMSTAR score ≥4 and ≤7) and high quality (AMSTAR score ≥8).
Four authors (AA, DCM, MM and AV) independently assigned scores. Disagreements were resolved by consensus or by discussion with a fifth author (FL). No reviews were excluded ex-post for quality reasons.
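The three quality bands defined above can be written as a small helper. This is an illustrative sketch: the function name `amstar_quality` and the example scores are hypothetical and are not taken from the review's data.

```python
def amstar_quality(score: int) -> str:
    """Map an AMSTAR score (0-11) to the quality band used in this review."""
    if not 0 <= score <= 11:
        raise ValueError("AMSTAR scores range from 0 to 11")
    if score < 4:
        return "low"       # AMSTAR score <4
    if score <= 7:
        return "medium"    # AMSTAR score 4 to 7
    return "high"          # AMSTAR score 8 or more


# Hypothetical scores, for illustration only.
example_scores = [3, 5, 7, 8, 11]
bands = [amstar_quality(s) for s in example_scores]
print(bands)
```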
A qualitative (narrative) synthesis of the selected reviews was then performed; the guidelines for umbrella review from the Joanna Briggs Institute were applied.16
Risk of bias in individual studies
Risk of bias in individual studies was independently evaluated by four authors (AA, DCM, MM and AV), using the ROBIS tool.17 Any disagreements were resolved by consensus or by discussion with a fifth author (FL).
Patient and public involvement
As the study design was a systematic review, neither patients nor the public were involved.
A total of 3612 studies were identified through a search of the electronic databases.
After title screening, 3463 studies were excluded because they did not meet the eligibility criteria and nine duplicate articles were removed. Of the 149 studies passing the first evaluation stage, 102 studies were excluded because the topic was not pertinent, sample characteristics were inadequate (ie, some articles included non-healthy populations), the article type did not meet the inclusion criteria (ie, some studies were not systematic reviews), or there was no comparison between models.
After reading the full text of the 37 remaining studies, 13 were deemed relevant for inclusion. Twenty-four studies were excluded because of a lack of comparison model, related but non-pertinent topic or wrong article type.
Three additional studies were included after consulting reference lists of relevant articles and reviews. Overall, 16 studies met the eligibility criteria and were included in the qualitative synthesis.
Quality assessment of the studies
According to the AMSTAR tool, all of the studies were of medium-to-high quality, with a mean score of 7.14 (range 5–11). Specifically, 8 of 15 (53.4%) were of medium quality (AMSTAR ≥4 and ≤7), and 7 of 15 (46.6%) were of high quality (AMSTAR ≥8).
The results of quality assessment have been summarised in graph form (figure 2).
Risk of bias within studies
Globally, only 9 articles of 16 (56.25%) had a low risk of bias, according to the ROBIS tool.
Two articles had a high risk of bias due to the eligibility criteria: search limitation on English language. Eight articles had a high risk of bias due to the identification and selection process: the most common source of bias was the search limitation to a single database. Nine articles had a high risk of bias due to data collection and study appraisal, in particular because of the lack of formal appraisal tools. Finally, one article had a high risk of bias due to the synthesis and identification process, mainly due to significant heterogeneity of primary sources. Given the overall heterogeneity of the retrieved studies, it was impossible to conduct a meta-analysis.
Risk of bias across systematic reviews
To assess publication bias, we searched the PROSPERO database and identified all of the ongoing systematic reviews that met our inclusion criteria.
Eight ongoing systematic reviews were found, including the present study, and none have been published to date.
Results of individual studies
Overall, 16 studies met the inclusion criteria: 6 of them concerned diabetes, 8 CVDs and 2 hypertension.
The results for all the included studies are summarised qualitatively in table 3.
Studies on diabetes
Abbasi et al18 (AMSTAR 6/11) focused on 16 prospective cohort studies, in order to validate 25 risk-predictive models for type 2 diabetes mellitus (T2DM), by means of an external validation cohort. The sample included 38 379 people aged 20–70 with no diabetes at baseline. Incidence of type 2 diabetes was evaluated as the outcome. All included studies reported a C-statistic, ranging from 0.74 to 0.84 for risk at 7.5 years, indicating a good discriminatory capability. The risk models had an estimate of calibration, the Hosmer-Lemeshow test, generally indicating good calibration.
Barber et al19 (AMSTAR 6/11) assessed the applicability of 18 risk assessment tools in individuals with pre-diabetes. Their systematic review included 12 studies, with sample sizes ranging from 1351 to 7092. Incidence of pre-diabetes, defined according to American Diabetes Association criteria, was considered as the primary outcome.
Validation (either internal or external) of the risk scores was achieved by evaluating both discrimination and calibration. The internal C-statistic ranged from 0.66 to 0.75. Calibration was described by Hosmer-Lemeshow goodness-of-fit test value, reported by only two studies and with discordant results.
Collins et al20 (AMSTAR 6/11) evaluated RPMs for type 2 diabetes, including 39 studies comparing 47 different risk tools. The studies had a median sample size of 2562 people, with an IQR from 1426 to 4965. No quantitative information was available on discrimination or calibration.
Hu et al21 (AMSTAR 5/11) evaluated the effectiveness of risk-predictive models for type 2 diabetes in the Asian population. Their systematic review included 43 studies examining 12 risk-predictive models, derived from population samples ranging from 2677 to 73 961.
Discrimination was evaluated by the AUC: this showed a high variability of results (AUC 0.66 to 0.91).
Noble et al22 (AMSTAR 5/11) conducted a systematic review assessing 94 risk models for type 2 diabetes.
They evaluated 43 prospective cohort studies (sample size ranging from 399 to 2.54 million people) and incidence of diabetes was the primary outcome. Some of the risk models had been externally validated on a different population. The C-statistic showed high variability, fluctuating between not acceptable (0.60) and good (0.91). The same applied to the calibration indicators.
Yoshizawa et al23 (AMSTAR 8/11) focused on evaluating the predictive ability of a non-blood-based RPM for incidence of T2DM. The 18 eligible studies included an overall number of 184 011 participants aged 42.4–68.4 years. Discrimination, evaluated by the AUC, was adequate to good (0.72–0.81).
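For readers less familiar with the discrimination measures quoted in this section: the C-statistic (equivalent to the AUC for a binary outcome) is the probability that a randomly chosen person who developed the disease was assigned a higher predicted risk than a randomly chosen person who did not, with ties counted as one half. A value of 0.5 corresponds to chance and 1.0 to perfect discrimination. The sketch below computes it by direct pairwise comparison; the function name and the toy data are hypothetical and serve only to illustrate the definition, not the methods of any included review.

```python
from typing import Sequence


def c_statistic(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance (C) statistic for a binary outcome.

    risks    -- predicted risk for each person
    outcomes -- 1 if the person developed the disease, 0 otherwise
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0   # case ranked above non-case
            elif case_risk == non_case_risk:
                concordant += 0.5   # ties count as one half
    return concordant / (len(cases) * len(non_cases))


# Toy data (hypothetical): six people, three of whom developed the disease.
risks = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
outcomes = [0, 0, 1, 0, 1, 1]
print(round(c_statistic(risks, outcomes), 3))
```

In the toy data, eight of the nine case/non-case pairs are ranked correctly, so the C-statistic is 8/9.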
Studies on CVDs
Cortes-Bergoderi et al24 (AMSTAR 6/11) assessed the validity of RPMs in Latin America and in US people of Hispanic descent. Their review included five cohort studies, comparing the Framingham score with three risk models for CVD and one for Chagas disease, and investigating incidence and mortality as outcomes. Risk score discrimination, measured by the C-statistic, was good (0.69–0.80). While the authors openly admit that the Framingham score needs to be recalibrated for Latin American populations, they also recognise that evidence regarding CVD risk models is ‘modest at best’. Indeed, all of the included studies showed a ratio of predicted/observed that was not significant.
Tzoulaki et al25 (AMSTAR 5/11) focused on 79 studies of models seeking to improve on the Framingham Risk Score (FRS), derived from populations of less than 1000 to over 10 000 subjects from the USA and the UK. Incidence and mortality for coronary heart disease (CHD) were measured as outcomes.
The discrimination ability of the examined scores, evaluated by AUC, varied from not acceptable to good: the FRS alone showed an area under the curve between 0.50 and 0.83, whereas FRS with additional predictors ranged from 0.57 to 0.84.
Fowkes et al26 (AMSTAR 6/11) evaluated the Ankle-Brachial Index (ABI) as predictor of cardiovascular events and mortality, compared with the Framingham Risk Score. They included 20 prospective studies involving general populations from the EU and the USA, with sample sizes that ranged from 554 to 14 109.
The combination of ABI and FRS risk prediction scores had a higher discriminating power compared with FRS alone (0.655 vs 0.646 among men, and 0.658 vs 0.605 among women).
Incidence of CVDs was assessed, as primary outcome, using adjusted HR estimates.
The study results showed that ABI measurement can be used in addition to FRS to improve its predictive power, and the authors suggested that a combined tool could be useful.
Siontis et al27 (AMSTAR 9/11) performed a comparison of eight RPMs for CVD. Their review included 20 prospective and retrospective studies, with sample sizes ranging from 403 to 1 072 800. The main outcomes considered were CVD mortality and CVD-related incidence. The probability for prediction of outcome varied significantly among the studies, from poor (0.55) to good (0.85).
Damen et al28 (AMSTAR 7/11) conducted a systematic review examining 212 studies that described the development of 363 different prediction models for CVD and CHD. Sample size was extremely variable, ranging between 51 and 1 189 845 people, mainly from Europe, Canada and the USA.
Measures of predictive performance were reported in only 53% of the studies, with discriminatory ability from 0.61 to 1.00.
In addition, external validation was reported in 136 articles and most often concerned four models: Framingham, SCORE, QRISK and Adult Treatment Panel (ATP III).
The median discriminative ability was always acceptable (0.70–0.79), except for the ATP III score (C-statistic: 0.66). Calibration, estimated as the observed:expected ratio, ranged from 0.59 for Framingham-Wilson to 0.94 for QRISK.
Beswick et al29 (AMSTAR 11/11) included 30 articles that evaluated several risk prediction methods for CHD and CVD: 16 studies used convergent validation of Framingham-Anderson-based methods and 21 comparisons used different risk scoring methods. The enrolled samples involved 4540 to over 205 000 people, aged 5–70 years, from the USA, Australia, Europe and India. Incidence and mortality for CHD and CVD were estimated as primary outcomes.
Only the most recent updates to the Sheffield tables and the Joint British charts showed acceptable sensitivity and specificity compared with the Framingham-Anderson model.
In addition, Beswick et al performed a second systematic review of external validation of Framingham-based risk scoring methods, based on 62 longitudinal or cross-sectional studies conducted on 112 different populations.
The results indicated extreme variability in discriminatory ability, with areas under the curve ranging from not acceptable (0.58) to good (0.85); the results were better in women than in men, and in people with more recent baseline examinations.
Concerning calibration, the predicted:observed ratios ranged from an underprediction of 0.43 to an overprediction of 2.87. Generally speaking, underprediction was greater in people at higher risk, such as subjects with a family history of premature CVD, and lower in people at lower risk.
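The predicted:observed ratio used here can be illustrated in a few lines. The sketch assumes that the ratio is computed as the sum of individual predicted risks (the expected number of events) divided by the number of events actually observed, so that values below 1 indicate underprediction and values above 1 overprediction; the observed:expected ratio reported by other reviews is its reciprocal. The function name and the numbers are hypothetical.

```python
from typing import Sequence


def predicted_observed_ratio(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Ratio of expected events (sum of predicted risks) to observed events."""
    expected_events = sum(risks)
    observed_events = sum(outcomes)
    if observed_events == 0:
        raise ValueError("ratio is undefined when no events were observed")
    return expected_events / observed_events


# Hypothetical cohort: 20 people, each with a predicted risk of 10%,
# of whom 4 went on to have an event.
risks = [0.10] * 20
outcomes = [1, 1, 1, 1] + [0] * 16
ratio = predicted_observed_ratio(risks, outcomes)
print(round(ratio, 2))   # about 2 events predicted against 4 observed
```

In this toy cohort the model predicts roughly half of the events that occurred, which is the kind of underprediction described above for higher-risk groups.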
Echouffo-Tcheugui et al30 (AMSTAR 6/11) focused on 13 studies that evaluated 28 heart failure RPMs. Studies were based on US and European cohorts of 725 to 359 947 subjects, over 18 years of age. Assessed scores had acceptable-to-good discriminatory ability, with C-statistics ranging between 0.71 and 0.87.
Calibration, when reported, was generally acceptable. Only two models were externally validated and showed modest-to-acceptable discrimination, with C-statistics from 0.61 to 0.79.
Damen et al31 (AMSTAR 9/11) included 38 studies and compared the performance of the Framingham ATP III model, the Framingham Wilson model and the pooled cohort equations (PCE) for fatal or non-fatal CHD (Framingham Wilson and ATP III) and hard atherosclerotic CVD (PCE). Results for men and women were compared separately. The authors performed meta-analyses of the included studies: calibration was assessed through the observed versus expected (OE) ratio, and discriminative power through the C-statistic, for 10-year risk predictions. The OE ratio results were very heterogeneous, ranging from 0.58 to 0.79. C-statistic values were highly variable as well (from 0.58 to 0.82). Most of the studies showed overprediction of the expected events, especially in high-risk groups. According to the authors, RPMs for CVDs and CHDs showed a similar performance.
Studies on hypertension
Sun et al32 (AMSTAR 9/11) included 26 cohort studies on hypertension that assessed 48 risk models and combined traditional risk factors (body mass index, age, smoking, blood pressure level and parental history of hypertension) with biochemical parameters and genetic factors. Evaluated articles were based on samples drawn mainly from populations in the USA, Eastern Asia and the UK and included a population aged over 20, with a sample size that ranged from 443 to 17 471. All the included studies reported a C-statistic ranging from 0.74 to 0.79, indicating a good discriminatory capability. Furthermore, calibration estimates of most studies by Hosmer-Lemeshow test showed no significant results.
Echouffo-Tcheugui et al 201333 (AMSTAR 9/11) assessed 11 prospective cohort studies that evaluated 15 different risk models for hypertension in population samples of 1135 to 11 407 subjects from US and Eastern Asian populations. Incidence of hypertension was considered as the primary outcome. The C-statistic ranged from 0.70 to 0.80, indicating good performance and discrimination. Ten models also estimated calibration, using the Hosmer-Lemeshow test, and generally reported good calibration.
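The Hosmer-Lemeshow test referred to in several of the reviews groups people by predicted risk (commonly into deciles) and compares observed with expected event counts in each group. The following is a minimal sketch of the test statistic only, assuming groups of near-equal size formed by ranking on predicted risk. It does not compute the p value, which is usually obtained from a chi-squared distribution with (number of groups − 2) degrees of freedom. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def hosmer_lemeshow_statistic(
    risks: Sequence[float], outcomes: Sequence[int], groups: int = 10
) -> float:
    """Hosmer-Lemeshow goodness-of-fit statistic (larger = worse calibration)."""
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")
    n = len(risks)
    if groups < 2 or n < groups:
        raise ValueError("need at least two groups and one person per group")

    # Rank people by predicted risk, then cut the ranking into groups.
    order = sorted(range(n), key=lambda i: risks[i])
    statistic = 0.0
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        size = len(members)
        observed = sum(outcomes[i] for i in members)   # observed events
        expected = sum(risks[i] for i in members)      # expected events
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance == 0.0:
            raise ValueError("a group has mean predicted risk of exactly 0 or 1")
        statistic += (observed - expected) ** 2 / variance
    return statistic


# Toy data (hypothetical): two risk groups of four people each.
risks = [0.25] * 4 + [0.75] * 4
well_calibrated = [1, 0, 0, 0, 1, 1, 1, 0]   # 1 and 3 events, as expected
print(hosmer_lemeshow_statistic(risks, well_calibrated, groups=2))
```

A statistic near zero, as in the toy example, means that observed and expected counts agree within each group; a non-significant test result is conventionally read as no evidence of poor calibration.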
Discussion and conclusions
Summary of evidence
Developing a good predictive score to enable early identification of diabetes, hypertension and CVDs is a major public health concern across countries of all income levels because of the extremely high rates of incidence and prevalence of these diseases, their upward trend worldwide and their massive consumption of social, health and economic resources.
In spite of the amount of evidence on the issue, the average quality of the existing primary studies is poor: they lack external validation, model calibration and standardised study design, and suffer from optimism bias. In fact, several studies showed that older and more limited RPMs performed better than newer, more complex models.27
The majority of studies, in particular those predicting diabetes, compared scores that were very similar to one another, including tools differing by a very small number of items (sometimes only one) and tools focused on the same population. It is, therefore, not surprising that no RPM for diabetes or hypertension has seemed to excel: no significant difference was found in the majority of studies on these two diseases.
Conversely, some promising differences among prediction tools were highlighted for CVDs and CHDs. The new RPMs investigated generally used Framingham scores as the main comparison tool.
According to Fowkes et al,26 the ABI, in association with the Framingham tool, improved performance results, although only slightly. In addition, QRISK scores provided some evidence of superiority compared with Framingham, in particular in the areas of calibration and discrimination performance.28 However, it should be pointed out that Framingham-based methods underestimated risk in diabetics, socioeconomically deprived populations, and in patients with a strong family history of premature CVD.29 Because of the limitations described in the available studies, and because no predictive model was clearly identified as superior, it seems legitimate to question whether investing in new risk models is still a good practice, or if it would be a better approach to focus our efforts on external validation of existing tools.
Strengths and limitations
To the best of our knowledge, this is the most comprehensive umbrella systematic review on RPMs for NCDs, such as diabetes and CVDs, with high incidence, prevalence and mortality worldwide. In fact, no other umbrella systematic reviews are available on this issue. The only umbrella systematic review found34 was focused solely on hypertension. For this review, topics were selected according to the following criteria:
Epidemiological relevance in terms of incidence and prevalence.
The significant link between diabetes, CVD and hypertension in terms of pathogenic pathway and clinical presentation.
This is also the reason why cancer was not considered among the inclusion criteria.
Many studies have been conducted on NCDs, in particular during the last decade, and the authors have, therefore, chosen to use an umbrella methodology for systematic reviews and meta-analyses. Unfortunately, available studies, although reported to be of medium-to-high quality according to AMSTAR score (mean 8.07 out of 11, ranging from 5 to 11), were based on primary studies of debatable quality, with a large proportion of them lacking discrimination and calibration assessments.
Due to the significant heterogeneity of study designs, the risk models involved and the outcomes reported, it was not possible to conduct a meta-analysis. The results were, therefore, reported narratively.
Our electronic search was limited to studies in English and to three major international databases (MEDLINE/PubMed, Scopus and Cochrane Library), with additional works derived from the reference list of studies, and did not include grey literature with unpublished documents. However, a publication bias was estimated by the quantification of ongoing and non-completed systematic reviews in the PROSPERO database.
No assessment of potential adverse effects of RPMs has been carried out, with a potential risk of bias.
General interpretation of results
The global growth in the prevalence of chronic diseases, as a direct consequence of epidemiological transition, has led to broader use of predictive tools as a major aid for health workers. Indeed, these instruments can be very important and should be regularly implemented in medical settings to support the activity of general practitioners and public health authorities involved in monitoring and evaluation of patients. Specific benefits of RPMs could emerge in prevention and health promotion for specific populations, such as workers and students, and social settings.
It must be pointed out that the studies evaluated in this systematic review, although of medium-to-high quality, are not primary studies, and therefore, could be affected by significant bias. Furthermore, there is no evidence in the scientific literature for the evaluation of the effectiveness of RPMs on long-term patient outcomes.35 Therefore, the results from this study should be carefully applied by health workers, in order to minimise the risk of overtreatment or undertreatment.36
Scientific literature in the past 30 years has produced an abundance of evidence on other powerful health determinants, such as social relationship networks, stress, unemployment, education and income;37–40 however, none of these variables has been included in all the available predictive tools. Moreover, very few instruments considered lifestyle variables like smoking, alcohol, physical activity and drug use or addiction. A strictly biological perspective should be considered a serious limitation in terms of forecasting and predicting the development of CVDs, CHDs, diabetes and hypertension. A new generation of predictive tools, conceptually developed around biological and non-biological determinants, could considerably improve the assessment of risk and the identification of risk strata.
Conclusions and future perspectives
The wide range of available studies that have tested RPMs for CVDs, CHDs, hypertension and diabetes, and that compare almost overlapping tools (which often differ by only a single entry), does not really increase our knowledge of the issue; rather, it merely increases uncertainty.
More precise evidence is available only for CVD prediction: the Framingham score, alone or in combination with the ABI, and QRISK score can be confirmed as the gold standard.
Further efforts should not be concentrated on creating new scores, but rather on performing external validation of the existing ones. Promising future possibilities could then involve testing risk scores on wider samples and on certain target populations, such as workers, with specific exposure risks and for which no robust scientific evidence is currently available. These individuals could definitely benefit from early detection of chronic disease, since the conditions are often worsened by occupational exposure and result in disability and absence from work. Benefits could be further improved by supplementing existing models with information on lifestyle, personal habits and family history,23 social network relationships, income, education and employment history.
Contributors Conceptualisation: FL and LP. Methodology: AV, DCM, FL and MM. Project Administration: FL. Visualisation: FL, DCM, MM, LP, AA, GA, GB, LM and AV. Writing—original draft: AA, AV, DCM, FL and MM. Writing—reviewing and editing: AV, DCM, FL, LP and MM. Validation: FL and LP.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as online supplementary information.