Abstract
Objective Predictive statistical models used in population stratification programmes are complex and usually difficult to interpret for primary care professionals. We designed FINGER (Forming and Identifying New Groups of Expected Risks), a new model based on clinical criteria, easy to understand and implement by physicians. Our aim was to assess the ability of FINGER to predict costs and correctly identify patients with high resource use in the following year.
Design Cross-sectional study with a 2-year follow-up.
Setting The Basque National Health System.
Participants All the residents in the Basque Country (Spain) ≥14 years of age covered by the public healthcare service (n=1 946 884).
Methods We developed an algorithm classifying diagnoses of long-term health problems into 27 chronic disease groups. The database was randomly divided into two data sets. With the calibration sample, we calculated a score for each chronic disease group and other variables (age, sex, inpatient admissions, emergency department visits and chronic dialysis). Each individual obtained a FINGER score for the year by summing their characteristics’ scores. With the validation sample, we constructed regression models with the FINGER score for the first 12 months as the only explanatory variable.
Results The annual FINGER scores obtained by patients ranged from 0 to 57 points, with a mean of 2.06. The coefficient of determination for healthcare costs was 0.188 and the area under the receiver operating characteristic curve was 0.838 for identifying patients with high costs (>95th percentile); 0.875 for extremely high costs (>99th percentile); 0.802 for unscheduled admissions; 0.861 for prolonged hospitalisation (>15 days); and 0.896 for death.
Conclusion FINGER has a predictive power for high risks fairly close to that of other classification systems. Its simple and transparent architecture allows for immediate calculation by clinicians. Being easy to interpret, it might be considered for implementation in regions involved in population stratification programmes.
- risk assessment
- assessment of healthcare needs
- target population
- patient care management
- multiple chronic conditions
- primary health care
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
We propose a new population stratification system to identify high-risk patients based on clinical criteria.
We analysed data for an entire healthcare system, providing near universal care for the population of a defined geographical area and integrating data from primary healthcare, hospitals and outpatient specialised care.
To make the system usable in real-world practice, it contains only variables routinely recorded in electronic health records for all patients (ie, diagnoses, demographics, previous inpatient admissions and emergency department visits).
For this reason, relevant factors for which medical records and administrative databases do not usually hold consistent information (psychosocial and socioeconomic variables; lifestyle and risk behaviours; self-perceived health) were not taken into account.
Introduction
In recent decades, the type of patients served by healthcare organisations has evolved. Increases in life expectancy, the development of more effective treatments and changes in lifestyles have contributed to changing the profile of health problems.1 Currently, chronic diseases and multimorbidity (ie, the simultaneous presence of several health problems in the same person) represent the most prevalent epidemiological pattern at the population level.2–4
Caring for chronic illnesses and for patients with complex needs is challenging.5 The healthcare provided to such patients is often poorly coordinated, and this has a negative impact on quality of care and increases healthcare costs.5–7 Furthermore, a small number of patients with multimorbidity require so many repeat admissions to hospital and other costly treatments that the associated costs absorb most of the budget of healthcare organisations.8 Because individuals have different levels of morbidity, they require different types of healthcare. Hence, to be successful, a health organisation needs to provide the right care to the right patient. Examples of care models designed on this principle are the Chronic Care Model9 10 and the Kaiser Permanente Pyramid Model.11
One of the main challenges in matching health provision to need is the development of information systems able to identify groups of patients with similar levels of morbidity, risk of impairment and healthcare needs.12 The establishment of homogeneous groups of patients is the starting point of risk adjustment systems.13–15 Such systems were originally developed in the USA. Their initial purpose was to manage the funding and contracting of services, although they have other applications, such as fair comparison of provider performance and population stratification. Risk adjustment models require access to explanatory variables (data on clinical and demographic characteristics, or previous healthcare costs) for the entire population. They use statistical models and provide predictions regarding future healthcare resource use, hospitalisation or other variables of interest.16 17 Nowadays, risk adjustment is used in diverse ways. In the USA, it is a fundamental tool in the financing of federal health insurance programmes such as Medicare. Risk adjustment is also used for the reimbursement of health insurance carriers in countries with health insurance systems such as the Netherlands, Germany, Switzerland or Belgium.18
However, there are other countries where the use of risk adjustment is rare. Spain, with a National Health System characterised by public financing and a high proportion of public provision, is one of them, although different studies confirm the benefits derived from this methodology.19–21 The main barrier explaining the lack of use of risk adjustment in Spain is lack of acceptance by clinicians. They tend not to trust such complex statistical models because they find them difficult to interpret.22 As a result, in many countries23 including Spain,24 high-risk populations are identified using both risk scores from statistical models and clinicians’ judgements. This dual route for recruitment confuses physicians, results in the inclusion in programmes of patients with heterogeneous needs and hampers the evaluation of interventions.
This paper proposes a new population stratification system that balances the tradeoff between predictive power and simplicity. We sacrifice some predictive power so as to gain in simplicity and acceptance. Our proposal therefore is not presented as a reimbursement system for insurance carriers. It is based on clinical criteria, and is easy for healthcare professionals to understand and apply. We called this system FINGER (Forming and Identifying New Groups of Expected Risks, or from the Spanish, Formación e Identificación de Nuevos Grupos de Estratificación de Riesgo) because it points to high-risk patients and can be calculated immediately by health professionals following some simple rules, using information on the presence of chronic conditions. The objective of this study is to assess the validity of statistical models based on FINGER to predict healthcare resource use, and to determine its ability to prospectively identify individuals at high risk of hospitalisation or high health costs.
Methods
Data
This is a cross-sectional study. All individuals covered by the Basque public health system on 1 September 2008 comprise our population. However, we excluded the paediatric population (individuals under 14 years old) because our main focus is the design of a risk stratification model based on the presence of chronic conditions and our goal is to identify people with the greatest healthcare needs. A total of 28 151 people did not complete the second follow-up year due to death (n=18 547), transfer or other causes (n=9604). Those citizens in the study population who died during the second year were included, whereas those who withdrew for other reasons were not. Hence, our total sample consists of 1 946 884 individuals.
The study period corresponds to two consecutive 12-month intervals. First year data (1 September 2007–31 August 2008) establish the explanatory variables. Second year data (1 September 2008–31 August 2009) validate our estimations.
Data were retrieved from the different available sources of information: primary care electronic health records, the minimum basic data set from hospital discharge reports and electronic records from day hospitals and from visits to emergency departments and specialised care. In this way, we obtained demographic information (age and sex) and clinical information (International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for the diagnoses made), as well as the history of all contacts that our population had with the different levels of provision of the health system, and their healthcare costs. The database used has been described in more detail in previous publications.25
Patient and public involvement
Patients and public were not involved in this study.
Patient classification
FINGER is a patient classification system that provides an individual risk for each person. We first collapsed all ICD-9-CM codes of chronic pathologies into 27 chronic disease groups (CDGs) (table 1). Then we assigned one relative weight to every CDG, based on our linear regression estimations for healthcare cost in the following year, setting a maximum score of 10 for a CDG. Each patient obtained a chronic morbidity score by adding the scores of all their diagnosed CDGs. For each patient, any given CDG did not count more than once; that is, multiple diagnoses corresponding to the same CDG did not change an individual’s score. Likewise, to obtain the final score for each patient, we added weights for age groups, sex and previous hospital utilisation, also based on our linear regression estimations (table 2). A more complete description of FINGER and its design is included in the online supplementary appendix.
Table 1 Distribution of the population of the Basque Country across chronic disease groups (CDGs)*
Table 2 Score for chronic disease groups (CDGs), age groups, sex and previous hospital utilisation
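The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every weight below is a made-up placeholder (the actual values are those of table 2), the group names and age bands are hypothetical, and previous utilisation is treated as simple yes/no flags, which the text does not specify.

```python
# Placeholder weights for illustration only; the real values are in table 2.
CDG_WEIGHTS = {"diabetes": 2, "heart_failure": 4, "copd": 3}
AGE_WEIGHTS = [(14, 0), (45, 1), (65, 2), (80, 3)]  # (lower bound of age band, points)
SEX_WEIGHTS = {"F": 0, "M": 1}
UTILISATION_WEIGHTS = {
    "inpatient_admission": 3,
    "emergency_visit": 1,
    "chronic_dialysis": 10,
}
MAX_CDG_WEIGHT = 10  # the text sets a maximum score of 10 for a CDG


def age_points(age):
    """Return the points of the highest age band whose lower bound is reached."""
    points = 0
    for lower_bound, weight in AGE_WEIGHTS:
        if age >= lower_bound:
            points = weight
    return points


def finger_score(age, sex, cdgs, utilisation):
    """Sum the points for age, sex, chronic disease groups and previous use."""
    # A set ensures that repeated diagnoses in the same CDG count only once.
    morbidity = sum(min(CDG_WEIGHTS[c], MAX_CDG_WEIGHT) for c in set(cdgs))
    # Assumption: each type of previous use is a yes/no flag.
    prior_use = sum(UTILISATION_WEIGHTS[u] for u in set(utilisation))
    return age_points(age) + SEX_WEIGHTS[sex] + morbidity + prior_use
```

With these placeholder weights, a 70-year-old man with two diabetes diagnoses, one COPD diagnosis and a previous emergency visit would score 2 + 1 + (2 + 3) + 1 = 9; the second diabetes diagnosis does not change the result.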
Study variables and statistical models
To avoid overfitting problems, we randomly divided the database into two subsets: the first for designing and calibrating the FINGER system, and the second exclusively for validation. A comparison of the characteristics of both subpopulations of patients is included in the online supplementary appendix (tables S1 and S2). With the validation sample, we estimated different regression models (linear and logistic) to obtain the risk scores for each individual. The dependent variable for the linear regression was the healthcare costs of individuals in year 2, while for the logistic regressions we used the following dependent variables, also for year 2:
High use of resources (belonging to the top 5% individuals with highest healthcare costs).
Extremely high use of resources (belonging to the top 1% individuals with highest healthcare costs).
Emergency hospitalisations, excluding admissions for obstetric or traumatic conditions (since we aim to identify individuals who might benefit from case management programmes).
Prolonged hospital stay (sum of hospital bed days for causes other than obstetric and traumatic conditions >11 days).
Very prolonged hospital stay (sum of hospital bed days for causes other than obstetric and traumatic conditions >15 days).
Death.
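As a rough sketch of how the random split and the binary outcomes listed above could be derived from year 2 data, the Python below may help. The function names, the equal-sized split, the fixed seed and the nearest-rank percentile rule are all assumptions made for illustration; the text does not state the split proportions or the percentile method, and the admissions and bed days passed in are assumed to already exclude obstetric and traumatic causes.

```python
import random


def split_calibration_validation(ids, seed=0):
    """Randomly split identifiers into two halves (assumed equal in size)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile; pct is an integer such as 95 or 99."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct% of n
    return ordered[rank - 1]


def year2_outcomes(cost, cost_p95, cost_p99, emergency_admissions, bed_days, died):
    """Build the six binary dependent variables for one individual."""
    return {
        "high_cost": cost > cost_p95,             # top 5% of costs
        "extreme_cost": cost > cost_p99,          # top 1% of costs
        "emergency_hospitalisation": emergency_admissions > 0,
        "prolonged_stay": bed_days > 11,
        "very_prolonged_stay": bed_days > 15,
        "death": died,
    }
```

The percentile thresholds would be computed once on the year 2 costs of the sample and then passed to `year2_outcomes` for each individual.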
The analyses were repeated four times. In each case, the only independent variable was a single score obtained by summing the scores for one of the following sets of variables:
Age and sex.
Diagnoses (CDG categories).
Age, sex and diagnoses combined.
Age, sex, diagnoses and resource use combined.
To assess and compare the models, we calculated the coefficient of determination (R2) for linear regressions and the area under the receiver operating characteristic curve (AUC) for logistic regressions.
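Because each model has a single explanatory variable, both measures can be computed directly from the scores, as the following Python sketch suggests. It relies on two standard facts rather than on the authors' software: for a linear regression with one predictor and an intercept, R2 equals the squared Pearson correlation, and the AUC equals the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. Since a logistic model with one predictor is monotonic in that predictor, ranking by the raw score gives the same AUC as ranking by the fitted probability, provided the fitted coefficient is positive. The function names are hypothetical.

```python
from collections import Counter


def r_squared(x, y):
    """R2 of a one-predictor linear regression (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)  # assumes x and y are not constant


def auc(scores, labels):
    """AUC as the share of case/non-case pairs ranked correctly (ties = 0.5)."""
    pos = Counter(s for s, is_case in zip(scores, labels) if is_case)
    neg = Counter(s for s, is_case in zip(scores, labels) if not is_case)
    n_pos = sum(pos.values())
    n_neg = sum(neg.values())
    concordant = 0.0
    neg_below = 0  # non-cases with a strictly lower score than the current one
    for s in sorted(set(pos) | set(neg)):
        concordant += pos[s] * (neg_below + 0.5 * neg[s])
        neg_below += neg[s]
    return concordant / (n_pos * n_neg)  # assumes both classes are present
```

Counting individuals per score value keeps the AUC calculation linear in the number of patients, which suits an integer score with few distinct values.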
Results
Descriptive statistics
The FINGER scores obtained by patients ranged between 0 and 57, with a mean of 2.06. As expected, the distribution was markedly skewed to the right: 33% of patients scored zero and 91% of patients scored no more than five points, while only 5% of patients obtained scores of 8 or more and just 1% obtained scores of 14 or more (figure 1).
Figure 1 Cumulative percentage of population according to their FINGER (Forming and Identifying New Groups of Expected Risks) score.
The average healthcare expenditure on patients in the second year was €1126, ranging from €0 to €155 140. A total of 205 408 individuals (21.10%) incurred no health costs in this period, that is, they were non-users.
Regarding hospitalisations for causes other than obstetric and traumatic conditions in the 12 months of the study, 3.48% of the population had at least one admission, while 1.06% were admitted for at least 12 days and 0.73% for more than 15 days. Overall, 0.96% of patients died.
The numbers and percentages of patients who presented such events according to their FINGER scores are summarised in table 3.
Table 3 Distribution of patients of the validation sample according to their FINGER scores
Validation of the stratification system
The results of the linear regression analysis to predict resource use in the year after patient classification are shown in table 4. The model using only the demographic variables explained 7% of the variability in healthcare costs, while the model using only the chronic morbidity score based on the CDGs yielded an R2 of 0.143. The model combining the scores for the demographic variables and morbidity had an R2 of 0.155. Finally, the complete model, with the sum of the scores for the demographic variables, morbidity and previous resource use, yielded an R2 of 0.188.
Table 4 Capacity of FINGER (Forming and Identifying New Groups of Expected Risks) to predict healthcare use in the year after patient classification
Table 5 presents the results of the logistic estimations predicting resource use, hospitalisation or death. Age and sex models presented AUC values between 0.74 and 0.79, while the most complete model, combining demographic, morbidity and previous use information, obtained AUC values always greater than 0.80, with particularly good results for identifying extreme cases: 0.88 for identifying the top 1% of individuals with the highest healthcare costs and 0.86 for individuals with hospital stays longer than 15 days. Regarding the prediction of death, using exclusively demographic variables produced notably good results (AUC=0.87), even better than using only morbidity (AUC=0.78). However, combining demographic variables and morbidity, or all these with previous resource use, achieved AUC values close to 0.9.
Table 5 Results of the FINGER-based predictive models: area under the receiver operating characteristic curve (AUC) (95% CI)
Discussion
Main results and comparison with other predictive systems
This study describes the development and validation of a new population stratification system. Our model, FINGER, identifies individuals who will require a large amount of healthcare, or experience unexpected events such as emergency visits, hospitalisation or death. FINGER is easy to use and to understand and does not require complex statistical calculations, being exclusively based on data from health records. While age and sex predict 7% of the variability in future use of resources in the linear model, our morbidity score predicts 14% and the complete FINGER model predicts 19%. With respect to logistic models, we assess their ability to identify high-risk individuals through the AUC. An AUC of 0.5 indicates no predictive power at all (no better than chance). In contrast, a value of 1 corresponds to perfect sensitivity and specificity. A model’s discriminative ability is usually considered to be acceptable if the AUC lies between 0.7 and 0.8, and good if it is above 0.8.26 Hence, FINGER has good power to prospectively identify individuals who will require high or extreme resource use (0.838 and 0.875), emergency hospital admission (0.802) or prolonged hospitalisation (0.861), or who will die (0.896). Comparing the results between the FINGER models, the addition of previous resource use to a model based on age, sex and diagnoses produced only small differences. However, it is known that the AUC is harder to increase when the baseline model performs well.27 In our case, we considered that this improvement, although modest, is worthwhile because collecting this predictive variable does not involve difficulties.
Previous research25 used the same database to predict healthcare costs with highly sophisticated and recognised case-mix systems: Adjusted Clinical Groups (ACGs),28 Clinical Risk Groups (CRGs)29 and Diagnostic Cost Groups (DCGs).30 The authors obtained coefficients of determination of 0.23, 0.22 and 0.25, respectively, with the best statistical models (including as explanatory variables prescriptions, previous healthcare cost percentile, age, sex and diagnoses). These results are also similar to those obtained by other authors in other healthcare systems.31
Regarding the ability of models to identify high-risk individuals, the differences between FINGER and the above-mentioned case-mix systems are even smaller, although comparisons are partial. Due to restrictions on the use of databases and licensed software, we only have access to published results.25 32 According to these, their AUC values ranged from 0.848 to 0.868 for high costs and from 0.869 to 0.899 for very high costs, and were 0.809 for hospitalisation and 0.870 for prolonged hospital stays.
This study employed data for an entire healthcare system, providing near-universal care for the population of a defined geographical area and integrating data from primary healthcare, hospitals and outpatient specialised care. However, our analyses are based on information registered some years ago. Changes in clinical practice or health services management that have occurred in recent years could therefore affect the generalisability of the results to the present.
FINGER presents some limitations, some of which are common to other risk adjustment systems. First, some factors that are known to have an impact on the need for healthcare or on outcomes have not been included in the model; these include psychosocial and socioeconomic variables, as well as lifestyle and risk behaviours and self-perceived health.33 34 For most of these indicators, however, consistent information is usually lacking in administrative databases at present.35 With the aim of developing a tool for real-world implementation, FINGER only contains variables routinely recorded in electronic health records for all patients. Second, to estimate the health status of individuals, FINGER only takes into account diseases and other health problems for which patients have demanded care from the public health system. Hence, unperceived needs may not be taken into account. Further, although our health system provides almost universal coverage, some social groups may encounter barriers to access. Third, it is known that the information recorded in electronic health records may be inaccurate.36 Additionally, FINGER classifies individuals’ health problems into only 27 disease groups, which sometimes makes it difficult to identify patients with specific conditions. Finally, highly predictive variables such as previous healthcare cost37 38 were excluded because they are influenced by factors other than patient needs, such as the efficiency of healthcare provision.
Nevertheless, we attempted to provide the simplest algorithm so that family doctors may use the model as an assessment scale. Hence, we reduced FINGER to a reasonably small number of health problems, at the expense of granularity. Notably, ACGs, CRGs and DCGs provide a great amount of information and describe the morbidity of a population at a very disaggregated level, predicting somewhat better than FINGER in linear models and similarly well in logistic models. Since they detect health problems of individuals from their diagnoses and prescriptions, they overcome some of the limitations of using administrative databases.36 However, their classification algorithms are complicated and require the use of proprietary software. Often, the predictions are obtained from local calibration based on statistical regression models; this modelling requires support from experts and is beyond the abilities of clinicians. In most European health services, this process is only performed every several months,39 so there could be discrepancies between the present situation of the patient and his/her latest estimation of risk. In contrast, the open architecture of our FINGER system is very simple. It is based on fewer variables, and obtains the patient’s individual risk by a simple sum of scores. Hence, even though it was designed to classify the entire population of a given geographical area from administrative databases, its estimation may be performed or updated directly by family doctors during patient visits with data from health records, without the need for any software.
Potential applications in healthcare systems
FINGER, like other case-mix systems, identifies patients who may be candidates for specific interventions. It is particularly good at discriminating individuals at high risk of future hospitalisation, prolonged hospital stays and extreme healthcare resource use. Hence, it allows clinicians to design specific programmes for certain diseases matching patients’ needs. Additionally, this system could also be used for other purposes, such as describing the burden of morbidity and of certain health problems in populations in specific geographical areas.
The choice of a stratification model depends on both its predictive power and the level of granularity at which it describes population health needs. Nonetheless, other characteristics should also be considered. Currently used case-mix systems have demonstrated their statistical validity in many countries and also in our setting. However, they are difficult to introduce in the context of a National Health System such as the Spanish one, and in particular in the regional Basque Health System. They induce reluctance among primary care clinicians who do not see the clinical benefits of their use. Physicians demand a simpler model with a transparent architecture that is easy to calculate and interpret, to overcome the barriers to its implementation.22 We believe FINGER fills this gap. We accept that it is not valid for calculating reimbursement because it slightly sacrifices predictive power compared with other systems, but it still represents an attractive option for applying population stratification programmes in the context of a National Health System.
Unanswered questions and future research
Nowadays, numerous efforts to transform models of health delivery are being implemented all over the world. A fundamental component of care management programmes is the targeting and selection of populations for such interventions. The new patient classification system developed, FINGER, has been shown to be able to predict healthcare costs and identify individuals who in the following 12 months will require a large amount of healthcare resources, need unscheduled admission to hospital or remain admitted for long periods of time, or be at risk of death. It has a straightforward design and is mainly based on the diagnoses of health problems for which patients have sought medical attention. We consider that it is intuitive, easy to understand and suitable for primary healthcare professionals. In relation to this, there is a need for future studies analysing clinicians’ perceptions and opinions. Furthermore, our results should be tested in other settings or specific population groups (eg, patients with multimorbidity or with specific diseases).
Acknowledgments
The authors thank The Basque Foundation for Health Innovation and Research (BIOEF) for support in the translation of the first draft of the manuscript.
Footnotes
Contributors JFO, JJA and MG-G contributed to the design of the study. JFO performed the validation of the databases. JFO and AG-A developed the classification system. AG-A was responsible for statistical analyses. MG-G and JFO wrote the draft of the manuscript. All the authors participated in the interpretation of the data; they also critically reviewed and gave final approval to the manuscript.
Funding Manuel García-Goñi thanks the Ramon Areces Foundation for financial support under the research project ‘Envejecimiento y sistema sanitario y social. El gasto público y sus efectos en igualdad, dependencia y aseguramiento en España’. All authors thank this project for funding publishing charges.
Competing interests None declared.
Patient consent Not required.
Ethics approval The Clinical Research Ethics Committee of the Basque Country approved this study (PI2015128).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.