Objective To examine the characteristics of frequent visitors (FVs) to emergency departments (EDs) and to develop a predictive model to identify those at high risk of a future representation to the ED among the younger and general population (aged ≤70 years).
Design and setting A retrospective analysis of ED data on younger and general patients (aged ≤70 years), collected between 1 January 2009 and 30 June 2016 from a public hospital in Australia.
Participants A total of 343 014 ED presentations were identified from 170 134 individual patients.
Main outcome measures Proportion of FVs (those attending four or more times annually), demographic characteristics (age, sex, indigenous and marital status), mode of separation (eg, admitted to ward), triage categories, time of arrival to ED, referral on departure and clinical conditions. Statistical estimates using a mixed-effects model to develop a risk predictive scoring system.
Results The FVs were characterised by young adulthood (32.53%) to late-middle (26.07%) aged patients, with a higher proportion of indigenous patients (5.7%) and mental health-related presentations (10.92%). They were also more likely to arrive by ambulance (36.95%) and to leave at their own risk without completing their treatment (9.8%). They were also highly associated with socially disadvantaged groups, such as people who have been divorced, widowed or separated (12.81%). These findings were then used to develop a predictive model to identify potential FVs. The performance of our derived risk predictive model was favourable, with an area under the receiver operating characteristic curve (ie, C-statistic) of 65.7%.
Conclusion The development of a demographic and clinical profile of FVs, coupled with the use of a predictive model, can highlight the gaps in interventions and identify new opportunities for better health outcomes and planning.
- population analysis
- risk predictive modelling
- emergency department
- integrated care
- health planning
- health policy
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
Limited research has been carried out on understanding frequent visitors (FVs) to emergency departments (EDs) or on predicting those at high risk of a future representation to EDs among the younger and general population.
This study examined the demographic patterns and clinical conditions of FVs to EDs and derived a risk predictive scoring system that targets younger and general patients (aged ≤70 years) and is not restricted to certain chronic diseases or older age groups.
This study was strengthened by using all available data collected during ED presentations.
This is a retrospective population-based analysis and the risk predictive scoring system was derived using data from a single hospital.
Emergency departments (EDs) are designed to manage acute episodic medical diseases or injury. However, an important proportion of patients return to EDs frequently and unexpectedly. Representations to EDs have been directly associated with increased utilisation of inpatient care, multiple admissions and intensive care.1 Improving the management of these high-cost patients is therefore important for better health outcomes and healthcare planning. A core strategy to reduce potentially preventable representations is the development of a predictive model that accurately identifies those at risk of health deterioration and hospitalisation, so that tailored integrated interventions can be provided before substantial avoidable representations have been incurred.2
Many predictive models have been developed to target and calibrate resources for interventions for at-risk patients, thereby reducing the overall cost of the interventions.2 These models used routinely collected data to understand the characteristics of those patients, and then to identify people at high risk of a future admission or readmission.3 The development of such predictive models has been of great interest in recent years. However, these models were specifically designed to predict patients at high risk of a future readmission rather than a future representation to EDs, often targeting only older patients with certain chronic diseases.4–8 Limited research has been carried out on understanding frequent visitors (FVs) to EDs or on predicting a future representation to EDs. While prediction of both readmissions and representations can benefit patients and hospitals alike, the specific characteristics that need to be considered in designing the predictive model differ greatly between the two.
To understand the distinct characteristics of FVs to EDs, we first examined the demographic patterns and clinical conditions of FVs using a large-scale hospital ED dataset spanning 7.5 years, and sought to describe the distinct patterns of the FVs. The characteristics we learnt were then used to develop a predictive model, which we named the Risk Predictive Scoring System for ED (RPSS-ED). The RPSS-ED is designed to target younger and general patients and is not restricted to certain chronic diseases or older age groups. It is easy to use and relies only on a small number of variables that are easily collected from the electronic medical record system. We suggest that the development of a demographic and clinical profile of FVs, coupled with the use of a predictive model, will highlight the gaps in interventions and identify new opportunities for better health outcomes and planning.
Patient and public involvement
Patients and the public were not involved in the development of the research questions or in the design of the study. The general results (no personal data) will be disseminated on request.
Study setting and population
This was a retrospective analysis of ED data collected from the Nepean hospital in New South Wales (NSW), Australia. Nepean Blue Mountains Local Health District (NBMLHD) includes both urban and semi-rural areas and covers approximately 9179 km². The estimated resident population of NBMLHD in 2011 was 345 564, which includes an indigenous community (2.6%).9 The numbers of younger people and indigenous people have been steadily increasing in recent years. The number of ED presentations is projected to increase by 33% in 2022, along with increases in mental healthcare, rehabilitation and recovery, cancer care and renal dialysis.9 The increasing populations of both younger and elderly people introduce new and unique challenges in healthcare demand, planning and service delivery.
Data were extracted from the Health Information Exchange system, which included all available ED records from the hospital for the period 1 January 2009 to 30 June 2016.
Recent studies suggest that FVs to EDs are highly associated with both elderly and younger age groups10 11 and discuss the importance of understanding the younger age groups.12 In addition, identification of FVs in the older age group (aged >70 years) was relatively trivial, as many of them were already FVs of the hospital, often suffering from certain chronic diseases. Such patients also tend to have distinct characteristics with a skewed distribution, introducing confounding (eg, survival) bias in the analysis. Identifying FVs in the younger and general population (aged ≤70 years), however, is more challenging, as their characteristics are more complex and heterogeneous. We therefore targeted younger and general patients aged ≤70 years in our analysis. We included all patients’ information (including any chronic conditions) if they visited the ED during this period. A total of 343 014 ED presentations were identified from 170 134 individual patients.
Our data contain demographic information including age, sex, marital status, indigenous status, patient postcode and country of birth. Other clinical variables, including referral source (eg, self-referred, general practice and specialist), mode of arrival (eg, ambulance or others), presenting problem and mode of separation (eg, admitted to ward, not a critical care ward), were also collected. Triage categories on a scale of 1–5 (as defined by the Australasian Triage Scale) were used. ED diagnoses were categorised into subgroups based on the headers of the International Statistical Classification of Diseases and Related Health Problems, 10th revision, Australian modification.
We adopted the definition of FVs as patients attending EDs four or more times annually.13 14
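The FV definition can be illustrated with a minimal sketch. It assumes that 'annually' means a calendar year and takes the threshold as four or more presentations (the '4+ per annum' used in the results); a rolling 12-month window would be an equally plausible reading. The function name, record layout and example visits are hypothetical and are not taken from the study data.

```python
from collections import Counter
from datetime import date

FV_THRESHOLD = 4  # assumption: four or more presentations in a calendar year


def flag_frequent_visitors(presentations, threshold=FV_THRESHOLD):
    """Return the set of (patient_id, year) pairs that meet the FV threshold.

    `presentations` is an iterable of (patient_id, arrival_date) tuples,
    one tuple per ED presentation.
    """
    counts = Counter((pid, arrived.year) for pid, arrived in presentations)
    return {key for key, n in counts.items() if n >= threshold}


# Hypothetical example: patient A presents four times in 2015, patient B once a year.
visits = [
    ("A", date(2015, 1, 3)), ("A", date(2015, 2, 10)),
    ("A", date(2015, 6, 1)), ("A", date(2015, 11, 20)),
    ("B", date(2015, 3, 5)), ("B", date(2016, 3, 5)),
]
fv = flag_frequent_visitors(visits)
```

Keeping the threshold as a parameter makes it straightforward to repeat the comparison for the very frequent group (eg, 10+ per annum).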
We used SAS V.9.4 and Matlab 2017a for data analysis/manipulation and model development, respectively.
Population analysis of FVs to ED
The outcomes of FVs with different number of attendances to ED were compared with those of all visitors and non-FVs to identify the distinct patterns of FVs, consistent with other studies.10 11
Development of a risk predictive scoring system for ED
Descriptive statistics were used to identify a subset of candidate predictors for multivariable statistical analysis. We excluded candidate predictors with <10 expected events to avoid model instability, variables with variance inflation factors for multicollinearity exceeding a threshold of 10, and those consisting of unstructured keywords (eg, presenting problem). The continuous predictors were initially calibrated to find appropriate subgroups (eg, age bins) that maximise the differences or similarities of characteristics using the maximum likelihood monotone coarse classifier algorithm.15 We used a mixed-effects model to account for the complexities (ie, correlations) within individual patients. We selected candidate predictors for multivariable logistic regression by testing the bivariable association of each fixed-effects predictor with the outcome at the 5% significance level. The patient-specific variations were measured using a random-effects intercept. We then took the coefficients of the fixed-effects variables and translated them into a point-scoring system. The points were scaled for simpler interpretation by setting a target of 10 points at target odds of 2, with 1 point to double the odds. We divided the training data into 10 groups, fitting the model on 90% of the data (ie, derivation group) and using the model to predict the remaining 10% (ie, internal validation group). This process was repeated 10 times, using each group in turn for validation (ie, 10-fold cross-validation). Variables included in our RPSS-ED are shown in table 1.
We also evaluated our model using a separate external validation dataset (ie, a dataset that was not included during the model derivation stage). This comprised patients of the Nepean hospital who were receiving tailored integrated intervention. The clinicians manually selected patients (1) with multiple health and social care needs or (2) who had presented to the ED 10 times in the past year. A total of 77 patients were enrolled, with 3142 presentations between 2009 and 2016 and an average of 40 presentations per patient. We computed the risk scores associated with each patient using the derived model.
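The scaling of coefficients into points can be sketched as follows. This is not necessarily the exact procedure used for the RPSS-ED; it assumes the standard scorecard convention in which the score is linear in the log odds, score = offset + factor × ln(odds), with the factor chosen so that a fixed number of points doubles the odds. The function names are hypothetical, and the direction of the odds (which outcome is in the numerator) is also an assumption.

```python
import math


def scaling_constants(target_points=10.0, target_odds=2.0, pdo=1.0):
    """Return (factor, offset) so that score = offset + factor * ln(odds).

    pdo is the number of points that doubles the odds; the score equals
    target_points when the odds equal target_odds.
    """
    factor = pdo / math.log(2)
    offset = target_points - factor * math.log(target_odds)
    return factor, offset


def total_score(coefficients, intercept, factor, offset):
    """Score for one presentation from its fixed-effects coefficients.

    `coefficients` holds the coefficient of the level observed for each
    variable; the linear predictor is treated as the log odds.
    """
    log_odds = intercept + sum(coefficients)
    return offset + factor * log_odds


factor, offset = scaling_constants()
# Hypothetical presentations whose linear predictors equal ln(2) and ln(4).
score_at_target = total_score([math.log(2)], 0.0, factor, offset)
score_at_double = total_score([math.log(2), math.log(2)], 0.0, factor, offset)
```

Under these assumptions the offset is 9 and each additional point corresponds to a doubling of the odds, so odds of 2 map to 10 points and odds of 4 to 11 points.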
Characteristics of FVs compared with visitors to EDs
The characteristics of FVs compared against those of all visitors to EDs from January 2009 to June 2016 are shown in table 2. Given the number of repeat representations in the database, the unit of analysis for clinical variables was the ED presentation rather than the individual case. Demographic information, such as age, sex and marital status, was analysed at the individual level. The frequent representations (4+ per annum) were characterised by young adulthood (20–39 years) to late-middle (40–59 years) aged patients, accounting for 32.53% and 26.07%, respectively. Figure 1 shows the age-specific analysis based on annual frequency of attendance to EDs. The 20–39 years age group had the highest proportion among FVs who attended 9+ times per annum. As expected, the youngest age group (0–19 years) contributed least to FVs, consistently having the smallest proportion compared with other age groups. Marital status was shown to be another important characteristic of frequent representations; they were highly related to people who have been widowed (2.16%), divorced (6.52%) or separated (4.13%). Frequent representations were also strongly associated with higher-acuity presentations (52.34% in triage category 1, 2 or 3) and with a higher proportion of mental and behavioural disorders (10.92%) or endocrine and metabolic diseases (1.81%). Use of ambulance or police was also highly linked with frequent representations, accounting for 36.95%. Many FVs either did not wait to complete their treatment or left the ED at their own risk. FVs were highly associated with indigenous people, with a proportion of 5.7% compared with 3.08% for non-FVs. Consistent demographic patterns and clinical conditions were also observed among the very frequent FVs (10+ per annum).
Table 3 shows the patterns of FVs based on the time of arrival to ED. The data were further split into six-hourly groups of the day, day of the week and seasonal groups. Compared with non-FVs, the very frequent FVs were more likely to come to ED on Wednesday (14.24%), Thursday (14.4%) or Friday (14.28%) or between 18:00 and 12:00 hours (36.47%). We also noted that autumn was the busiest season, with a proportion of over 27% across all four cohorts.
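The grouping of arrival times can be sketched as below. The bin boundaries (00:00, 06:00, 12:00 and 18:00) and the season definition (meteorological seasons for the southern hemisphere, eg, March to May as autumn) are assumptions, since neither is stated explicitly; the function name is hypothetical.

```python
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Assumption: meteorological seasons in the southern hemisphere.
SEASONS = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}


def arrival_groups(arrived):
    """Return (six-hourly band, day of week, season) for one arrival time."""
    start = (arrived.hour // 6) * 6  # 0, 6, 12 or 18
    band = f"{start:02d}:00-{start + 6:02d}:00"
    return band, DAYS[arrived.weekday()], SEASONS[arrived.month]


# Hypothetical arrival on the evening of 15 April 2016.
groups = arrival_groups(datetime(2016, 4, 15, 19, 30))
```

Tabulating these three groups separately for FVs and non-FVs gives the kind of comparison reported in table 3.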
Risk predictive scoring system for EDs
Table 1 summarises the coefficients learnt for our nine selected variables using the mixed-effects model. The full list of variables with corresponding points is available in the online supplementary appendix. The total score associated with an individual ED presentation can be calculated by summing the corresponding points of all variables. A lower total score indicates a higher risk of representation to EDs. The two most important metrics in evaluating the performance of a predictive model are how often FVs are correctly detected (true positive rate or sensitivity) and how often non-FVs are incorrectly flagged as FVs (false positive rate, ie, 1 − specificity). The performance of the model in internal validation was therefore measured using the receiver operating characteristic (ROC) curve and the Kolmogorov-Smirnov (K-S) plot (figure 2). Using 10-fold cross-validation, our derived model achieved an area under the ROC curve (AUROC) of 65.7%. A larger AUROC indicates higher overall accuracy of the predictive model (an AUROC of 0.5, ie, 50%, indicates no discrimination, so the higher the curve lies above the diagonal, the better the predictive accuracy). Similarly, the K-S plot is commonly used to measure the predictive power of a scoring system. It shows the distribution of FVs and the distribution of non-FVs on the same plot. The key statistic of interest is the maximum difference between these two distributions (sensitivity minus false positive rate). The score at which this maximum is achieved was used as the cut-off threshold score in our model. The maximum difference achieved in our experiment was 22.60%, at a cut-off total score of 16.22 (figure 2). At a risk score threshold of 16.22, our RPSS-ED had a sensitivity of 65.94% for correctly identifying FVs, while 43.35% of non-FVs were incorrectly classified as FVs (table 4).
With a lower risk score threshold, for instance, 15.38, the rate of incorrectly identified FVs dropped to 5%, but the model also had a lower sensitivity for identifying FVs.
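The choice of cut-off from the K-S statistic, and the AUROC, can be illustrated with a minimal sketch. It assumes that a presentation is flagged as an FV when its total score is at or below the threshold (consistent with lower scores indicating higher risk), and it computes the AUROC through the equivalent rank (Mann-Whitney) formulation. The function names and the example scores are hypothetical and are not study data.

```python
def ks_cutoff(scores, is_fv):
    """Return (max difference, threshold, sensitivity, false positive rate).

    A presentation is flagged when score <= threshold. The returned
    threshold maximises sensitivity minus false positive rate.
    """
    fv = [s for s, y in zip(scores, is_fv) if y]
    non_fv = [s for s, y in zip(scores, is_fv) if not y]
    best = (0.0, None, 0.0, 0.0)
    for t in sorted(set(scores)):
        sens = sum(s <= t for s in fv) / len(fv)
        fpr = sum(s <= t for s in non_fv) / len(non_fv)
        if sens - fpr > best[0]:
            best = (sens - fpr, t, sens, fpr)
    return best


def auroc(scores, is_fv):
    """Probability that a random FV scores lower than a random non-FV,
    counting ties as one half."""
    fv = [s for s, y in zip(scores, is_fv) if y]
    non_fv = [s for s, y in zip(scores, is_fv) if not y]
    wins = sum((a < b) + 0.5 * (a == b) for a in fv for b in non_fv)
    return wins / (len(fv) * len(non_fv))


# Hypothetical total scores: the first five presentations are from FVs.
scores = [14, 15, 15.5, 16, 17, 16.5, 17.5, 18, 16.2, 19]
is_fv = [True] * 5 + [False] * 5
ks, threshold, sensitivity, false_positive_rate = ks_cutoff(scores, is_fv)
area = auroc(scores, is_fv)
```

Lowering the threshold below the K-S optimum reduces the false positive rate at the cost of sensitivity, which is the trade-off described above.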
For the separate external validation, we computed the total risk scores of the enrolled patients from the corresponding points of their variables. Using the cut-off threshold derived from the model (16.22), the model achieved an average sensitivity of 86.40% in detecting FVs to EDs. Since all enrolled patients were already FVs to the hospital, the false positive rate was not computed.
This study has identified the demographic patterns and clinical conditions of FVs to EDs and developed a predictive model that is specifically designed to identify patients at high risk of a future representation to ED. To the best of our knowledge, this is the first work to focus on FVs to EDs, in contrast to well-established existing works2 16 17 that mainly address the prediction of patients at high risk of a future readmission.
Our findings indicate that FVs were characterised by young adulthood to late middle-aged patients with a higher proportion of indigenous people. Unlike FVs in the older age group (60–70 years), who often have chronic illnesses such as diabetes and heart disease, FVs in young adulthood (20–39 years) to late middle age (40–59 years) are highly associated with mental health-related diseases along with alcohol-related and drug-related diagnoses. Our findings are consistent with previous findings.10 18 We also found that FVs are more likely to arrive by ambulance and to leave at their own risk without completing their treatment. We also observed that FVs are highly associated with socially disadvantaged groups, such as people who have been divorced, widowed or separated. This suggests that these groups may be the focus of certain interventions to reduce preventable representations. The identified patterns of FVs based on the time of arrival to ED also have important implications for ED management and strategic planning (eg, staff allocation and prediction of the number of beds required) to improve overall health outcomes.19
We have developed a risk predictive scoring system using only a limited set of variables that are easily obtained from the electronic medical record system, which allows integration into current medical systems. The performance of our predictive model was favourable, with an AUROC of 65.7% (95% CI 65.5% to 65.9%) and a sensitivity of 65.94% at a risk score threshold of 16.22. For comparison, a systematic review of predictive risk models for readmission reported AUROCs (‘C-statistics’) ranging from 50% to 72%.20 A higher sensitivity of 86.40% in detecting FVs to EDs was achieved in the separate external validation group (ie, the currently enrolled patients). We attribute this to the size of the external validation group, which is much smaller than the internal validation group. The use of the RPSS-ED can potentially remove manual searches for at-risk patients and therefore allow clinicians to focus more on patient care and service delivery.
The model has a few identified limitations. The performance of the model could have been improved by including more variables, such as social factors (eg, unemployment and previous trauma); however, these variables were not available. The model was developed using data from a single hospital; expanding the data across multiple hospitals would help to better understand the diverse patterns of FVs and to increase the performance of the model. Although our database is comprehensive, some missing data or inconsistent coding at data entry can be problematic in model learning, as it could lead to underprediction or overprediction. More consistent and accurate data are expected to improve the predictive modelling. The model cannot be applied to predict FVs in the older age group (aged >70 years), since it was designed to predict FVs among the younger and general population (aged ≤70 years). Our model is designed to accommodate additional data (ie, other characteristics of FVs), which can be used to identify emerging risks of representation. There are many opportunities to improve the model by linking ED data with general practice data, inpatient data and outpatient data.
Contributors EA, JK, TB and CB were responsible for the conceptualisation of this project. KR, EA and CB were responsible for creation of the datasets in this project. EA and JK were responsible for statistical analysis and development of the predictive modelling. All authors contributed to the preparation of the manuscript.
Funding This research was supported by New South Wales (NSW) Health, Australia as part of the Integrated and Intensive Care Management Across Sector Collaboration (IICMASC) project. The resulting analyses and models are being used by the Nepean Blue Mountains Local Health District (NBMLHD).
Competing interests None declared.
Patient consent Not required.
Ethics approval Human Research Ethics Committees (HRECs) at the Nepean Blue Mountains Local Health District (NBMLHD), NSW, Australia.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.