

Predicting admissions and time spent in hospital over a decade in a population-based record linkage study: the EPIC-Norfolk cohort
  1. Robert Luben1,
  2. Shabina Hayat1,
  3. Nicolas Wareham2,
  4. K T Khaw1
  1. 1Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  2. 2Medical Research Council Epidemiology Unit, University of Cambridge, Cambridge, UK
  1. Correspondence to Dr Robert Luben; robert.luben{at}


Objective To quantify hospital use over 10 years of follow-up in a general population-based cohort and to examine related factors.

Design A prospective population-based study of men and women.

Setting Norfolk, UK.

Participants 11 228 men and 13 786 women aged 40–79 years in 1993–1997 followed between 1999 and 2009.

Main outcome measures Number of hospital admissions and total bed days for individuals over a 10-year follow-up period identified using record linkage; five categories for admissions (from zero to highest ≥7) and hospital bed days (from zero to highest ≥20 nights).

Results Over a period of 10 years, 18 179 (72.7%) study participants had at least one admission to hospital, 13.8% with 7 or more admissions and 19.9% with 20 or more nights in hospital. In logistic regression models with outcome ≥7 admissions, low education level OR 1.14 (1.05 to 1.24), age OR per 10-year increase 1.75 (1.67 to 1.82), male sex OR 1.32 (1.22 to 1.42), manual social class OR 1.22 (1.13 to 1.32), current cigarette smoker OR 1.53 (1.37 to 1.71) and body mass index >30 kg/m² OR 1.41 (1.28 to 1.56) all independently predicted the outcome with p<0.0001. Results were similar for those with ≥20 hospital bed days. A risk score constructed using male sex, manual social class, no educational qualifications, current smoker and body mass index >30 kg/m² estimated percentages of the cohort in the categories of admission numbers and hospital bed days in stratified age bands, with twofold to threefold differences in future hospital use between those with high-risk and low-risk scores.

Conclusions The future probability of cumulative hospital admissions and bed days appears independently related to a range of simple demographic and behavioural indicators. The strongest of these is increasing age, with high body mass index and smoking having similar magnitudes for predicting risk of future hospital usage.


This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See:


Strengths and limitations of this study

  • Prospective cohort design, with a population of community-dwelling participants enabling us to examine hospital activity with clearly defined population denominators.

  • Large study size of middle aged and older men and women with a long follow-up time and detailed measurements of demographic and behavioural indicators.

  • It was not possible for us to infer causal links between the lifestyle factors and hospital admissions.

  • We were not able to examine non-National Health Service hospitals and clinics where study participants paid for treatment.


In the UK, the number of men and women over 65 years of age was 10.8 million in 2012 and is projected to increase to 17.8 million by 2037, with the number of over-85s doubling to 3.6 million.1 Two-thirds of people admitted to hospital are over 65 years old, with those over 85 years accounting for 25% of bed days.2 Though increasing age is associated with increased health service usage, other factors may help identify those at greatest risk of admission. Most studies examining hospital activity start from those hospitalised but are limited with respect to population denominators;3–7 even those that use general practice record linkage only include people who attended general practices, while population-based studies that have measured factors prospectively prior to admission are limited.8–13

In this study, we examined whether simple and easily measurable demographic and behavioural factors predict, in a general population cohort resident in Norfolk, the future risk of use of National Health Service (NHS) hospitals over a 10-year period from 1999 to 2009, a period of relative stability for the NHS under Primary Care Trusts.


The European Prospective Investigation into Cancer, Norfolk (EPIC-Norfolk), is a cohort of men and women aged 40–79 years living in Norfolk recruited from participating general practitioner practices between 1993 and 1998.14 ,15

Study design

A total of 25 639 participants completed a lifestyle questionnaire on recruitment and attended a clinic where weight, height and other measurements were made by trained nurses using standard protocols. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in metres (kg/m²). The lifestyle questionnaire included questions relating to current and former employment.
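The BMI calculation described above can be written out as a short sketch. The function names `body_mass_index` and `is_obese` are illustrative and do not come from the study; the >30 kg/m² cut-off is the one the paper uses for its risk factor.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi: float) -> bool:
    """True when BMI exceeds 30 kg/m^2, the cut-off used for the risk factor in the text."""
    return bmi > 30.0


# Illustrative values only (not study data): 81 kg at 1.80 m
example_bmi = body_mass_index(81.0, 1.80)
```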

Occupational social class was defined according to the Registrar General’s classification. Non-manual occupations were represented by codes I (professional), II (managerial and technical), IIIa (non-manual skilled) occupations, while manual occupations were represented by codes IIIb (manual skilled), IV (partly skilled) and V (unskilled) occupations.

Educational attainment was established using the question ‘Do you have any of the following qualifications’ followed by a list of common UK qualifications. Participants were categorised according to the highest qualification attained in three groups: those with no formal qualifications; those with formal qualifications usually associated with a school age between 16 and 18 years; and those with degree level qualifications.

Smoking status was derived from two questions each of which could be answered as yes or no: ‘Have you ever smoked as much as one cigarette a day for as long as a year?’ and for those who answered yes to the first question ‘Do you smoke cigarettes now?’
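As a minimal sketch of how the two yes/no answers could be combined into the never, former and current categories used later in the analysis: the function name and the handling of unanswered questions (returned as `None`) are assumptions for illustration, not the study's documented coding rules.

```python
from typing import Optional


def smoking_status(ever_smoked: Optional[bool], smokes_now: Optional[bool]) -> Optional[str]:
    """Combine the two questionnaire answers into never / former / current.

    ever_smoked: answer to 'Have you ever smoked as much as one cigarette
                 a day for as long as a year?'
    smokes_now:  answer to 'Do you smoke cigarettes now?' (only asked of
                 those who answered yes to the first question)
    Returns None when the status cannot be determined (assumed handling).
    """
    if ever_smoked is None:
        return None
    if not ever_smoked:
        return "never"
    if smokes_now is None:
        return None
    return "current" if smokes_now else "former"
```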

Record linkage

Between 1999 and 2009, cohort participants were linked to hospital records using their unique NHS numbers.16 We used databases maintained by the East Norfolk Primary Care Trust (PCT), an approach with the advantage that all hospital activity for Norfolk residents was captured wherever they were treated in England and Wales. The majority (95%) of admissions were to the Norfolk and Norwich University Hospitals NHS Foundation Trust (formerly Norfolk and Norwich Hospital), the remainder being admissions to other hospitals in Norfolk and neighbouring counties, community and mental healthcare trust admissions, surgery performed in general practices and a small number of emergency admissions elsewhere in the country. The PCT changed its computer systems several times over the period of study and, although the data were collected from the same sources, different database systems such as Health Interlock, East Norfolk Core minimum data set (ENCORE) and others were used for linkage at different times.

Participants were also followed for mortality through linkage with the Office for National Statistics. Data from hospital records were available for inpatient episodes and outpatient visits. Records of inpatient data were organised with one row corresponding to one hospital episode. Typically patients would have several episodes for each admission. Dates of the start and end of each episode and the admission and discharge dates were included.

Each episode also had associated with it one or more International Classification of Disease V.10 (ICD10) diagnosis codes and one or more OPCS Classification of Interventions and Procedures V.4 (OPCS4) procedure codes. Using these data it was possible to build a fairly detailed picture of a person’s hospital stay. Outpatient data were more limited in scope, restricted to dates of a clinic visit and the specialty concerned.

Time in hospital was calculated using admission and discharge dates by summing the time between admission and discharge for each person. We used the formula one plus (discharge date minus admission date) to ensure that time in hospital for those admitted and discharged on the same day (day cases) was considered. Hospital admissions were also calculated using admission, discharge, episode start and end dates. To avoid counting immediate readmissions, where one hospital stay followed on rapidly from another, contiguous admissions were merged and counted as a single admission.
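The two steps described above, merging contiguous admissions and summing one plus (discharge date minus admission date), can be sketched as follows. The text does not define how close two stays must be to count as contiguous, so the `max_gap_days=1` threshold is a hypothetical choice, as are the function names and the decision to count days on the merged stays.

```python
from datetime import date


def merge_contiguous(stays, max_gap_days=1):
    """Merge stays whose admission falls within max_gap_days of the previous discharge.

    stays: iterable of (admission_date, discharge_date) tuples for one person.
    max_gap_days is an assumed threshold; the paper does not state one.
    """
    merged = []
    for admitted, discharged in sorted(stays):
        if merged and (admitted - merged[-1][1]).days <= max_gap_days:
            prev_admitted, prev_discharged = merged[-1]
            merged[-1] = (prev_admitted, max(prev_discharged, discharged))
        else:
            merged.append((admitted, discharged))
    return merged


def days_in_hospital(stays):
    """Sum 1 + (discharge - admission) over stays, so a day case counts as one day."""
    return sum(1 + (discharged - admitted).days for admitted, discharged in stays)


# Illustrative dates only (not study data)
stays = [
    (date(2001, 3, 1), date(2001, 3, 1)),    # day case
    (date(2001, 5, 10), date(2001, 5, 14)),
    (date(2001, 5, 14), date(2001, 5, 20)),  # follows on from the previous stay
]
admissions = merge_contiguous(stays)
n_admissions = len(admissions)
total_days = days_in_hospital(admissions)
```

Counting days on the merged stays avoids counting a shared discharge and readmission day twice; whether the original analysis did this is not stated.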

Over the 10 years of follow-up, the numbers of admissions were categorised as 0, 1, 2–3, 4–6 and ≥7. Bed days were also classified into five categories: none, day case, 1–4 nights, 5–19 nights and ≥20 nights. Three main outcomes were used: ≥7 hospital admissions; ≥20 nights of bed days; and no admissions. Each was compared with those not in that category.
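A minimal sketch of the five-level categorisation follows. The category boundaries are taken from the text; the function names are illustrative, and treating "nights" as the summed difference between discharge and admission dates (so that a day case has zero nights) is an assumption.

```python
def admission_category(n_admissions: int) -> str:
    """Map a 10-year admission count to the categories 0, 1, 2-3, 4-6 and >=7."""
    if n_admissions < 0:
        raise ValueError("admission count cannot be negative")
    if n_admissions == 0:
        return "0"
    if n_admissions == 1:
        return "1"
    if n_admissions <= 3:
        return "2-3"
    if n_admissions <= 6:
        return "4-6"
    return ">=7"


def bed_day_category(n_admissions: int, total_nights: int) -> str:
    """Map total nights in hospital to none, day case, 1-4, 5-19 and >=20 nights.

    Assumes total_nights is the sum of (discharge - admission) in days,
    so a person admitted only as a day case has zero nights.
    """
    if n_admissions == 0:
        return "none"
    if total_nights == 0:
        return "day case"
    if total_nights <= 4:
        return "1-4 nights"
    if total_nights <= 19:
        return "5-19 nights"
    return ">=20 nights"
```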

Statistical analyses

We examined the distribution of hospital admissions by baseline descriptive data. ORs for each of the main outcomes: ≥7 hospital admissions; bed days ≥20 and no hospital admissions were calculated using unmatched logistic regression with independent variables age, smoking, BMI >30, manual social class and no educational qualifications. We then created a summary risk score, defined as the sum of five baseline risk factors dichotomised as binary categories each coded one or zero. The categories, each contributing one point were male sex, manual social class, low education level (those with no qualifications), current smoker and BMI >30 kg/m². Those with scores four and five were combined into a single category as the number with score equal to five was very low.
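The summary risk score described above can be sketched as a sum of five binary indicators, with scores of four and five combined. The `Baseline` class and its field names are hypothetical; only the five components and the combined top category come from the text.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    """Baseline risk factors for one participant (field names are illustrative)."""
    male: bool
    manual_social_class: bool
    no_qualifications: bool
    current_smoker: bool
    bmi: float  # kg/m^2


def risk_score(p: Baseline) -> int:
    """Sum of five binary risk factors, each contributing one point (range 0-5)."""
    return sum([
        p.male,
        p.manual_social_class,
        p.no_qualifications,
        p.current_smoker,
        p.bmi > 30,
    ])


def risk_category(score: int) -> str:
    """Scores of four and five are combined because few participants scored five."""
    return "4-5" if score >= 4 else str(score)


# Illustrative participant only (not study data)
participant = Baseline(male=True, manual_social_class=False,
                       no_qualifications=True, current_smoker=False, bmi=31.5)
score = risk_score(participant)
```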

We used logistic regression rather than survival analysis to prevent the censoring of participants who had died, since we wished to make no distinction between non-attendance of hospital due to good health and non-attendance because of death. The numbers of missing values were: 53 BMI, 218 smoking status, 545 social class, 18 level of education. We examined mortality rates in the cohort by risk score stratified by age over three periods of follow-up time (1993–1998, 1999–2004 and 1999–2009) to explore the possibility of differential mortality, and therefore attrition of the population in the different risk groups, which might explain some of the patterns observed. In addition, to explore the possible effect of participant migration during the period under examination, a sensitivity analysis was conducted on the subset of the cohort whose postcode area was Norfolk (‘NR’) at both the start and end of the period. All analyses were performed using the R statistical language (R Foundation for Statistical Computing, Vienna, Austria; V.3.1.2 with packages knitr, Gmisc and IRanges) and Stata statistical software V.12 (Stata Corporation, College Station, Texas, USA).


For the current analyses, we excluded the 625 men and women from the baseline cohort who died before 1999 leaving 11 228 men and 13 786 women. Over a period of 10 years, between 1999 and 2009, 8300 (72.7%) male and 9879 (72.7%) female study participants were admitted to hospital. In total 92% of these admissions were to the Norfolk and Norwich Hospital. Descriptive characteristics of the cohort are shown in table 1. Table 2 shows the distribution of characteristics by hospital admission category. The proportion of study participants with no hospital admissions decreased monotonically across categories of age band, smoking status (never, former, current), six levels of social class, level of education (high, medium, low) and four categories of BMI, while the proportion in the highest categories of admission increased monotonically across the same variables. Table 3 shows similar analyses and results for increasing categories of bed day numbers.

Table 1

Descriptive characteristics of men and women in the EPIC-Norfolk cohort 1993–1997 and hospital admission 1999–2009

Table 2

Distribution of characteristics of 25 014 men and women in 1993–1997 by category of number of hospital admissions 1999–2009

Table 3

Distribution of characteristics of 25 014 men and women in 1993–1997 by category of total hospital days 1999–2009

Table 4 shows the independent relationships using logistic modelling between demographic and behavioural factors in relation to hospital admissions. High numbers of admissions and bed days were positively associated with male sex, age, manual social class, smoking and high BMI while no hospital admissions were inversely associated with these factors. The strongest risk factors for ≥7 admissions were age OR 1.75 (1.67 to 1.82) per 10-year increase, being a current cigarette smoker OR 1.53 (1.37 to 1.71) and BMI >30 kg/m² OR 1.41 (1.28 to 1.56). Age was the strongest risk factor for high bed day usage (≥20 nights), OR 2.54 (2.44 to 2.65) per 10-year increase in age. Current smoking OR 1.59 (1.44 to 1.77) and BMI >30 kg/m² OR 1.54 (1.41 to 1.68) were also important risk factors.

Table 4

Multivariable logistic regression of risk factors for no hospital admissions, ≥7 hospital admissions and ≥20 days of hospital stay from 1999 to 2009 in 25 014 men and women aged 40–79 years 1993–1997

The demographic and lifestyle factors were used to construct a risk score. Table 5 shows that an increase in the absolute rate of admissions and bed days across score categories was observed in all but the oldest age category. Conversely, the percentage not admitted to hospital over 10 years decreased over increasing risk score categories. In the participants <75 years similar increases in the absolute rates of admissions and bed days were also observed with increasing risk score apart from the highest score categories, though the gradient attenuated with increasing age.

Table 5

Absolute percentage with no hospital admissions, ≥7 hospital admissions or ≥20 hospital nights during follow-up 1999–2009 in men and women aged 40–79 years in 1993–1997

Table 6 shows mortality rates over different time periods by age group and risk score. There was a mortality gradient by increasing risk score and the gradient was steeper for the shorter follow-up time. Sensitivity analyses (see online supplementary table) based only on individuals who were at the same postcode throughout the whole duration of this study showed similar results.

Table 6

Mortality rates by risk score and age group during 1993–1998, 1999–2003 and 1999–2009


Our data report hospital usage patterns, measured either by the number of hospital admissions or by total bed days, over a 10-year follow-up period in a population of middle aged and older men and women in the UK. We observed that age, male sex, manual social class, low education level, current smoking and BMI >30 kg/m² independently predicted multiple admissions and extended time in hospital. A simple five-point risk score constructed using male sex, manual social class, no educational qualifications, current smoking and BMI >30 kg/m² estimated percentages of the cohort in the categories of admission numbers and hospital bed days in stratified age bands, with twofold to threefold differences in future hospital use between those with high and low risk scores.

More than half of women under 55 years of age with a risk score of zero would expect one or more hospital admissions over the next decade but only 5% would have 7 or more admissions or 20 or more nights in hospital. Up to the age of 75 years the number of hospital admissions one might expect increases with the risk score. For those aged 55–65 years only 13% might expect to spend 20 nights in hospital over the next 10 years but this increased to 30% for those with a risk score of four or five. Eighty-seven per cent of men and women over 75 years would expect to be admitted to hospital on one or more occasions over 10 years irrespective of their risk score.

While the trend for increasing hospital use with risk score was not consistent in the oldest age group >75 with the highest risk score, numbers in this group were not large. Possible explanations include substantial differential mortality early on in follow-up resulting in attrition as observed in table 6 so that fewer individuals were at risk of hospital admissions and bed day use over the full 10-year follow-up period.

Comparison with other studies

Most studies examining hospital usage in the UK are based on hospital data but are limited in their capacity to estimate accurately denominator populations or to assess characteristics prior to hospitalisation and how they may relate to relative or absolute risk of hospital usage prospectively.

The EPIC-Norfolk cohort was recruited from the general population resident in Norfolk and, unlike hospital-based studies, is able to compare characteristics of hospital attenders and those who did not need to use those services. The period under examination approximately coincides with administrative control by Primary Care Trusts (PCTs, 2002–2013) with hospital usage free at the point of delivery under the UK NHS.

Health service usage for study participants resident in the Norfolk area is the responsibility of the East Norfolk PCT irrespective of where in the country the usage occurred. Linkage to the PCT has the advantage of capturing episodes at any UK hospital, not just those in the area. Our study included data from several UK hospitals although the large majority were from Norfolk hospitals. We were able to estimate the probability of hospital admissions and total bed days over a 10-year period and how they varied according to a range of simple and easily measured demographic and behavioural characteristics generally available in general practice.

A limitation of our study is the lack of information about non-NHS hospitals and clinics where study participants paid for treatment. This would include common cosmetic procedures such as the removal of varicose veins and other procedures offered as a private service that may be restricted or not available on the NHS. It is possible that some of the associations we observed between higher social class and lower hospital usage are explained by private treatment. However, most serious long-term conditions are treated in NHS hospitals. The differences by sex and BMI we observed were independent of social class and education. It is also possible that individuals may have differentially moved away during follow-up. However, the sensitivity analyses (see online supplementary table) based only on those individuals living in the same postcode area showed essentially similar results. We have not attempted to examine the reasons for admission and restricted ourselves to the number of occasions when hospital services were used. The most common reasons for admission were related to diseases of the circulatory system (essential hypertension and chronic ischaemic heart disease being the most common) and diseases of the digestive system (the most common being gastritis, diaphragmatic hernia and diverticular disease). We have also not looked at the survival of those who did or did not use hospital services. Future exploration of these areas will help give us a clearer and more detailed understanding.

While it is not possible to infer causal links between the lifestyle factors and hospital admissions, differences in social class and education may reflect real differences in health status, need or demand. Alternatively, thresholds for admission may vary.

In this study, we have identified a range of simple demographic and behavioural indicators that are related to the future probability of cumulative hospital admissions and bed days. The strongest of these are increasing age and male sex. However, the modifiable factors we examined are all strongly associated with hospital usage. Current cigarette smokers had 59% higher odds of 20 or more nights in hospital, while those with BMI >30 kg/m² had 54% higher odds, indicating an important role of potentially modifiable factors for hospital usage. These and the other simple indicators we have examined are easy to collect and may assist healthcare providers and those planning services to predict future hospital use.
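The 59% and 54% figures derive from odds ratios (1.59 and 1.54), and an odds ratio matches a relative change in probability only when the outcome is rare. The sketch below shows how an odds ratio translates into a probability for a given baseline; the function name and the baseline probability of 0.20 are hypothetical values chosen for illustration and are not estimates from the study.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: 20% of a reference group with >=20 nights in hospital
baseline = 0.20
smoker_prob = apply_odds_ratio(baseline, 1.59)   # OR for current smoking from the text
relative_risk = smoker_prob / baseline           # smaller than the odds ratio itself
```

With a baseline this common, the implied relative increase in probability is noticeably smaller than 59%, which is why the odds-based wording is the safer reading of these results.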


The authors thank all study participants, general practitioners and the EPIC-Norfolk study team for their contribution.




  • Contributors RL managed the data collection, linked cohort data to HES records, cleaned and analysed the data, and drafted and revised the paper. He is guarantor. KTK conceived the research, analysed the data, and drafted and revised the paper. SH coordinated the research and revised the draft paper. NW conceived the research and revised the draft paper.

  • Funding The design and conduct of the EPIC-Norfolk study and collection and management of the data was supported by programme grants from the Medical Research Council UK (G9502233, G0401527) and Cancer Research UK (C864/A8257, C864/A2883). The sponsors had no role in any of the following: study design, data collection, data analysis, interpretation of data, writing of the article, decision to submit it for publication. All authors are independent from funders and sponsors and had access to all the data.

  • Competing interests None declared.

  • Ethics approval The study has ethics committee approval from Norfolk Ethics Committee (Rec Ref: 98CN01) and all participants gave informed signed consent for the examination of medical records.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available.
