
This article has a correction. Please see:

Does spatial proximity drive norovirus transmission during outbreaks in hospitals?
  1. John P Harris1,2,
  2. Ben A Lopman1,
  3. Ben S Cooper3,4,
  4. Sarah J O'Brien4
  1. 1Gastrointestinal Emerging and Zoonotic Diseases Department, Health Protection Agency, London, England
  2. 2University of Liverpool Institute of Infection and Global Health and National Consortium for Zoonosis Research
  3. 3Department of Clinical Medicine, Centre for Clinical Vaccinology and Tropical Medicine, Nuffield University of Oxford, Oxford, UK
  4. 4Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
  1. Correspondence to John P Harris; john.harris{at}


Objective To assess the role of spatial proximity, defined as patients sharing bays, in the spread of norovirus during outbreaks in hospitals.

Design Enhanced surveillance of norovirus outbreaks between November 2009 and November 2011.

Methods Data were gathered during 149 outbreaks of norovirus in hospital wards from five hospitals in two major cities in England serving a population of two million. We used the time between the first two cases of each outbreak to estimate the serial interval for norovirus in this setting. This distribution and dates of illness onset were used to calculate epidemic trees for each outbreak. We then used a permutation test to assess whether proximity, for all outbreaks, was more extreme than would be expected by chance under the null hypothesis that proximity was not associated with transmission risk.

Results 65 outbreaks contained complete data on both onset dates and ward position. We estimated the serial interval to be 1.86 days (95% CI 1.6 to 2.2 days), and with this value found strong evidence to reject the null hypothesis that proximity was not associated with transmission risk (p<0.001). Sensitivity analysis using different values of the serial interval showed that there was evidence to reject the null hypothesis provided the assumed serial interval was 2.5 days or less.

Conclusions Our results provide evidence that patients occupying the same bay as patients with symptomatic norovirus infection are at an increased risk of becoming infected by these patients compared with patients elsewhere in the same ward.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Article summary

Article focus

  • Published literature on norovirus outbreaks does not provide clear evidence of the effectiveness of infection control measures.

  • Improved understanding of how norovirus spreads in closed environments could lead to better infection control procedures.

  • This study uses statistical modelling methods to assess whether patients in proximity are at increased risk of contracting norovirus during outbreaks in hospitals.

Key messages

  • We have shown a clear role of spatial proximity in the transmission of norovirus in hospital outbreaks.

  • Patients who are in the same bay as patients who become ill have a higher probability of becoming ill compared with patients in a different bay.

  • Increasing barriers to movement between bays by closing affected bays promptly would be effective in preventing further spread.

Strengths and limitations of this study

  • Provides an estimation of serial interval, and assessment of significance of patient proximity in spreading norovirus within hospitals.

  • Different modelling approaches showed consistent results.

  • A weakness is that, although data collection was standardised, it is often difficult to assess the accuracy of the information on patients’ positions on a ward.


Introduction

Norovirus is the commonest cause of gastrointestinal infection worldwide.1 Between two and three million cases occur each year in the UK.2,3 Norovirus commonly presents as outbreaks of diarrhoea and vomiting, which are frequently reported in hospitals, care homes, schools and cruise ships.4,5 Outbreaks in hospitals are disruptive, often leading to ward or bay closures, staff sickness and cancelled operations.6 The cost of nosocomial outbreaks of norovirus to the National Health Service (NHS) in England was estimated at £115 million in 2002/2003.6 More recently, the cost in one region in Scotland was estimated at £1.2 million in the two norovirus seasons from 2007 to 2008.7

Understanding the benefits of infection control measures is challenging, because they are usually instigated as a package with several measures being implemented during an outbreak. While these interventions are based on sound infection control principles, evaluating their efficacy in trials is difficult and the published literature on norovirus outbreaks does not provide clear evidence of the effectiveness of infection control measures.5 In observational studies early ward closure has been shown to shorten the mean duration of outbreaks.6,7 There is also evidence that vomiting and the resultant aerosols are important in transmitting the infection. People exposed to vomiting events, either by being close to the person who initially vomited, or by occupying the same area sometime after the initial event, have a higher infection risk.8–11 However, these analyses are based on single outbreaks or events that led to subsequent disease. Improved understanding of how norovirus spreads in closed environments could lead to better infection control procedures.

The aim of our study was to assess how spatial proximity to a norovirus case is associated with risk of acquiring symptomatic norovirus gastroenteritis. Our hypothesis was that patients sharing bays (small self-contained areas within wards) with patients with symptomatic norovirus infection were more likely to become infected compared with those who were in another bay or part of the affected ward.

Materials and methods


We carried out enhanced surveillance of norovirus outbreaks from all in-patient wards in five tertiary care hospitals serving two cities in England, with a combined catchment of approximately two million people.

Surveillance data

We collected data during outbreaks from individual patients on date of onset of illness, symptoms (diarrhoea and/or vomiting), last date of illness for each patient, location on the ward at the time of the patients’ symptoms onset (recorded as bed number and bay number) and also the ward type.

For two hospitals, information was recorded from November 2009 to November 2011 on specially designed forms that were completed by infection control staff and returned to the Health Protection Agency. Each month, we contacted the infection control lead at these hospitals to ask about suspected or laboratory-confirmed norovirus outbreaks and, if any had occurred, asked for the forms to be completed and returned. In the three other hospitals, the data were downloaded from a database on which infection control specialists had recorded these data items during outbreaks of norovirus occurring from the 2007/2008 season onwards. Data from these three hospitals were downloaded during several visits to these hospitals. Overall, data on outbreaks of norovirus were available from November 2007 to November 2011.

Patient location during outbreaks

We obtained ward plans for two of the five hospitals, which assisted in locating patients in the ward if only part of the information on patient location was recorded in the outbreak reports.


Outbreaks were defined as two or more cases of diarrhoea and/or vomiting of infectious origin in a ward, occurring within 2 days of the first case and suspected or confirmed to be due to norovirus. All the hospitals in this study used PCR for detection of norovirus in stool samples.

A bay is a small self-contained area within a ward. Usually bays contain between two and eight beds. Bays are not the same as individual single bed occupied rooms. Proximity was defined as patients who share a bay.

Analytical framework

The analysis is based on a probabilistic reconstruction of chains of transmission (trees) based on the dates of illness onset for patients affected in outbreaks. It makes use of methods developed for Severe Acute Respiratory Syndrome (SARS) transmission and later applied to norovirus.12–15 If we knew with certainty who acquired infection from whom it would be straightforward to quantify the role of proximity in norovirus outbreaks, for example, by using regression analysis. However, in practice, transmission events are unobserved, so instead we consider all possible infection trees consistent with the data. We used a previously described approach to calculate the probability, πij, that patient i was infected by patient j for each pair of infected patients in each outbreak, based on onset times and the serial interval distribution (the serial interval is the time from onset of symptoms in an infector, case j, to onset of symptoms in the case they infect, case i), without using proximity data. The serial interval distribution tells us the probability of durations of 0, 1, 2… days between onset in a case and onset in secondary cases infected by this case. Given multiple possible sources for a case, we can use knowledge of this distribution to tell us how likely each is to be the true source. Full technical details are described in Wallinga and Teunis.12
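As an illustration of this step, the following Python sketch computes the matrix of πij values by normalising serial interval probabilities over all candidate sources of each case. It is not the authors' programme (the analysis was done in R); the function name, the toy onset days and the serial interval probabilities are hypothetical, and the toy distribution starts at a lag of 1 day so that cases with the same onset day cannot infect each other.

```python
def infector_probabilities(onset_days, serial_pmf):
    """Return pi, where pi[i][j] is the probability that case i was infected by case j.

    onset_days: onset day of each case (integers).
    serial_pmf: dict mapping a lag in days to its serial interval probability.
    """
    n = len(onset_days)
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Relative likelihood of each other case j being the source of case i,
        # given the lag between their onset days.
        weights = [
            0.0 if j == i else serial_pmf.get(onset_days[i] - onset_days[j], 0.0)
            for j in range(n)
        ]
        total = sum(weights)
        if total > 0:  # a case with no plausible source keeps an all-zero row
            pi[i] = [w / total for w in weights]
    return pi


# Hypothetical inputs, for illustration only.
serial_pmf = {1: 0.5, 2: 0.3, 3: 0.2}
onset_days = [0, 1, 1, 3]
pi = infector_probabilities(onset_days, serial_pmf)

for row in pi:
    print([round(p, 3) for p in row])
```

In this toy outbreak the earliest case has no candidate source, the two cases with onset on day 1 can only have been infected by the first case, and the last case has three candidate sources whose probabilities sum to 1.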

We then used the matrix of πij values to simulate 1000 possible infection trees for each outbreak, by assigning the infector of patient i to be patient j with probability πij.

In these simulations, we assumed that the case with the earliest onset time was the index case and had no infectors on the ward. If more than one patient had the earliest onset date in a given outbreak, we selected the index case from these patients with equal probability in each simulation. For each outbreak k, we used these 1000 simulations to produce a proximity metric, Pk, defined as

$$P_k = \frac{1}{1000}\sum_{l=1}^{1000}\sum_{i}\sum_{j} s_{ijkl}\,p_{ijk}$$

where sijkl is equal to 1 if patient i was infected by patient j in simulation l of outbreak k and is zero otherwise. The pijk terms measure proximity between patients i and j in outbreak k. In this application, we consider this to be a binary variable equal to 1 if patients i and j occupied the same bay at the time of first symptom onset of these patients, and zero otherwise. An overall proximity metric, P, is obtained by summing the Pk values over all outbreaks. The value of P (and of Pk for individual outbreaks) should be interpreted as a measure of how much transmission occurs between patients in the same bay.
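The simulation of infection trees and the calculation of Pk for one outbreak can be sketched as follows. This is a minimal Python illustration, not the authors' R programme: the πij matrix and bay labels are made up, the metric is averaged over simulations (an assumption of this sketch), and the random choice among several equally early index cases is omitted for brevity.

```python
import random


def sample_tree(pi, rng):
    """Draw one infection tree: entry i is the sampled infector of case i, or None."""
    tree = []
    for row in pi:
        if sum(row) == 0:
            tree.append(None)  # index case: no infector on the ward
        else:
            tree.append(rng.choices(range(len(row)), weights=row)[0])
    return tree


def proximity_metric(pi, bays, n_sims=1000, seed=0):
    """Average number of sampled transmission links joining two cases in the same bay."""
    rng = random.Random(seed)
    same_bay_links = 0
    for _ in range(n_sims):
        tree = sample_tree(pi, rng)
        same_bay_links += sum(
            1 for i, j in enumerate(tree) if j is not None and bays[i] == bays[j]
        )
    return same_bay_links / n_sims


# Hypothetical infector probabilities and bay labels for a four-case outbreak.
pi = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.375, 0.375, 0.0],
]
bays = ["A", "A", "B", "B"]
p_k = proximity_metric(pi, bays)
print(p_k)
```

Here one link (case 1 infected by case 0) is always within bay A, and the last case is infected by its bay-mate with probability 0.375, so the metric should be close to 1.375.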

If people in the same bay pose a greater risk of infecting each other, this will tend to lead to larger values of the proximity metrics, P and Pk. We compared the observed metric with the distribution of values expected if proximity was not associated with transmission. This distribution was derived by performing random permutations of the bays of the patients in each outbreak and calculating Pk as above for each outbreak. These values were again summed to give an overall proximity metric, S, when proximity was by assumption not an important factor. We repeated this for 1000 random permutations of the patient bays to obtain 1000 sampled proximity metrics. These 1000 S values therefore represent a sample from the distribution of proximity metrics that would be expected if proximity played no role in spreading the disease during the outbreak. By comparing the 1000 sampled values of S with the observed value P we can evaluate whether transmission is more (or less) likely to occur between patients in close proximity. If proximity is unimportant, the observed value of P would be unlikely to be in the tails of the distribution of S. If proximity leads to increased transmission, the observed value of P would tend to be greater than most of the sampled S values. If proximity leads to decreased transmission (which could occur, for example, as a result of enhanced hygiene measures), the observed value of P is likely to be smaller than most of the sampled S values. Formally, we can perform a two-sided hypothesis test with a null hypothesis that proximity is not important, where the p value is given by the proportion of sampled S values that are more extreme than the observed value, P.
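The permutation test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: to keep it short and deterministic, the metric for each outbreak is computed as the expected number of same-bay links (the sum of πij over same-bay pairs) rather than by simulating 1000 trees, and the two-sided p value is taken as twice the smaller tail proportion, which is one common convention that the text does not specify. The outbreak data are hypothetical.

```python
import random


def expected_same_bay_links(pi, bays):
    """Expected number of transmission links between cases sharing a bay."""
    n = len(bays)
    return sum(pi[i][j] for i in range(n) for j in range(n) if bays[i] == bays[j])


def permutation_test(outbreaks, n_perm=1000, seed=0):
    """Compare the observed overall metric P with metrics S from permuted bays."""
    rng = random.Random(seed)
    observed = sum(expected_same_bay_links(pi, bays) for pi, bays in outbreaks)
    null_values = []
    for _ in range(n_perm):
        total = 0.0
        for pi, bays in outbreaks:
            shuffled = bays[:]
            rng.shuffle(shuffled)  # permute bay labels within the outbreak
            total += expected_same_bay_links(pi, shuffled)
        null_values.append(total)
    upper = sum(s >= observed for s in null_values) / n_perm
    lower = sum(s <= observed for s in null_values) / n_perm
    p_value = min(1.0, 2 * min(upper, lower))  # assumed two-sided convention
    return observed, null_values, p_value


def chain_pi(n):
    """Hypothetical outbreak in which each case was infected by the previous case."""
    pi = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        pi[i][i - 1] = 1.0
    return pi


# Three hypothetical six-case outbreaks, each spreading within bay A and then bay B.
outbreaks = [(chain_pi(6), ["A", "A", "A", "B", "B", "B"]) for _ in range(3)]
observed, null_values, p_value = permutation_test(outbreaks)
print(observed, max(null_values), p_value)
```

Because the toy outbreaks were constructed so that transmission follows bay membership, the observed metric sits at the upper limit of the permuted values and the p value is small.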

Serial intervals

A key input for constructing the transmission trees is the serial interval. Our best estimate came from the observed distribution of the difference between first and second onset dates for each outbreak, and our primary analysis made use of this empirical serial interval distribution. Often, more than one patient was ill on the first day of the outbreak, so we used the first date of illness onset in the next patient(s) for calculating the serial interval. This gave a mean serial interval of 1.86 days (median 1 day, 95% CI 1.6 to 2.2 days, obtained by bootstrapping).
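A percentile bootstrap of the mean serial interval can be sketched as below. The text states only that the CI was obtained by bootstrapping, so the percentile method, the number of resamples and the interval data in this Python example are assumptions for illustration.

```python
import random
from statistics import mean


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Return the sample mean and a percentile bootstrap CI for the mean."""
    rng = random.Random(seed)
    # Resample the intervals with replacement and record the mean each time.
    boot_means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(values), lower, upper


# Hypothetical days between first and second onset dates, one value per outbreak.
intervals = [1, 1, 1, 2, 1, 3, 2, 1, 4, 1, 2, 1, 5, 1, 2]
estimate, ci_lower, ci_upper = bootstrap_mean_ci(intervals)
print(round(estimate, 2), round(ci_lower, 2), round(ci_upper, 2))
```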

We also performed sensitivity analyses using different assumptions. First, we used an estimate from a study of a community outbreak of norovirus at a scouting jamboree, giving a mean serial interval of 3.6 days, with an assumed γ distribution.13 We then considered γ distributions with the same variance (4.1) but with the mean serial intervals varying between 0.5 and 4 days in half-day increments. Data were analysed using R.16
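For a γ distribution with a given mean and variance, the shape is the squared mean divided by the variance and the scale is the variance divided by the mean. The short Python sketch below applies this standard moment-matching relation to a few illustrative means with the variance fixed at 4.1; the function name and the particular means shown are chosen for illustration and this is not the authors' R code.

```python
def gamma_shape_scale(mean, variance):
    """Moment-matched gamma parameters: mean = shape * scale, variance = shape * scale**2."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale


VARIANCE = 4.1  # variance held fixed across the sensitivity analysis

# A few illustrative mean serial intervals (days), including the jamboree estimate.
for mean_days in (0.5, 2.0, 3.6):
    shape, scale = gamma_shape_scale(mean_days, VARIANCE)
    print(mean_days, round(shape, 3), round(scale, 3))
```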


Results

Data were collected from 149 outbreaks in five hospitals between November 2007 and November 2011. These outbreaks affected 1694 patients and 456 staff. The average duration of the outbreaks, determined as the first date of onset to the last date of onset, was 8.9 days (median 8; range 1–40 days). Outbreaks affected an average of 11.4 patients (median 11; range 1–30) and an average of 3.2 staff (median 2; range 0–20). Data from these outbreaks gave a mean serial interval of 1.86 days. Figure 1 shows the distribution of serial intervals from the observed data and the nearest fitting γ distribution.

Figure 1

Serial interval distribution derived from onset dates of illness from observed outbreaks.

The spatial modelling analysis used data from 65 outbreaks where all data (for both onset dates and position in ward when taken ill) were complete. The characteristics of these 65 outbreaks were similar to those of the full dataset. The corresponding figures for these outbreaks were: average length of outbreak 9.5 days (median 8 days), average number of patients 11.9 (median 11) and average number of affected staff 3.4 (median 2). The outbreaks affected various ward types, with most occurring in general medical wards (34%) and care of the elderly wards (28%). Other specialties were respiratory medicine (12%), stroke/neurology wards (11%), coronary care wards (9%) and orthopaedic/trauma wards (6%).

Proximity analysis

Figure 2 shows the observed proximity metric and the distribution of proximity metrics obtained under the assumption that proximity was not associated with transmission (from the simulated permutations). This shows how the observed proximity metric relates to the distribution of proximity metrics expected if proximity was not important. The dashed line indicates the observed proximity metric (P) and the bars indicate the distribution of proximity metrics from the simulated permutations. For the model using the serial interval taken from the observed onset dates, the observed metric is outside the range of the simulated distribution and is highly statistically significant (P=153.34, p<0.001). With serial intervals of 2 days or less, the observed metric is either outside or at the extreme right of the simulated proximity metrics, and the p values ranged from <0.001 for serial intervals of 0.5 days to 0.01 at a serial interval of 2 days. If we increase the assumed serial interval, the proximity metric moves to within the range expected from the simulated values, and at 3 days the p value was 0.2 (figure 2). Using the γ probability distribution derived from a community outbreak by Heijne et al13 (mean serial interval 3.6 days), the observed proximity metric fell within the range that would be expected if proximity were not important.

Figure 2

Observed (dashed line) and distribution of expected (bars) proximity metrics for each serial interval.

The results show that the proximity metric (P) was larger than would be expected by chance under the null hypothesis (that proximity is not important) up to a serial interval of 2.5 days (p=0.05).


Discussion

We have detected a strong association: patients who are in the same bay as patients who become ill have a higher probability of themselves becoming ill than patients in a different bay. In other words, transmission of norovirus infections is more likely to occur among patients sharing a bay than among patients in different bays. While this might at first seem an obvious finding, there are competing theories about the transmission of the virus in complex healthcare settings. For example, transmission might occur through staff transferring virus on their hands or patients touching infected surfaces with their hands when moving around the wards or the hospital. The strength of our conclusion is sensitive to the assumed serial interval distribution. We used values derived from the dates of onset of illness in patients during outbreaks on hospital wards. We also performed sensitivity analysis using a serial interval distribution derived from a study of norovirus in children.13 However, because the degree to which this generalises to a hospital setting is unclear (intuitively the high contact rates in hospitals would be expected to lead to shorter serial intervals),17 we explored serial intervals from 0.5 to 4 days, while constraining the variance. Our results show that for serial intervals of less than 2 days the observed effect of proximity (sharing a bay with someone else who was ill) is highly significant (p<0.001), and for serial intervals up to 2.5 days it remained significant at the 5% level. This pattern was similar whether using the observed serial interval distribution from the outbreak data or using a parametric probability distribution.

Our study has some limitations. Although data collection was standardised, it is often difficult to assess the accuracy of the recorded date and place at which patients became ill. Specifically, accurate information on patients’ positions on a ward was available for 44% of the outbreaks, so the spatial analysis was undertaken on 65 outbreaks. In addition to the sensitivity analysis, we also analysed the data by including outbreaks where onset dates of illness were complete but data on patient location were incomplete (where data on position were missing for fewer than 10% of patients; 85 outbreaks). First, we dealt with missing values by allocating a completely separate bay for patients with missing data on location at time of onset. This approach is conservative in that it would underestimate the impact of proximity. Second, we removed the patients from the outbreaks if positional information was missing. Despite this limitation, the additional models indicated that the results are robust to different assumptions about missing data, as evidenced by the slightly higher probabilities obtained when using records with complete information only (see online supplementary table S1 and figure S1). As a check to demonstrate that the results were not an artefact of the statistical methods, we also ran the models on data where patient position was randomly assigned. This showed no pattern and the proximity measures were not significant for any of these models. In our analysis the estimation of Pk depends on outbreak size. However, we are not interested in the absolute values of P, only in how the value of P calculated with real proximity data compares with the value calculated with randomly generated proximity data (based on a permutation of the bay identities), which will be affected in the same way by outbreak sizes. We also performed a sensitivity analysis, normalising Pk by dividing it by the number of branches in the transmission tree for each outbreak. This gives equal weight to each outbreak and allows Pk to be interpreted as the probability that two linked cases were in the same bay. This did not change the results of the analysis; the P metric still fell well outside the range one would expect from the random simulations (p=0.004, data not shown).

We used more than one approach to modelling the infection trees because of the lack of data on serial interval in norovirus outbreaks. Heijne et al's method used data from child siblings at home. This was a useful starting point but is unlikely to be applicable to transmission in a hospital setting. Therefore, we derived γ distributions for serial intervals from 0.5 to 4 days. The average incubation period for norovirus is considered to be between 24 and 48 h.1,4 In our analysis, serial intervals of up to 2.5 days are likely to be more appropriate for a hospital setting than the estimate from Heijne et al.

Molecular analysis of stool samples could more definitively link outbreaks, which can help to reveal transmission networks.18 ,19 For example, in this study we have assumed that each ward outbreak was distinct, that is, all cases within a ward were part of a chain of transmission, but this may not necessarily be true. It is possible for multiple introductions to occur, and some outbreaks may have spread from one ward to another. Genetic characterisation of samples from each ward during possible multiple outbreaks of norovirus would shed light on transmission events and lead to further insight about the direction of transmission, including the possibility that the virus can be moved around the hospital.

Our study focused on patients rather than staff. Our hypothesis was that symptomatic patients who vomit are most likely to contaminate the area close to them and other patients in their vicinity. Obtaining data on staff movements is much more complicated and would only really be practical in a detailed prospective study.

The importance of spatial proximity in propagating transmission is consistent with other recent studies.15 ,20 One study which used similar methods to calculate the infection trees15 suggests that symptomatic individuals are likely to be the drivers of outbreaks of norovirus in hospital settings. Furthermore, the effective reproductive number was significantly higher for symptomatic patients compared with that for symptomatic staff. Norovirus transmission between people in close contact during sport, both within and between teams, has also been shown to occur21 as well as airborne transmission through explosive vomiting.22 One study demonstrated that successive staff working on an aircraft in which a member of the public had vomited also became sick.11

Norovirus has a low infectious dose,4,23,24 and virus is shed during episodes of vomiting, when it can become aerosolised and expose others in the vicinity. Therefore, closing the bay quickly, preventing movement to and from that bay and immediately paying attention to cleaning areas near initial vomiting events are likely to be effective in preventing further spread. The index of suspicion for patients who become ill should be high, and implementing infection control interventions should not be delayed until the results of sampling are received, because this would increase morbidity and prolong the outbreak. New guidelines on controlling outbreaks of norovirus in hospitals and care homes recently released in the UK25 move away from the need to close wards and operate on a ‘manage within bays’ principle. Our study has shown that patients in proximity to symptomatic patients are at increased risk of becoming infected by these patients.


Conclusions

We have shown a clear role of spatial proximity in the transmission of norovirus in hospital outbreaks. Increasing barriers to movement between bays by closing affected bays promptly would be effective in preventing further spread.


We would like to thank the infection control staff at the NHS hospitals for providing the data used in this study.




  • Contributors JH, SO and BL designed the study and analysis was conducted by JH, BL and BC. BC and JH wrote the programming algorithm for statistical analysis. All authors were involved in interpretation of the data and drafting the article. All authors have read and approved the final manuscript.

  • Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement There are no unpublished data. The programme for the statistical analysis is available in the online supplementary material.
