

A controlled, before-and-after trial of an urban sanitation intervention to reduce enteric infections in children: research protocol for the Maputo Sanitation (MapSan) study, Mozambique
  1. Joe Brown1,
  2. Oliver Cumming2,
  3. Jamie Bartram3,
  4. Sandy Cairncross2,
  5. Jeroen Ensink2,
  6. David Holcomb3,
  7. Jackie Knee1,
  8. Peter Kolsky3,
  9. Kaida Liang3,
  10. Song Liang4,
  11. Rassul Nala5,
  12. Guy Norman6,
  13. Richard Rheingans4,
  14. Jill Stewart3,
  15. Olimpio Zavale7,
  16. Valentina Zuin8,
  17. Wolf-Peter Schmidt2
  1. 1School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
  2. 2Department of Disease Control, London School of Hygiene and Tropical Medicine, London, UK
  3. 3Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina—Chapel Hill, Chapel Hill, North Carolina, USA
  4. 4Department of Environmental and Global Health, University of Florida, Gainesville, Florida, USA
  5. 5Ministry of Health, Republic of Mozambique, Maputo, Mozambique
  6. 6Water and Sanitation for the Urban Poor, London, UK
  7. 7Health Research for Development, Maputo, Mozambique
  8. 8Emmett Interdisciplinary Program in Environment and Resources, Stanford University, Palo Alto, California, USA
  1. Correspondence to Dr Joe Brown; joe.brown{at}


Introduction Access to safe sanitation in low-income, informal settlements of Sub-Saharan Africa has not significantly improved since 1990. The combination of a high faecal-related disease burden and inadequate infrastructure suggests that investment in expanding sanitation access in densely populated urban slums can yield important public health gains. No rigorous, controlled intervention studies have evaluated the health effects of decentralised (non-sewerage) sanitation in an informal urban setting, despite the role that such technologies will likely play in scaling up access.

Methods and analysis We have designed a controlled, before-and-after (CBA) trial to estimate the health impacts of an urban sanitation intervention in informal neighbourhoods of Maputo, Mozambique, including an assessment of whether exposures and health outcomes vary by localised population density. The intervention consists of private pour-flush latrines (to septic tank) shared by multiple households in compounds or household clusters. We will measure objective health outcomes in approximately 760 children (380 children with household access to interventions, 380 matched controls using existing shared private latrines in poor sanitary conditions), at 2 time points: immediately before the intervention and at follow-up after 12 months. The primary outcome is combined prevalence of selected enteric infections among children under 5 years of age. Secondary outcome measures include soil-transmitted helminth (STH) reinfection in children following baseline deworming and prevalence of reported diarrhoeal disease. We will use exposure assessment, faecal source tracking, and microbial transmission modelling to examine whether and how routes of exposure for diarrhoeagenic pathogens and STHs change following introduction of effective sanitation.

Ethics Study protocols have been reviewed and approved by human subjects review boards at the London School of Hygiene and Tropical Medicine, the Georgia Institute of Technology, the University of North Carolina at Chapel Hill, and the Ministry of Health, Republic of Mozambique.

Trial registration number NCT02362932.


Strengths and limitations of this study

  • The first controlled health impact trial of an urban decentralised (non-sewerage) sanitation intervention.

  • Includes a matched control group.

  • The first sanitation health impact trial of shared sanitation.

  • The first sanitation health impact trial that uses a direct measure of enteric infections as a primary outcome measure.

  • The first sanitation health impact trial that includes a focus on localised population density.

  • As a controlled before-and-after (CBA) study, the risk of residual confounding of the effect of shared sanitation on enteric infections in children cannot be eliminated, particularly for confounding variables that change over time.

  • Interventions are not randomly allocated; the control group is selected on the basis of intervention criteria as articulated by the implementer.

  • Limited sample size prevents stratified analysis of some important variables, including shared latrine type and ages of children; the cohort now includes children from 29 days to 48 months of age at baseline, and risks of enteric infections may be very different if child age is more narrowly defined.

  • Limited follow-up (12 months) makes it unlikely that any effect on child growth or other later impacts will be discernable; we are collecting anthropometric data to provide a baseline for later studies in this cohort that may reveal these effects if present.


Africa is predominantly rural but rapidly urbanising. By 2050, Africa is projected to be 56% urban.1 As urban infrastructure may not expand to serve the needs of new urban residents migrating from rural areas, informal, unplanned settlements are likely to persist in urban and periurban areas in the coming decades.2 These areas are characterised, in part, by a lack of basic services, overcrowding and high population density, substandard housing, unhealthy living conditions, insecure property tenure, lack of security, and poverty.3 ,4 Current estimates are that over 65% of urban residents in Sub-Saharan Africa (SSA) reside in such communities, but in some cities the proportion can reach as high as 85%.

Approximately 2.5 billion people lack access to basic ‘improved’ sanitation, with an estimated 756 million in urban areas,5 though this is likely to be an underestimate as slums are not always included in surveys. Despite progress in overall urban sanitation access coverage and equity as indicated by WHO/UNICEF Joint Monitoring Programme (JMP) metrics, residents of urbanising, unplanned communities in the large cities of SSA experience persistently elevated disease risks associated with poor sanitation infrastructure that does not reliably sequester excreta to prevent human exposure.6 ,7 Although open defaecation and ‘unimproved’ sanitation are less prevalent in urban settings in SSA than in rural settings, these categories apply to one-third of the urban population, and this proportion has not changed significantly in the past two decades.5 Although the proportion of the population without adequate sanitation may be lower in urban areas than in rural areas, the public health risks of unsafe excreta disposal can be much greater within a dense urban population compared with a low-density rural population where open defaecation occurs largely beyond the boundaries of human habitation. In terms of volume of excreta produced and probability of exposure, dense urban environments represent critical settings for targeted sanitation improvements.8 ,9

Unsafe water, sanitation and hygiene (WASH) conditions enable the transmission of enteric pathogens from infected individuals to new susceptible hosts through direct contact and/or via the environment. WASH-related enteric infections result in diarrhoeal diseases and may contribute to chronic inflammation of the gut,10 ,11 leading to reduced absorption of nutrients and malnutrition,12 ,13 environmental enteric dysfunction (EED),14 growth faltering and cognitive deficits,15–17 stunting,18 and death.19 These risks are borne predominantly by children. Although imprecisely known and context-specific, the burden of disease from poor sanitation specifically is likely to be high in many settings, since sanitation serves as a primary barrier against faecal contamination of the environment20 and is associated with a range of outcomes.21

We are undertaking a controlled before-and-after (CBA) study of an urban sanitation intervention in Maputo, Mozambique, to address two primary questions as summarised below.

Question 1: Can urban, onsite, shared sanitation reduce risk of enteric infections in children?

This CBA trial is intended to estimate the effect of decentralised urban sanitation on the combined prevalence of enteric infections in children. The study aims to contribute to the evidence base for sanitation generally, but will make key contributions in three specific areas: urban sanitation, shared sanitation, and the reduction of enteric infections associated with sanitation improvements.

Urban, onsite sanitation

Recent and ongoing22 sanitation health-impact evaluations have focused on rural settings, partly because the majority of people lacking access to sanitation live in rural areas.5 ,23 Urban sanitation represents an important gap in the evidence base for sanitation health impacts generally, with few trials of sewerage,24 and no previous trials of urban, decentralised (onsite, non-reticulated) sanitation, despite the fact that decentralised solutions may play a critical role in the expansion of sanitation in informal settlements and rapidly urbanising areas where sewerage remains unaffordable or impractical25 due to insufficient sector financing26 and other constraints. Onsite systems (eg, pit latrines, septic tanks) are the most common type of sanitation in cities of low and middle income countries,6 yet receive little attention from policymakers and the wider sector in urban settings.25 We hypothesise that onsite systems that reliably sequester excreta from living environments will result in lower risk of enteric infections in children living in households served by these facilities.

Shared sanitation

Shared sanitation has not previously been studied in a prospective, controlled trial, despite its prevalence globally5 and especially in urban SSA.27 There is some evidence that shared sanitation may be a risk factor for diarrhoea in children,28 ,29 though shared latrines (SLs) may or may not be less hygienic than individual household latrines.7 ,30 ,31 In this study, we compare new, hygienic, shared, private latrines with existing shared, private latrines serving household clusters or compounds in an urban slum environment. We hypothesise that when excreta is effectively contained, residents of households sharing the same space will experience reduced exposure to enteric pathogens, and therefore, children living in the immediate environment will be at reduced risk of enteric infections and disease.

Enteric infections

WASH-related diarrhoeal diseases cause an estimated 842 000 deaths/year.32 Recent meta-analyses have concluded that sanitation improvements reduce the morbidity burden significantly: pooled estimates of relative risks range from 0.72 (95% CI 0.59 to 0.88, ‘unimproved’ to ‘improved’)33 to 0.70 (95% CI 0.61 to 0.79, sewerage, with estimated greater impacts when the initial conditions are very poor).24 Similar reductions in risk have been estimated for soil-transmitted helminth (STH) infection,34 ,35 which may impact over 5 billion people.36 Sanitation interventions have also been linked with reduced specific non-STH enteric infections, notably Giardia.37 ,38

Although the causal pathway from sanitation conditions to all downstream health outcomes (diarrhoea, stunting,39–41 death) includes enteric infections, multiple non-STH enteric infections have not generally been measured directly in large trials, partly due to the prohibitive cost of measuring multiple pathogens or logistical constraints. Also, aetiological work in sanitation trials has been mostly limited to case–control studies because asymptomatic shedding associated with persistent enteric infections was not thought to be as common as we now suspect it to be.42 We also know that recurrent or persistent infections, including asymptomatic infections, may be closely related to stunting, cognitive deficits, and other longer term effects of sanitation-related exposures.43–48

Question 2: Do enteric infection risks and the effects of urban sanitation vary by localised population density?

The ecology and transmission dynamics of sanitation-related infections may vary between rural, periurban and urban settings; between low-density and high-density urban areas; and across other important social, spatial and temporal gradients. The extent to which population density affects risks of sanitation-related exposures, outcomes and important covariates potentially affecting the relationship between sanitation and health is important but remains poorly characterised.

Although there is evidence that childhood mortality and overall health are improved in urban environments,49–52 on average, stark differences exist between the urban rich and the urban poor,53 with the urban poor often at higher risk of stunting and death compared with rural populations.54 In informal or unplanned settlements that lack critical infrastructure,55 urbanisation and high localised population density may be accompanied by an increased risk of person-to-person (and person–environment–person) transmission of infectious diseases56 ,57 (including enteric infections),58–62 increased exposures to chemical agents associated with urban environments, and constrained local resources (space, food, water),59 primarily impacting the poor. As water and sanitation infrastructure development often lags behind housing, some urban and periurban populations—especially those in informal/illegal settlements, slums or shantytowns—may be both more at risk of preventable diarrhoeal and other infectious diseases and less able to access infrastructure that would reduce risk.5 Using preintervention data on sanitation-related exposures and health outcomes, we will first determine whether there is an association between localised density and either sanitation-related exposures or outcomes after controlling for potential confounders.

There is some evidence that sanitation improvements can have a greater impact on enteric infections in densely populated urban informal settlements than in less densely populated communities.63 Potential health gains will also be greater where baseline risks are high, and urban slums are associated with very high burdens of disease and death,53 but the presence or extent of any effect modification remains uncharacterised. This study provides an opportunity to examine the relationship between population density, sanitation and health in urban communities where a range of localised population density exists, and to understand whether density is a potential effect modifier for enteric infection risk in this setting.

Research objectives and hypotheses

Objective 1: Measure the effect of a shared, onsite urban sanitation intervention programme on enteric infections in children. Hypothesis: the sanitation intervention reduces the risk of enteric infections among children in this setting.

Objective 2: Compare the risk of enteric infections in children in high-density versus low-density urban settlements, accounting for the influence of socioeconomic status, access to WASH, and other covariates. Hypothesis: Population density independently influences risk of enteric infections in children after adjusting for confounding factors.

Objective 3: Compare the effect size of the intervention between high-density and low-density settings. Hypothesis: The effect of the described intervention on the prevalence of enteric pathogens is modified by localised population density after adjustment for confounding factors.

Objective 4: Explore whether and how the changes in sanitation coverage influence transmission dynamics of different pathogens in high-density versus low-density settings. Hypothesis: Sanitation access and population density are key determinants for the transmission dynamics of enteric pathogens.

Methods and analysis

Overview of study design

This will be a CBA intervention study, as described in the online supplementary information. On the basis of demographic information obtained in previous studies in this setting,64 ,65 and the planned scope of the intervention (190 latrines serving an average of 20 people each or 3800 people in total), we expect the intervention group to include between 456 and 494 (12–13%) children under 5 years of age (from 29 days to 48 months of age at enrolment). We expect to enrol at least 380 children in each of the intervention and control groups at baseline and then to follow up all children after the intervention is completed, with the goal of obtaining baseline and follow-up data for a minimum of 345 children in each group.
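The expected number of eligible children follows from simple arithmetic on the figures quoted above. The short sketch below reproduces it; the variable names are illustrative only.

```python
# Planned scope of the intervention, as stated in the protocol.
latrines = 190
people_per_latrine = 20                       # average users per latrine
population = latrines * people_per_latrine    # people served in total

# Assumed share of the served population in the eligible age range (12-13%).
low_share, high_share = 0.12, 0.13
expected_children = (round(population * low_share),
                     round(population * high_share))

print(population)          # 3800
print(expected_children)   # (456, 494)
```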

CBA designs are one of a number of non-randomised options available to study the impacts of sanitation on health.66 Previous examples of their use in WASH research include an analysis of the effect of municipal drinking water treatment enhancement (filtration) on the prevalence of cryptosporidiosis among persons with AIDS in Los Angeles County,67 and an evaluation of the effects of an urban sewerage system on childhood diarrhoea in Tehran.68 The principal analytical approach is to compare the postintervention outcome measures between intervention and control groups, with the baseline values of outcome measures serving to adjust the effect size for potential baseline differences. The relative advantages and disadvantages of this study design for WASH health impact studies have been described more fully elsewhere.69 Assuming that confounding variables do not change over time, this approach enables reasonable control for confounding, as household differences affecting the postintervention outcome measures should also affect the baseline measures of the same outcomes. The study design can also be described as a non-randomised cluster-allocated trial.
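One way to write down the baseline-adjusted comparison described above is as a regression of the follow-up outcome on study arm and the baseline value of the same outcome. The notation is introduced here for illustration and is not taken from the protocol: Y^post and Y^pre are the follow-up and baseline infection status of child i in compound j, and T_j equals 1 for intervention compounds and 0 for controls. The log link, under which exp(β1) is a baseline-adjusted prevalence ratio, is an assumption.

```latex
\log \Pr\left(Y^{\mathrm{post}}_{ij} = 1\right)
  = \beta_0 + \beta_1 T_j + \beta_2 Y^{\mathrm{pre}}_{ij}
```

In such a model, further covariates would enter as additional terms, and standard errors would need to account for clustering of children within compounds.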

Study setting

Sanitation coverage and access in Maputo

An estimated 56% of the population in urban areas of Mozambique (∼4.5 million people) lacked access to basic improved sanitation facilities in 2012,5 and with about two-thirds of Mozambique's population growth between now and 2050 projected to be in urban areas,70 access to safe sanitation facilities in such settings will continue to be a critical challenge. Only an estimated 26% of the total volume of faecal waste generated in Maputo is safely managed. Approximately 89% of households use onsite disposal of waste (10% have access to sewerage; an estimated 1% practice open defaecation).6 Shared sanitation represents 8% of urban sanitation in Mozambique overall, primarily among the poorest households in unplanned communities.5 Lack of access to safe sanitation is acute in the informal settlements and periurban areas of Maputo, resulting in frequent outbreaks of cholera, widespread enteric infections,71 and elevated child mortality.72

Description of the sanitation intervention

This study is an evaluation of a sanitation project supported by the Japan Social Development Fund (JSDF). This project is led by the World Bank's Water & Sanitation Program, with Water and Sanitation for the Urban Poor (WSUP) as co-partner with primary responsibility for implementation of the onsite sanitation component and construction of shared sanitation facilities. Current estimates are that approximately 30 communal sanitation blocks (CSBs, with multiple cabins/drop-holes) and an estimated 160 SLs (with one cabin/drop-hole) will be constructed across 11 neighbourhoods that exhibit diversity across density and other characteristics (susceptibility to flooding, relative poverty, important water and sanitation infrastructure access). These interventions are shared private latrines serving a minimum of 15 people (occasionally as few as 11), with flush toilets to septic tanks (non-sewerage). CSBs also include other amenities, such as laundry and washing facilities, and serve a minimum of 21 people. There are seven SL designs in current use in this programme, with the number of cabins based on a user capacity of approximately 20 users per drop-hole and the septic tank size and effluent drainage based on the maximum number of people being served.

Eligibility and enrolment

Eligible children will belong to households either receiving the intervention or in a selected control site (see below). Eligible children will be between 29 days and 48 months of age at enrolment to ensure that children are not neonates and that all children are under 60 months at the end line survey. The age of children will be verified, where possible, through birth or immunisation records supplied by the caregiver. We will collect compound, household-level, and individual data from all consenting households with one or more eligible children. All children in the target age range from eligible households will be recruited at baseline. Shared facilities will be handed over to users beginning in January 2015, with construction of the estimated 190 latrines following progressively over the next 8 months.

WSUP site selection criteria for implementation of new shared sanitation stipulate that sites must meet the following conditions: (1) sites should be within the predefined project geographical scope; (2) residents must be currently using shared sanitation in poor condition, based on inspection by WSUP engineers; (3) sites must usually meet WSUP criteria for a minimum number of beneficiaries (15 for SLs, 21 for CSBs); (4) sites must have a legal piped water connection nearby for possible use with pour-flush latrines; (5) residents must convey stated demand for improved sanitation and have a stated interest in contributing to cost: 10% of the total cost of the CSBs or 15% of the cost of SLs, divided among the beneficiary households and spread over 12 months following the start of construction; (6) sites must have available space to implement the new facility (often replacing the space occupied by existing shared facilities); and (7) sites must be accessible for transport of materials during construction and to allow for later tank emptying. Prospective project sites are then ranked according to additional criteria, including prospective community cost recovery; cross-cutting issues (gender, disabilities, HIV); environmental conditions relevant to installation of the septic tank, including sufficiently deep water table, acceptable infiltration rate of soil for soakaway pit/drain field for septic effluent, and flooding; and users’ commitment to be responsible for operation and maintenance through a formalised management committee.

Control selection

At minimum, all control sites will meet site eligibility criteria 2–5, though for criterion 5—stated demand and willingness to contribute towards cost—we will use only stated demand for improved sanitation, presented as a hypothetical. We will progressively enrol control sites as we recruit intervention sites to balance study arms by time of enrolment, and will stratify sampling by cluster size as well. For example, the simplest, single-cabin, shared facility serves a minimum of 15 and a maximum of 20 individuals. After enrolling an intervention site of this size, we will enrol a control site based on the same population range suitable for that latrine type: existing shared sanitation serving 15–20 people. Therefore, the control group is effectively matched based on both size of the compound and time of enrolment.
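The size-matching step in the example above can be sketched as follows. This is a minimal illustration, not the study's enrolment procedure: the `Site` record, the `pick_control` helper and the candidate data are hypothetical, and only the 15–20 user range for the single-cabin latrine comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass
class Site:
    site_id: str
    users: int  # people sharing the existing latrine at this site


def pick_control(band: Tuple[int, int],
                 candidates: Iterable[Site],
                 enrolled_ids: Set[str]) -> Optional[Site]:
    """Return the first not-yet-enrolled candidate whose user count falls
    inside the population range of the intervention latrine type."""
    low, high = band
    for site in candidates:
        if site.site_id not in enrolled_ids and low <= site.users <= high:
            return site
    return None  # no suitable control available yet


# An intervention site receiving a single-cabin latrine (15-20 users)
# has just been enrolled; look for a control site of the same size band.
candidates = [Site("C1", 32), Site("C2", 18), Site("C3", 16)]
match = pick_control((15, 20), candidates, enrolled_ids=set())
print(match.site_id)  # C2
```

Calling the helper each time an intervention site is enrolled would also keep the two arms roughly balanced by time of enrolment.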

Control sites may be sites from within the 11 project bairros that fail to meet criteria 6 or 7 above, or in similar urban bairros of Maputo where shared sanitation is found, and site criteria 2–5 are met. Bairros that are projected to receive large infrastructure projects or other sanitation improvements during the proposed study period, if any, will be excluded from control selection, though none is currently projected.

Deworming


Single-dose albendazole (a broad-spectrum deworming treatment) will be offered to everyone in intervention and control households (and their entire compounds) after the baseline and immediately following installation of the functional sanitation intervention, with the exception of pregnant women and children under 12 months. Ministry of Health (MISAU) staff, working within the National Deworming Campaign (NDC), will administer albendazole in accordance with MISAU standard safety protocols and dosage guidelines. Current dosing guidelines are to administer 400 mg for all ages (except for children between 12 and 24 months, who receive 200 mg in a suspension). Deworming will occur after the baseline collection of stool samples, and as close as possible to the date of latrine handover to the compound, up to 14 days after latrines are open for use.

Measuring density

The study setting will be urban bairros of Maputo with a range of neighbourhood and point-level densities. As an exclusively urban setting, the variability at the neighbourhood level is estimated to be between 5000 people/km² for least-dense bairros up to approximately 50 000 people/km².73 Localised measures of density may be lower or higher than these estimates. We will use the following measures of localised density: (1) number of (a) total people and (b) children under 48 months at baseline within 50 m of the centre of a given household, measured in a direct line, with all members of any household touched by the line included; (2) number of (a) total people and (b) children under 48 months within 100 m of the centre of a given household, measured in a direct line, with all members of any household touched by the line included; and (3) number of people and children under 48 months within the shared space as defined by a household cluster (self-defined common area of households sharing the latrine), measured using a GPS-enabled area mapping tool. The first two methods use a rooftop-area algorithm that is calibrated using survey data and high-resolution satellite imagery to estimate numbers of people.
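The first two measures amount to counting people within a fixed radius of a household. The sketch below illustrates the idea under simplifying assumptions that are not part of the protocol: each household is reduced to a single point with known occupancy, and coordinates are in metres on a local projected grid. The protocol instead includes any household touched by the line and estimates occupancy from rooftop area.

```python
import math
from typing import List, Tuple

# (x_m, y_m, people, children_under_48_months); hypothetical record layout.
Household = Tuple[float, float, int, int]


def local_density(index: int, households: List[Household],
                  radius_m: float) -> Tuple[int, int]:
    """Count people and young children in households whose centre lies
    within radius_m of the index household (index household included)."""
    x0, y0, _, _ = households[index]
    people = children = 0
    for x, y, n_people, n_children in households:
        if math.hypot(x - x0, y - y0) <= radius_m:
            people += n_people
            children += n_children
    return people, children


households = [
    (0.0, 0.0, 5, 1),     # index household
    (30.0, 40.0, 6, 2),   # 50 m away
    (60.0, 80.0, 4, 1),   # 100 m away
    (200.0, 0.0, 7, 3),   # 200 m away
]
print(local_density(0, households, 50.0))    # (11, 3)
print(local_density(0, households, 100.0))   # (15, 4)
```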

Intervention fidelity and adherence

This trial is intended to estimate health impacts of a ‘real world’ urban sanitation intervention, as-delivered, similar to other recent sanitation trials.38 ,74 ,75 The investigators have no influence over the nature or schedule of the intervention. We will collect data directly and indirectly on the delivery of the intervention and its use among the target population, to estimate both fidelity and adherence, with the dual goals of (1) documenting the process of conducting the research by tracking facilitating factors, barriers faced, and methods for overcoming challenges and (2) developing evidence-based recommendations to inform investments, policies and implementation of sanitation interventions to achieve the greatest health impact. We will estimate coverage and use of SLs for both groups (intervention and control) and across all sites, expressed as percentages of: planned latrines constructed, intended recipients having unlimited access to SLs, and population who report exclusive use at home.

Primary outcome measure: prevalence of enteric infections in children

A few key pathogens have been identified as important aetiological agents of moderate-to-severe diarrhoeal disease affecting children in Mozambique.42 According to data from the Global Enteric Multicenter Study (GEMS) site in Manhiça, Mozambique, among children under 11 months (n=374), 32% of moderate-to-severe diarrhoea cases were due to Rotavirus, 13% to Cryptosporidium, and 4% to adenovirus 40/41. For children aged 12–24 months (n=194 cases), Cryptosporidium and enterotoxigenic Escherichia coli (ETEC) each accounted for 9% of cases, followed by Shigella (6%), Rotavirus (5%), and Campylobacter (2%). In older children, aged 24–59 months (n=112), Shigella caused the most cases (17%) followed by Vibrio cholerae O1 (8%). Subclinical infections are widely prevalent among children living in faecally contaminated settings.76–78 In GEMS, at least one enteric pathogen was identified in the stool of 72% of control (asymptomatic) children, compared with 83% of symptomatic cases. Two or more pathogens were detected in 45% of cases and 31% of control children, suggesting high prevalence of coinfection.

We define our primary outcome as combined prevalence of the following enteric infections in stool samples from children: Campylobacter; Clostridium difficile, Toxin A/B; Es. coli O157; ETEC LT/ST;79 Shiga-like toxin producing Es. coli (STEC) stx1/stx2; Salmonella; Shigella; V. cholerae; Yersinia enterocolitica; Giardia; Cryptosporidium; and Entamoeba histolytica. We are using a recently developed but extensively tested80–84 multiplex molecular pathogen assay. These and other enteric infections are increasingly thought to be related to EED and stunting.85 Additionally, we will detect viral pathogens adenovirus 40/41, Norovirus GI/GII, and Rotavirus A, though we have excluded viral infections from the primary outcome definition.
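As an illustration of how a combined prevalence could be computed from assay results, the sketch below counts a child as positive when at least one of the listed bacterial or protozoan targets is detected, and ignores the viral targets. The "at least one target" reading, the shorthand target labels and the example data are assumptions made for this sketch, not definitions from the protocol.

```python
from typing import Dict, Set

# Shorthand labels for the primary-outcome targets (hypothetical strings).
PRIMARY_TARGETS = {
    "Campylobacter", "C. difficile toxin A/B", "E. coli O157", "ETEC LT/ST",
    "STEC stx1/stx2", "Salmonella", "Shigella", "V. cholerae",
    "Y. enterocolitica", "Giardia", "Cryptosporidium", "E. histolytica",
}


def combined_prevalence(results: Dict[str, Set[str]]) -> float:
    """Proportion of children with one or more primary targets detected.
    `results` maps a child identifier to the set of targets detected."""
    if not results:
        raise ValueError("no stool results supplied")
    positives = sum(1 for detected in results.values()
                    if detected & PRIMARY_TARGETS)
    return positives / len(results)


example = {
    "child-01": {"Giardia"},
    "child-02": {"Rotavirus A"},                   # viral only: not counted
    "child-03": set(),                             # nothing detected
    "child-04": {"Shigella", "Norovirus GI/GII"},  # counted once
}
print(combined_prevalence(example))  # 0.5
```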

Secondary outcomes

Combined prevalence of STH infection in children

Prevalence of STH infections across Mozambique is high among school-aged children. A nationally representative survey in 2009 (n=83 331 children of mean age 11.4 years) indicated STH prevalence for Ascaris lumbricoides to be 65.8%, followed by Trichuris trichiura (54.0%), hookworm (38.7%), Enterobius vermicularis (24.8%), Taenia spp (5.8%) and Hymenolepis nana (5.2%).86 STH infection prevalence in school-aged children in Maputo province was 37.1%. Globally, preschool-aged children living in STH-endemic areas may also be at high risk of STH infection37 leading to malnutrition, growth faltering, delayed cognitive development, intestinal obstruction, micronutrient deficiency, and other effects.87

In our study, we will assess STH infection in all stool samples collected at baseline and approximately 12 months following the intervention. Specifically, we will measure combined prevalence of the following STHs: A. lumbricoides, T. trichiura, and hookworm by the Kato-Katz method.88 We will additionally measure En. vermicularis, Taenia spp, Hymenolepis spp, and Strongyloides stercoralis in all stool samples.

Self-reported gastrointestinal illness

We will collect caregiver-reported symptom data on all enrolled children from all households at baseline and end line visits, including diarrhoea, vomiting, abdominal pain, and refusal to eat. We define diarrhoea as ≥3 loose or liquid stools in a 24 h period, or any stools with blood, as reported by the child's caregiver.89 We will use a 7-day recall period.90 ,91
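The case definition above can be expressed as a simple rule. The function and argument names below are hypothetical; the thresholds are those stated in the text.

```python
def is_diarrhoea(loose_stools_in_24h: int, blood_in_stool: bool) -> bool:
    """Caregiver-reported diarrhoea: three or more loose or liquid stools
    in a 24 h period, or any stool with blood."""
    return loose_stools_in_24h >= 3 or blood_in_stool


# Each tuple is one child's worst day within the 7-day recall period.
reports = [(3, False), (2, False), (1, True), (0, False)]
cases = [is_diarrhoea(stools, blood) for stools, blood in reports]
print(cases)                     # [True, False, True, False]
print(sum(cases) / len(cases))   # 0.5
```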

Tertiary outcome measures

Tertiary outcome measures include: EED biomarkers neopterin, α-antitrypsin, and myeloperoxidase in stools;92 all-cause mortality in children under 5 years at end line, as reported by caregivers;93 height and weight by standard protocols94 to calculate weight-for-age, length-for-age, and weight-for-height z-scores to classify underweight, stunting and wasting, respectively;95–98 and secretory insulin-like growth factor 1, a linear growth marker that has been shown to increase in children with better nutrition99 and micronutrients (eg, zinc),100 and to be associated with chronic inflammation and stunting.101 ,102

Environmental exposures

We will collect environmental exposure indicator data from a subset of compounds as matched preintervention and postintervention samples, including a subsample from each of the following categories: (1) upper density quintile intervention, (2) lowest density quintile intervention, (3) upper density quintile control and (4) lowest density quintile control. We seek to characterise exposures that may be related to density in three domains of transmission: the household, the compound, and the immediate area around the compound. For this reason, we have selected three key exposure indicators that we have developed in other studies: (1) household water samples, as an indicator of household hygiene; (2) soil samples from key locations near latrines; and (3) fly samples, from key household and compound locations. We will measure E. coli across all environmental samples, with direct pathogen measures80 and microbial source tracking in a subset of samples.103–108 Human-specific faecal markers that performed well in a recent multilaboratory comparison105 will be validated against local human and animal stool to select the markers that best balance sensitivity and specificity in the study location, with the goal of estimating the relative exposure risks from various routes of transmission.109–111 Additionally, we will sample surfaces in multiple locations across the compounds to assess fine-scale spatial trends in exposure.112 ,113 Geostatistical interpolation of the spatially referenced samples will allow us to identify areas in the compound that are prone to faecal contamination.114

Sample size: primary outcome

One hundred and ninety shared sanitation sites are planned for construction, each expected to serve a mean of at least 20 persons. GEMS data have suggested high prevalence of asymptomatic infections in children under 5 years of age in high-risk areas;42 other studies have shown specific infections to rapidly increase to high prevalence after birth.115 For the primary objective, we assume that end line prevalence of enteric infection will be 70% in the control arm and about 59% in the intervention arm (RR=0.84). Assuming 80% power, we use a standard equation for computation of sample size,116 resulting in 314 children per arm, ignoring clustering. We assume that there will be two or more children per cluster. Owing to the low mean cluster size, the clustering effect is likely to be small. Assuming an intraclass correlation coefficient (ICC) of 0.1,38 and a design effect of 1.1, we compute a sample size of 345 children per arm. Allowing for about 10% of children being lost to follow-up results in a sample size of 380 children per arm.
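
The arithmetic in this paragraph can be approximated with a short sketch. The protocol cites a standard equation without reproducing it, so the choices below are assumptions rather than the authors' stated method: a two-sided α of 0.05, the pooled-variance formula for comparing two proportions with a continuity correction, a mean cluster size of 2 (the value implied by an ICC of 0.1 and a design effect of 1.1), and a 10% inflation for loss to follow-up applied as a multiplier. The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_intervention, alpha=0.05, power=0.80):
    """Children per arm for comparing two proportions, ignoring clustering.

    Assumed formula: pooled-variance normal approximation with a
    continuity correction (an assumption; the protocol only cites
    'a standard equation').
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p_control - p_intervention)
    p_bar = (p_control + p_intervention) / 2
    pooled = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    unpooled = z_beta * sqrt(
        p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    )
    n = (pooled + unpooled) ** 2 / diff ** 2
    # Continuity correction applied to the uncorrected sample size.
    return n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2


def design_effect(icc, mean_cluster_size):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


unadjusted = round(n_per_arm(0.70, 0.59))           # 70% vs 59% prevalence
deff = design_effect(icc=0.1, mean_cluster_size=2)  # assumed m = 2
clustered = round(unadjusted * deff)
enrolled = round(clustered * 1.10)  # assumed: 10% loss applied as x1.10
print(unadjusted, clustered, enrolled)  # 314 345 380
```

Under these assumptions the sketch returns the figures given in the text. Dividing by 0.9 instead of multiplying by 1.10 in the last step would give a slightly larger enrolment target.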

For the analysis of effect modification by localised population density, we calculate a detectable difference of 15 percentage points between higher density areas (assumed prevalence of 66.5% for the primary outcome) and lower density areas (51.5%), postintervention and adjusted for baseline imbalances (80% power, ICC=0.08), and assuming the cut point for the analysis is the median (to optimise statistical efficiency). The sample size calculation for measuring the differences in the effect of the intervention by population density is subject to some uncertainty, however, as the distribution of population densities is not known. Possibly, the median is an inappropriate cut point for this analysis, and groups of different sizes may be compared; other preanalysis cut points will be identified during the analysis of baseline data, as described below, reducing the detectable difference between density strata.

Sample size: secondary outcomes

For the secondary outcome measures, we make the same assumptions on study power, loss to follow-up and design effect as above. For worm infections we will combine Ascaris, hookworm and Trichuris prevalence. Previous surveys in Maputo suggested a 37% prevalence of STH in school-aged children.86 Our study is enrolling participants from poor backgrounds where STH prevalence is likely to be closer to the national average of over 60%. However, our sample includes only preschool children, in whom prevalence may be only half this figure. In line with surveys from other high-prevalence settings, we assume that in the control arm, reinfection after deworming will be 30% or near baseline levels.117 We will be able to detect a difference of 10 percentage points, that is, an incidence of reinfection of 20% in the intervention arm.

For self-reported diarrhoea, we assume a 7-day period prevalence of 9% in children under age 5. This figure was found in a previous large-scale WSUP-funded survey in six bairros in Maputo.65 Our study is focusing on socially more deprived households than this survey (which included any household with a child under age 5). This prevalence may therefore be conservative. Assuming a 7-day period prevalence of diarrhoea of 9% in the control arm, we can detect a reduction to 3.5% in the intervention arm (RR=0.4) with 80% power.

Data analysis

Given that this is a study of a complex intervention and a complex outcome, data analysis requires a balance between methodological simplicity and statistical models which may be more sophisticated but which rely on many assumptions whose validity may be unknown. We will use several analytical methods of different degrees of sophistication.

Objective 1: estimating the effect of the intervention on disease risk

We will use log-binomial regression (prevalence data), linear regression (continuous outcomes after log-transformation), or negative binomial models (count data such as STH counts) to compare disease risk between intervention and control areas. Clustering by compound will be accounted for by generalised estimating equations or random effects depending on whether the data meet the assumptions for these methods.118 To account for baseline imbalances between study arms we will adjust all analyses for the baseline disease risks.

Objective 2: the association between population density and baseline disease risk

Building on the objective 1 analysis, we will create categories of different population densities to explore whether the effect of population density is linear (which is unlikely), followed by cubic spline models to explore non-linear trends. Following the baseline survey, we will examine the distribution of population densities and the relationship between the primary outcome variable and point-level density. We will then propose one or more additional candidate cut points for the stratified analysis based on any identified discontinuities in this relationship (ie, natural cut points that appear to represent a change in exposure levels across strata of density). The chosen cut points will then be carried forward to the main analysis postintervention to explore the potential effect modification of the intervention impact by population density. If no obvious cut points can be identified based on the distribution of population density or the association between density and risk, then we will stratify in three ways: (1) lower-than-median density compared with median-or-higher density; (2) lower tertile of density compared with upper tertile of density; and (3) highest quintile of density compared with lowest quintile of density.
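
The fallback stratification (median, tertiles, quintiles) can be illustrated with a minimal sketch. The density values, the function name and the handling of ties at a cut point are all hypothetical, and the quantile estimator is simply the default of Python's standard library rather than anything specified in the protocol.

```python
from statistics import median, quantiles


def density_strata(densities):
    """Label each compound 'low', 'high' or None under three fallback splits.

    None marks compounds that fall between the two groups being compared
    (middle tertile, or the three middle quintiles).
    """
    med = median(densities)
    tertile_cuts = quantiles(densities, n=3)   # two cut points
    quintile_cuts = quantiles(densities, n=5)  # four cut points

    def outer_groups(cuts):
        # Compare the lowest group with the highest group only.
        return [
            "low" if d <= cuts[0] else "high" if d > cuts[-1] else None
            for d in densities
        ]

    return {
        # Below the median versus median and above.
        "median": ["low" if d < med else "high" for d in densities],
        "tertile": outer_groups(tertile_cuts),
        "quintile": outer_groups(quintile_cuts),
    }


# Hypothetical point-level densities for 15 compounds (arbitrary units).
example_densities = list(range(1, 16))
strata = density_strata(example_densities)
for name, labels in strata.items():
    print(name, labels.count("low"), labels.count("high"))
```

The median split keeps every compound in the comparison, whereas the tertile and quintile splits discard the middle of the distribution in exchange for a sharper contrast in density.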

Objective 3: explore interaction between intervention effect and population density

We will use the same regression models as under objective 2 to examine the interaction between intervention and population density using the a priori divisions (eg, median, tertiles, quintiles) and a prespecified cut-off point identified under objective 2.

Objective 4: study transmission pathways and changes in transmission pathways following the intervention

At baseline, we will construct a structural equation model (SEM) including a range of socioeconomic, demographic, and water/sanitation/hygiene factors as specified in the conceptual framework. First, a single model will be constructed at baseline using data from intervention and control arms combined. The model will then be applied to intervention and control arms separately to study baseline comparability of disease transmission and factors influencing disease risk. We will modify the conceptual framework and the model resulting from it until we achieve a good model fit. The conventional approach of prespecifying the causal pathways and then testing the whole framework using SEM is not needed in our analysis since we can make use of the baseline data. The model that gives the best fit to the data will then be prespecified as the model to be used for exploring changes in transmission pathways and the differential effect of sanitation on outcomes according to population density. Given the many options available for analysing the data, prespecifying the model based on the baseline data prior to exploring the main objective will minimise the risk of choosing the models that give the ‘nicest,’ rather than the most valid results. At follow-up, we will use the chosen model and apply it separately to the data from intervention and control arms. This will allow us to explore whether the intervention is associated with any changes in disease determinants and strengths of different pathways compared with (1) the baseline and (2) the control arm postintervention.

Objective 4 will further be met by developing dynamic transmission models to integrate data collected from the study sites to further assess the impacts of water and sanitation on the transmission of enteric pathogens and STHs among children. We propose to use two integrated modelling approaches to assess the impact of sanitation interventions on pathogen transmission. First, for enteric pathogens, we will employ a classic SEIR (susceptible-exposed-infected-recovered) framework to model transmission via multiple pathways;119 for STHs, following a modelling framework previously developed for schistosomiasis,120,121 transmission models will be developed to track mean infection intensity (eg, mean worm burden). For both models, environmental factors and their impacts will be specifically considered. The models will then be used to assess the impacts of population density on transmission of different pathogens under different scenarios of sanitation interventions. To complement the population-based models, a second approach will use an agent-based model (ABM) to characterise explicit socioeconomic heterogeneity in household behaviours and conditions, and individual-level susceptibility, and their impacts on sanitation-mediated transmission.
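
To make the SEIR idea concrete, the following is a minimal sketch, not the model the study will fit. It assumes a closed population, a single environmental reservoir W standing in for the environment-mediated pathways, forward-Euler integration, and entirely hypothetical parameter values. Sanitation is represented only as a reduction in the shedding rate into the environment, which is itself a simplifying assumption.

```python
def simulate_seir(beta_direct, beta_env, sigma, gamma, shed, decay,
                  days, dt=0.1, n=1000.0, i0=1.0):
    """Forward-Euler SEIR with an environmental reservoir W (rates per day).

    All parameters are illustrative assumptions:
    beta_direct: person-to-person transmission rate
    beta_env:    environment-to-person transmission rate
    sigma:       rate of progression from exposed to infectious
    gamma:       recovery rate
    shed, decay: pathogen shedding into, and decay in, the environment
    """
    s, e, i, r, w = n - i0, 0.0, i0, 0.0, 0.0
    for _ in range(int(round(days / dt))):
        # Force of infection combines the two transmission routes.
        force = beta_direct * i / n + beta_env * w
        new_exposed = force * s * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        # Environmental contamination: shedding minus decay.
        w += (shed * i / n - decay * w) * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
    return {"S": s, "E": e, "I": i, "R": r, "W": w}


# Hypothetical scenario: same setting with and without reduced shedding.
params = dict(beta_direct=0.2, beta_env=0.6, sigma=0.5, gamma=0.2,
              decay=0.5, days=200)
baseline = simulate_seir(shed=1.0, **params)
improved = simulate_seir(shed=0.3, **params)
print(round(baseline["R"]), round(improved["R"]))  # cumulative recovered
```

A population-level model of this kind treats all children as interchangeable, which is the limitation the agent-based approach is intended to address.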


Potential confounders

There are numerous confounders in all observational studies on the association between potentially poverty-related variables (including population density) and enteric infections. We reduce the potential for confounding by frequency matching on the basis of intervention siting criteria, cluster size, and time of enrolment. For the main analysis, comparison of the prevalence of intestinal pathogens between intervention and control postintervention, the effect of confounders that are constant over time will be further minimised by controlling for baseline values in outcome variables. Adjusting the analysis of the follow-up outcome measures for the baseline values of the same measures is likely to be superior to all other methods of achieving balance between two groups in situations where randomisation is not possible.62 Most potential confounding variables that are associated with the outcome and sanitation conditions are likely to reflect socioeconomic status and education, which are factors that are inherently difficult to define and measure. Unmeasured or imprecisely measured confounders will produce residual confounding and limit the validity of multivariate analysis. By contrast, adjusting directly for the baseline value of the outcome measure using the same methodology at baseline and follow-up usually means much better control for confounding, as the potential confounders (whether or not measured) are likely to influence the values of the two variables similarly. The risk of residual confounding is a concern largely in the analysis of the cross-sectional baseline data, where we explore the association between population density and infection without accounting for temporal trends. Multivariate analysis, supported by SEM, will be used to address confounding. However, the risk of residual confounding cannot be completely eliminated.

Bias

Several sources of bias are possible in this study design. Our main method of minimising reporting bias lies in using stool samples that are collected independent of symptoms for the assessment of the primary outcomes, rather than reported illness symptoms. Therefore, our outcome markers should not be subject to reporting bias. Taking samples only from symptomatic individuals risks losing objectivity, as the propensity to report symptoms may be influenced by receiving an intervention or not. While the team collecting the stool samples may be aware of allocation status, the laboratory staff analysing the stool samples will be kept fully blinded.

The risk of reporting bias may be low because of the non-randomised nature of the study. Study households will be told that they are participating in a general health survey without revealing the specific aims of the study. It seems unlikely that the enrolled households will under-report or over-report disease, as our team is not directly linked to WSUP and does not require formal agreement of the households to be randomised. In an ongoing London School of Hygiene and Tropical Medicine (LSHTM)/WSUP evaluation of a sanitation programme in six bairros of Maputo (3701 households), data collection is carried out by an independent local consulting firm, and is presented to study households as a general health and demographic survey; thus far, there is no evidence that households relate the study to the intervention.

In a study requiring complex analytical methods, there is a risk of bias due to arbitrarily choosing between different statistical modelling approaches and included variables. As outlined above, we will minimise this bias by prespecifying the main model for the primary objective based on the baseline analysis and prior to the follow-up round. Further, we will establish a steering committee that includes experienced senior scientists from the main disciplines involved in this research: sanitation, epidemiology, microbiology and statistics.

Acknowledgments

The authors gratefully acknowledge critical input on the design from the TR Action Technical Advisory Group (TAG), including helpful comments by Benjamin Arnold and John Colford (UC-Berkeley); Peter Hawkins (WSP); Carla Costa and Vasco Parente of WSUP-Mozambique; Seth Irish (CDC); Veronica Casmo (MISAU); Aaron Bivins, Katie Poynter, and Trent Sumner (GT); and Ellen De Bruijn (We Consult). This study is made possible by the support of the American People through the United States Agency for International Development (USAID).


  • Contributors JBr, OC, PK, JBa, KL, SC, GN and W-PS conceived and designed the study. VZ, W-PS, JK, OZ and JBr designed the data collection tools. RN, JK, JE and JBr developed laboratory protocols. W-PS, RR and SL wrote the analysis plan. JS, JBr, JK and DH designed the environmental data collection plan. JBr, OC and W-PS drafted the manuscript. All the authors contributed to editing and revising of the manuscript.

  • Funding This study was funded by the United States Agency for International Development under Translating Research into Action, Cooperative Agreement No. GHS-A-00-09-00015-00.

  • Competing interests None declared.

  • Ethics approval The study protocol has been reviewed and approved by human subjects research and ethical review boards at the London School of Hygiene and Tropical Medicine, the Georgia Institute of Technology, the University of North Carolina at Chapel Hill; and the Comité Nacional de Bioética para a Saúde (National Bioethics Committee for Health) of the Ministério da Saúde, Republic of Mozambique.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The data collected in this study—with personal identifying data redacted—will be available for use by investigators on request following publication of primary trial results.
