
Protocol for the PATHOME study: a cohort study on urban societal development and the ecology of enteric disease transmission among infants, domestic animals and the environment
  1. Kelly K Baker1,
  2. Sheillah Simiyu2,
  3. Phylis Busienei2,
  4. Fanta D Gutema1,
  5. Bonphace Okoth2,
  6. John Agira2,
  7. Christine S Amondi2,
  8. Abdhalah Ziraba3,
  9. Alexis G Kapanka1,
  10. Abisola Osinuga1,4,
  11. Collins Ouma5,
  12. Daniel K Sewell6,
  13. Sabin Gaire6,
  14. Innocent K Tumwebaze2,
  15. Blessing Mberu2
  1. 1 Department of Occupational and Environmental Health, The University of Iowa College of Public Health, Iowa City, Iowa, USA
  2. 2 Division of Population Dynamics and Urbanization, African Population and Health Research Center, Nairobi, Kenya
  3. 3 Division of Health and Wellbeing, African Population and Health Research Center, Nairobi, Kenya
  4. 4 The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  5. 5 Maseno University, Maseno, Nyanza, Kenya
  6. 6 Department of Biostatistics, The University of Iowa College of Public Health, Iowa City, Iowa, USA
  1. Correspondence to Dr Kelly K Baker; kelly-k-baker{at}


Introduction Global morbidity from enteric infections and diarrhoea remains high in children in low-income and middle-income countries, despite significant investment over recent decades in health systems and water and sanitation infrastructure. Other types of societal development may be required to reduce disease burden. Ecological research on the influence of household and neighbourhood societal development on pathogen transmission dynamics between humans, animals and the environment could identify more effective strategies for preventing enteric infections.

Methods and analysis The ‘enteric pathome’ (that is, the communities of viral, bacterial and parasitic pathogens transmitted from human and animal faeces through the environment) is taxonomically complex in high-burden settings. This integrated cohort-exposure assessment study leverages natural socioeconomic spectrums of development to study how pathome complexity is influenced by household and neighbourhood infrastructure and hygiene conditions. We are enrolling children under 12 months of age in low-income and middle-income neighbourhoods of two Kenyan cities (Nairobi and Kisumu) into a ‘short-cohort’ study involving repeat testing of child faeces for enteric pathogens. A mid-study exposure assessment documenting infrastructural, behavioural, spatial, climate, environmental and zoonotic factors characterises pathogen exposure pathways in household and neighbourhood settings. These data will be used to inform and validate statistical and agent-based models (ABM) that identify individual or combined intervention strategies for reducing multipathogen transmission between humans, animals and environment in urban Kenya.

Ethics and dissemination The protocols for human subjects’ research were approved by Institutional Review Boards at the University of Iowa (ID-202004606) and AMREF Health Africa (ID-ESRC P887/2020), and a national permit was obtained from the Kenya National Commission for Science, Technology and Innovation (ID# P/21/8441). The study was registered on ClinicalTrials.gov (Identifier: NCT05322655) and is in pre-results stage. Protocols for research on animals were approved by the University of Iowa Animal Care and Use Committee (ID 0042302).

  • Gastrointestinal infections
  • Infection control
  • Public health
  • Neglected Diseases
  • Epidemiology

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Integrating a transdisciplinary exposure assessment study into a cohort study of paediatric infections provides new evidence on household and neighbourhood developmental conditions that are associated with detection of enteric pathogens in humans, animals and environment in two Kenyan cities.

  • We compare low-income and middle-class households and neighbourhoods to test counterfactual theories about meeting basic developmental standards to reduce pathogen transmission.

  • Our data collection uses objective methods to document socioeconomic, weather, infrastructural, spatial, behavioural, environmental, zoonotic and human data, including use of both selective culture and molecular methods to characterise pathogen community patterns.

  • The observational study design is vulnerable to unmeasured confounders and the short cohort 14-day observation period cannot account for within-child variability in exposures and outcomes across seasons.

  • The living conditions in middle-class households and neighbourhoods may not offset hygiene conditions in the overall urban environment enough to alter enteric pathogen transmission patterns.


Study rationale

In the past half century, advances in access to healthcare and early childhood vaccines and therapies both in high-income countries (HICs) and in low-to-middle income countries (LMICs) have successfully reduced global mortality from diarrhoeal diseases. According to Global Burden of Disease data, diarrhoea mortality in children under 5 years decreased by 59.3% (from 173.3 per 100 000 to 70.6 per 100 000) between 2000 and 2016, largely in South Asia and sub-Saharan Africa where fatality rates were highest.1 However, reductions in diarrhoea incidence have lagged, with only a 12.7% reduction (from 2.0 per child year to 1.75 per child year) over the same 16 years, and large differences remain between incidence in children in HICs (0.58 per child year) versus regions like Latin America and sub-Saharan Africa (2.82 cases and 2.37 cases per child year, respectively). Typhoid and paratyphoid enteric fever cases in endemic areas also remain high at 14.3 million cases per year.2 Asymptomatic infection prevalence is even higher than symptomatic rates,3 and symptomatic and asymptomatic infections can elevate a child’s susceptibility to coinfection,4 enteric dysfunction and malnutrition,5 6 and long-term cognitive and developmental stunting.7 8 Greater investment in disease prevention and control programmes that protect children from pathogen exposure will be needed to accelerate the reduction in global enteric disease burden and achieve health equity between HIC and LMIC populations.

At the end of the 19th century, typhoid fever, cholera and diarrhoea epidemics routinely plagued cities like London and Philadelphia in today’s HICs.9 Annualised death rates were quite similar to rates in LMICs today, although with stronger seasonal patterns.10 Death rates dramatically decreased in the late 19th to early 20th centuries, even as urban populations were growing rapidly and well before the development of vaccines.11 Development of municipal piped water and wastewater systems contributed to these health transitions, although inference from these studies is limited by the lack of model adjustment for confounding from other causal pathways (food safety, handwashing, housing quality).12–14 Water, Sanitation and Hygiene (WASH) has been a central strategy in global development policy over the last 50 years for reducing global diarrhoea burden and has resulted in impressive increases between 1990 and 2015 in the global population with access to household drinking water and, to a lesser extent, sanitation infrastructure. These investments have improved quality of life globally, but as stated above have not generated expected declines in diarrhoeal incidence in LMICs.15 Randomised controlled studies reporting relatively small benefits from household-level improvements in WASH and recent demographic studies raise questions as to what other conditions are required for disease control.16–20 Many WASH interventions tested in low-income communities are low cost and simple, to increase the chance of sustained use by the population after trial end. More advanced but higher cost interventions could have had greater impact. Achieving population-wide water and sanitation coverage, infrastructure maintenance and chlorination of a water supply may be necessary, but not sufficient, for disease control.19–24

Control of enteric pathogen transmission may require more wide-ranging development of societal sanitary and hygiene infrastructure and policy beyond drinking water and sanitation systems. The building of water and sanitation systems in HIC cities was just one piece of a much larger societal reformation that included parallel improvements in health systems, housing quality, waste disposal laws, awareness of personal hygiene, growth of a middle class and better nutrition.25 The scientists who were directly observing health transitions between 1850 and 1950 proposed diverse theories for the collapse of enteric disease epidemics. Some studies noted that improvements in water sources in Glasgow and London correlated poorly with observed declines in diarrhoea rates over time, with major decreases in disease occurring decades after improved water sources and decades before sewerage.10 26 Contamination of milk sources was considered a cause of typhoid and scarlet fever outbreaks.27 28 Widespread regulation of milk pasteurisation also emerged in the late 19th century and has been credited with decreases in diarrhoea case rates in Philadelphia and New York City in the 1900s.29 30 Others suggested that sharing of unsanitary living conditions rather than sewerage problems caused typhoid fever cases.31 One comprehensive study examined housing quality, shared living conditions, food supply and type of water and toilet in early 1900s Mansfield, England, which had a seasonal diarrhoea prevalence of 10% despite high sewered sanitation and water supply coverage.32 It supported the premise that shared housing and unhygienic household habits contributed most to diarrhoea prevalence and noted that middle class households in otherwise crowded areas with poor neighbourhood hygiene were not protected from infection.

Evolutionary biology evidence suggests that the establishment of permanent agricultural population centres around 10 000 years ago triggered preindustrial epidemiological transitions that increased pathogen adaptation to humans and multiparasitism in human populations. Migration and urbanisation further concentrated animals, humans and their waste into relatively small areas, increased pathogen host adaptation, and increased rates of within-species and across-species transmission of many pathogenic species.33 Over time, the number of households in HICs choosing to own animals and the types of animals owned have shifted as a byproduct of changing lifestyles, technology and limited space and need for animal rearing. The importance of animals to urban industry has also shifted. Brownlee and Young (1922) reported strong consensus at scientific society meetings for the theory that the replacement of horses and gravel road surfaces with motorised vehicles and tarmac prior to 1910 reduced visible animal faeces in the streets and coincided strongly in time with disease declines. Strong temporal correlation between decreases in annual issued licenses for horse-drawn vehicles in English and Welsh cities and declines in diarrhoeal deaths provides support for that premise.34 Fly control may have further enhanced the community hygiene-driven declines in typhoid fever and diarrhoeal deaths.35 Changes in contact between humans, work animals and other domestic species that prevented zoonotic transmission could also explain early epidemiological transitions.

Elimination of cholera, typhoid fever and other enteric disease may have required a combination of societal developmental changes that collectively attained a standard of environmental hygiene capable of preventing transmission of infectious agents—essentially a microbial community effect. Unfortunately, evidence on which aspects of development had the greatest impact on enteric disease ecology during historical transitions cannot be disentangled because records on the timing, rate and scale of coverage of most societal development activities, and personal adoption of hygiene behaviours are fragmented and poorly documented.36 Which persons benefitted from those improved living conditions is also poorly recorded, requiring studies to rely on aggregated health records until the middle of the 20th century.14 Records on outcomes are predominantly death records for cholera and typhoid due to inconsistent healthcare seeking and recording of non-fatal cases and investigations of other enteric causes of diarrhoeal deaths. Contemporary studies examining societal development and multidecade enteric disease rates in countries that have recently undergone epidemiological transitions would be more culturally relevant, but such studies are also lacking for many of the same data availability challenges.


A study applying One Health approaches to examining ongoing health transitions in high burden countries of the 21st century could better inform which developmental factors most influence enteric disease prevention and control for prioritising health programme policy. To address this knowledge gap, we describe the scientific rationale, hypotheses, and design for the Pathogen Transmission and Health Outcome Models of Enteric Disease (PATHOME) study supported by the National Institutes of Health (NIH) through the Evolution and Ecology of Infectious Disease funding mechanism (1 R01 TW011795, years 2020–2025). The objectives of PATHOME are to (1) characterise interactions among enteric pathogenic agents in infants, domestic animals and the environment and the dynamic interactions between infants, caregivers, animals and environmental materials across space and seasons; and (2) develop statistical and computational methods to examine this complex disease system and predict which social and environmental developmental improvements best prevent multi-pathogen transmission in urbanising areas of a high disease burden country. General household developmental improvements hypothesised to mitigate F-Diagram pathways of transmission37 include improved water and sanitation access, hand hygiene behaviours, paved cleanable flooring, domestic animal faeces management, vector control and use of pasteurised milk and safe food storage and refrigeration. General neighbourhood improvements hypothesised to prevent pathogen spread include prevention of stormwater runoff from public to household areas, reduced presence of human and/or animal faeces in residential areas, and a higher proportion of households in a community with functioning on-plot improved water and sanitation facilities. More nuanced types of interventions will be explored depending on availability of discriminatory data for deeper comparisons.

Our overall hypothesis is that joint modelling of ‘enteric pathome’ agents (ie, the microbial communities of viral, bacterial and protozoan pathogens transmitted by human and animal faeces) across households and neighbourhoods that represent contrasts in urban societal development will show that development leads to lower pathogen-specific detection frequencies, and thus evolution of the pathome from complex to simple microbial community structure distributions in humans, animals, and the environment (figure 1). Our study design is motivated by several fundamental science theories (table 1).

Table 1

Study design strategies and scientific rationale for the Pathogen Transmission and Health Outcome Models of Enteric Disease study

Figure 1

Conceptual model of the enteric Pathogen Transmission and Health Outcome Models of Enteric Disease (PATHOME) Study. Societal development of household water, latrines, flooring, animal health and hygiene conditions and neighbourhood infrastructure (drainage) and hygiene (animal penning, waste regulations) conditions triggers an evolutionary change in the enteric PATHOME from taxonomically complex to simple microbial community detection patterns in humans, animals and the environment.

Methods and analysis

Study setting

The study is situated in low-income and middle-income communities in two different Kenyan cities, Nairobi (population 4.4 million persons) and Kisumu (population ~610 000 in 2019), where our team has performed prior health research.38–51 The prevalence of diarrhoea in children in Kenya is high (~15%) with most cases occurring in children under 3 years of age, and roughly 25% of children are severely stunted.52 However, diarrhoea prevalence is much lower in Kenya’s rapidly growing middle class,50 indicating that epidemiological transitions in enteric disease risk are occurring along low-income to middle-income neighbourhood gradients. Communities in each city were selected based on similarity in the population’s ethnic and cultural characteristics but with distinct differences in overall community developmental conditions. Environmental hygiene indicators also differ by neighbourhood socioeconomic status and thus are expected to be different in the two target cities. Private toilets are the norm in middle-income neighbourhoods, while the majority of households in low-income areas rely on shared latrines for disposal of human faeces and a quarter of households dispose of faeces in open defecation sites, drains, or waste dumps.53 Low-income households53 54 and neighbourhoods55 are heavily polluted with indicator bacteria and enteric pathogens, while contamination of middle-income neighbourhoods is unknown. Half of households in Kisumu keep domestic animals,56 while animal ownership is less common in low-income and middle-income Nairobi neighbourhoods. While development or the lack thereof may be uniform across communities, in urban settings there may be outlier households that differ from their neighbours, such as relatively wealthier households in low-income communities or relatively poorer households in middle-income communities. The communities selected for this study offer the opportunity to enrol these types of households and study how a spectrum of household conditions might influence transmission dynamics in the context of broader community conditions.

Study design

To capture exposure-outcome relationships reflecting pathogen transmission dynamics, we are conducting a prospective 14-day cohort epidemiological study with repeat health outcome assessment and integrated exposure assessment data collection. We are recruiting 32 children in 4 age groups (<3 months, 3 to <6 months, 6 to <9 months, 9 to 12 months) on a rolling basis over 2 consecutive years across wet and dry seasons (12 months) in low-income and middle-income neighbourhoods of Kisumu and Nairobi for a total of 248 households with complete data assuming a 10% attrition rate (figure 2). Enrolment in Nairobi spans November 2021 to November 2022, with the last participant follow-up in December 2022. Enrolment in Kisumu spans late March 2023 to February 2024, with the last participant follow-up expected in March 2024. Participant recruitment follows a standardised schedule of 16 households over a 4-week period, eight each from the low-income and middle-income neighbourhoods, followed by a 1-week break in enrolment for neighbourhood-level observations and environmental sampling. This 5-week cycle is repeated 12 times or until sample size requirements for each neighbourhood are met.

Figure 2

Low-income and middle-income study neighbourhoods in Nairobi and Kisumu, Kenya, where households with children 0 to 12 months of age are enrolled into the Pathogen Transmission and Health Outcome Models of Enteric Disease (PATHOME) study. Yellow bars represent 100-m distance. Map image generated using Google Earth 2022. Infant image licensed to KKB on 22 November 2022 through Vectorstock Standard License.

Eligibility, participant recruitment and enrolment

The classification of neighbourhoods as ‘low’ versus ‘middle’ class, and thus eligibility as a site in either study group, was based on broad sector understanding of low-income high-burden neighbourhoods in each city and prior knowledge of the African Population and Health Research Center (APHRC) about neighbourhoods with high levels of formal occupation, WASH access and low poverty. Middle-class neighbourhoods considered for the study were visited to qualitatively verify improvements in drainage, cleanliness, road width and condition, housing construction and other factors. Some neighbourhoods were purposefully excluded because the contrasts in wealth seemed too large, for example, gated and guarded high rise condominiums, homes and communities. Any household located in selected neighbourhoods that had an eligible infant was eligible for the study.

Eligibility is defined as infants between 0 and 12 months of age, as verified by birth registration card. Infants with disabilities that would impact normal behaviour are excluded. Study of infant age groups also provides an opportunity to examine pathogen transmission dynamics related to development of motor skills that allow infants to explore their environment.52 Participants are identified from study neighbourhoods by extracting infant age and the primary caregiver’s name and contact information from monthly surveillance records collected routinely by Kenyan Ministry of Health Community Health Volunteers (CHVs) in each neighbourhood. Per their standard practice, CHVs approach all households within their area of work with pregnant women or young children to ensure they receive health information and have support to access healthcare. Recruitment data are compiled as separate lists for the low-income and middle-income neighbourhoods of each city. When CHVs identify potentially eligible households, they notify APHRC study staff, who follow up with households in the company of the CHV to verify information about age-eligible children.

The information sheet/form explains the study activities and the purpose for them, as well as risks and benefits, costs for participation (none), right to withdraw and compensation. The adult/primary caregiver is informed about all study activities by the field staff, who read the information on the form to the caregiver in English or Swahili, in the presence of their CHV as a witness, and answer any questions they may have. Consent forms are signed by the caregiver with signature or thumb print and by their CHV witness if the caregiver is not in a position to sign themselves. A copy of the information sheet and consent form is provided for the caregiver’s permanent records. After informed consent is completed, the study team schedules a time to begin data collection and provides the caregiver with a package of size-appropriate diapers with instructions to collect infant faeces specimens over the subsequent 2 weeks. Collection of a baseline or day 1 faeces sample is required for continued enrolment in the study. Should consent or an enrolment faeces sample not be provided, the research team and CHV identify another infant of the same age group and neighbourhood from the list for recruitment. Biological sex of the infant is not an enrolment criterion and does not influence recruitment, although we expect the sex of the study population to be representative of natural sex distributions in urban Kenya. Households are provided one pack of diapers and flour and sugar to make porridge (items identified by caregivers in a pilot study as being highly desired that directly benefit the child) as compensation for their time at the end of enrolment (approximately US$6).

Study outcomes

The primary outcomes of the epidemiological study are detection and codetection of specific types of enteric pathogens in infant faeces during the enrolment period. The three approaches for examining pathogen detection are:

  1. Pathogen prevalence—the proportion of infants with a specific type of enteric virus, bacteria or parasite detected in their faeces at enrolment (day 1).

  2. Pathogen incidence—the proportion of infants who have a specific type of enteric virus, bacteria or parasite detected in their faeces in the 13 days after enrolment that was not present in their faeces at enrolment (days 2–14).

  3. Pathogen diversity—the number of unique types of enteric virus, bacteria or parasite detected in infant faeces at enrolment (day 1).
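The three detection outcomes above can be illustrated with a minimal sketch. It assumes a hypothetical data layout (infant ID mapped to sampling day mapped to the set of pathogens detected); the function names, the example pathogens and the choice of all enrolled infants as the incidence denominator are assumptions made for illustration and are not specified in the protocol.

```python
from typing import Dict, Set

# Hypothetical layout: infant ID -> {sampling day -> set of pathogens detected}.
Detections = Dict[str, Dict[int, Set[str]]]


def prevalence(data: Detections, pathogen: str) -> float:
    """Proportion of infants with the pathogen detected at enrolment (day 1)."""
    positive = sum(1 for days in data.values() if pathogen in days.get(1, set()))
    return positive / len(data)


def incidence(data: Detections, pathogen: str) -> float:
    """Proportion of infants with the pathogen absent on day 1 but detected on days 2-14.

    Assumption: the denominator is all enrolled infants.
    """
    new_cases = 0
    for days in data.values():
        at_baseline = pathogen in days.get(1, set())
        later = any(pathogen in found for day, found in days.items() if 2 <= day <= 14)
        if later and not at_baseline:
            new_cases += 1
    return new_cases / len(data)


def diversity(data: Detections, infant_id: str) -> int:
    """Number of unique pathogen types detected in one infant at enrolment (day 1)."""
    return len(data[infant_id].get(1, set()))


# Illustrative (made-up) detections for four infants.
cohort: Detections = {
    "A": {1: {"rotavirus"}, 3: {"rotavirus", "Shigella"}},
    "B": {1: set(), 7: {"rotavirus"}},
    "C": {1: {"Shigella", "Campylobacter"}, 14: set()},
    "D": {1: set(), 5: set()},
}
```

With this layout, `prevalence(cohort, "rotavirus")` counts only infant A, while `incidence(cohort, "rotavirus")` counts only infant B, whose first detection occurs after enrolment.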

The study is evaluating the following secondary outcomes:

  1. 14-day prevalence of prospective self-reported diarrhoeal symptoms—proportion of caregivers who report an enrolled infant had three or more loose faeces, with or without blood or mucus, in any 24-hour period during the enrolment period (days 1–14).

  2. 7-day prevalence of self-reported diarrhoeal and other symptoms prior to enrolment—proportion of caregivers reporting at enrolment that in the past 7 days the enrolled infant had symptoms of: malaise, vomiting, difficulty in breathing, unusual lack of appetite or willingness to take liquids, fever, runny nose, cough, rash (pre-day 1).

  3. Preterm birth prevalence—proportion of infants born before 37-week gestational age (day 1).

  4. Stunting prevalence—proportion of infants whose length-for-age z-score (centimetres) is −2 or more SD away from the WHO international reference standard57 during the enrolment period (days 3 and 7).

  5. Wasting prevalence—proportion of infants whose weight-for-age z-score (kilograms) is −2 or more SD away from the WHO international Reference standard57 (days 3 and 7).

  6. Mid-upper arm circumference (days 3 and 7).
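As a sketch of how an anthropometric z-score might be computed and compared against the −2 SD cut-off, the example below uses the LMS formulation on which the WHO growth standards are based. The L, M and S values shown are hypothetical placeholders, not values from the WHO tables, and the sketch omits the adjustment WHO applies to extreme z-scores for weight-based indicators.

```python
import math


def lms_z_score(measurement: float, L: float, M: float, S: float) -> float:
    """Z-score from LMS parameters: Box-Cox power L, median M, coefficient of variation S."""
    if measurement <= 0 or M <= 0 or S <= 0:
        raise ValueError("measurement, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting form of the LMS formula when L is zero.
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1.0) / (L * S)


def is_below_cutoff(z: float, cutoff: float = -2.0) -> bool:
    """True when the z-score is 2 or more SD below the reference median."""
    return z <= cutoff


# Hypothetical reference parameters for one age and sex stratum (NOT WHO values).
L_REF, M_REF, S_REF = 1.0, 67.0, 0.035

z = lms_z_score(62.0, L_REF, M_REF, S_REF)  # length in centimetres
flagged = is_below_cutoff(z)
```

In practice the age-specific and sex-specific L, M and S values would be looked up from the published reference tables rather than hard-coded.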

The study is evaluating the following Intermediate Pathogen Exposure Outcomes:

  1. Frequency of child exposure behaviours—number of infant contacts with caregivers, other children, domestic animals and the environment per unit of time documented by structured observation (days 2, 3).

  2. Presence and concentration of Escherichia coli and pathogenic subtypes, Salmonella enterica, Shigella spp., Campylobacter jejuni/coli and Listeria monocytogenes in household (day 2) and public (monthly) environments.

  3. Prevalence and concentration of enteric viruses, bacteria and protozoan pathogens in faeces of domestic animals (day 2).

Sample size determination

We ran a simulation study to compute the power of our study to detect a difference in mean pathogen diversity (ie, number of distinct pathogens codetected in an infant’s stool) between infants living in low-income versus middle-income neighbourhoods. We fit a negative binomial (NB) distribution to pathogen diversity data taken from the Safe Start study58 on 22-week-old infants in Kisumu, Kenya and used this in our simulation study as the true distribution of pathogen diversity. We then fit NB regression models both with and without neighbourhood effects using a weakly regularising prior on the regression coefficients and compared the two models via Bayes factors. We had 90% (76%) power to detect, with substantial evidence (Bayes factor >3.2) (strong evidence; Bayes factor >10), a 45% increase in the expected number of pathogens codetected in infant stools in low-income neighbourhoods compared with middle-income neighbourhoods; we had 96% (90%) power to detect with substantial evidence (strong evidence) a 50% increase in pathogen diversity.59 With no true differences between low-income and middle-income neighbourhoods, our study has a 99.5% (100%) probability of failing to reject the null hypothesis of no neighbourhood differences based on requiring substantial (strong) evidence, and a 99% (93%) probability of obtaining substantial evidence (strong evidence) of no neighbourhood differences.
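The general shape of such a simulation can be sketched as follows. This is not the authors' procedure: instead of Bayesian NB regression with weakly regularising priors, the sketch substitutes a BIC-based approximation to the Bayes factor and treats the NB dispersion as known. The baseline mean, dispersion, equal split of 124 infants per neighbourhood type and number of simulations are all hypothetical values chosen for illustration.

```python
import math
import random
from typing import List


def sample_nb(mu: float, size: float, rng: random.Random) -> int:
    """Draw one negative binomial count (mean mu, dispersion size) via a gamma-Poisson mixture."""
    lam = rng.gammavariate(size, mu / size)  # shape=size, scale=mu/size, so the mean is mu
    # Knuth's Poisson sampler; adequate for the small means used here.
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def nb_loglik(counts: List[int], mu: float, size: float) -> float:
    """NB log-likelihood with known dispersion; mu is floored to avoid log(0)."""
    mu = max(mu, 1e-12)
    log_p, log_q = math.log(size / (size + mu)), math.log(mu / (size + mu))
    return sum(
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * log_p + y * log_q
        for y in counts
    )


def log_bayes_factor(low: List[int], mid: List[int], size: float) -> float:
    """BIC approximation to the log Bayes factor: neighbourhood effect vs no effect."""
    pooled = low + mid
    n = len(pooled)
    mean = lambda xs: sum(xs) / len(xs)  # MLE of the NB mean when dispersion is fixed
    ll_null = nb_loglik(pooled, mean(pooled), size)
    ll_alt = nb_loglik(low, mean(low), size) + nb_loglik(mid, mean(mid), size)
    bic_null = -2.0 * ll_null + 1 * math.log(n)  # one mean parameter
    bic_alt = -2.0 * ll_alt + 2 * math.log(n)    # two mean parameters
    return (bic_null - bic_alt) / 2.0


def simulated_power(ratio: float, n_per_group: int = 124, base_mean: float = 2.0,
                    size: float = 3.0, threshold: float = 3.2,
                    n_sims: int = 200, seed: int = 1) -> float:
    """Share of simulated studies whose approximate Bayes factor exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        mid = [sample_nb(base_mean, size, rng) for _ in range(n_per_group)]
        low = [sample_nb(base_mean * ratio, size, rng) for _ in range(n_per_group)]
        if log_bayes_factor(low, mid, size) > math.log(threshold):
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("Estimated power, 45% increase:", simulated_power(1.45))
    print("False-positive rate, no difference:", simulated_power(1.0))
```

Calling `simulated_power` with `threshold=10` corresponds to the 'strong evidence' criterion; the resulting figures depend entirely on the assumed baseline distribution and should not be expected to reproduce the power values reported above.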

We performed a second simulation study to evaluate the power to determine differences in mean pathogen concentration in public domain samples using a linear mixed effects regression model with weakly regularising priors. Assuming a spatial correlation coefficient of 0.2, we had 80% power to detect a Cohen’s d of 0.65 (0.75) with substantial (strong) evidence. When there was no true difference between low-income and middle-income neighbourhoods, our study achieved a 100% probability of failing to reject the null hypothesis of no neighbourhood differences based on either substantial or strong evidence, and a 94% (60%) probability of obtaining substantial evidence (strong evidence) of no neighbourhood differences.

Data collection

PATHOME methods are summarised below and in figure 3, with detailed protocols provided sequentially in online supplemental materials and at

Supplemental material

Figure 3

Study design and timeline for Pathogen Transmission and Health Outcome Models of Enteric Disease household data and biological specimen collection. Symbols matching each data collection activity are placed under the days when activities are implemented. Survey=self-reported data on household conditions. Diaper=provision of diapers for faeces collection. Infant=days of diaper/faeces collection. Chicken=animal faeces collected. Binoculars=structured observation days. Geotracker=24-hour time window where spatial movement is monitored. Toy ball=days of environmental sample collection. Mid-upper Arm Circumference tape=anthropometric data. Calendar=window of time for prospective self-reporting of infant diarrhoea symptoms. Attribution for tape measure image: Simon A. Eugster, Wikimedia Commons, Creative Commons Attribution ShareAlike 3.0 (CC BY-SA 3.0). Image of child licensed through Vectorstock Standard License. The image of a chicken is Public Domain (CC0). The open-source image of ‘binoculars’ is licensed under the open-source Unlicense license. Diaper, survey, tracker and calendar images taken by or created by authors.

Household data collection

Over the 14-day enrolment window, households participate in a series of data collection activities as shown in figure 3. Between study enrolment and onset of data collection, the field team provides the caregiver with a pack of disposable diapers sized by the infant’s weight, and asks them to use the diapers immediately and for the duration of the 14-day study, providing any that contain infant faeces to the study team.46 On day 1, our field staff collect the initial baseline diaper with infant faeces and conduct a survey on socioeconomic conditions, behaviours and health in the household. Caregivers are also given a 14-day self-reported diarrhoea symptom calendar that spans enrolment to study completion to record diarrhoeal symptoms for the infant. On day 2, field staff return for the first day of exposure assessment activities involving a 5-hour period of structured observation, the placement of a geotracking device on infants and their domestic animals (if available), and collection of domestic animal faeces and household environmental samples. The caregiver is reminded to continue using the diapers to collect infant faeces and to record symptoms, if any, on the calendar. The geotracker is left on infants and animals for a 24-hour period spanning the morning of day 2 to mid-afternoon on day 3. On day 3, field staff repeat the 5-hour structured observation, collect a day 3 faeces specimen, document anthropometric measurements of the participating infant, and retrieve the geotracking devices just before departure. Field staff return to the household on days 5, 7 and 14 to collect three more faeces specimens (5 faeces samples total). On day 7, the second anthropometric measurements are also documented and a second household soil/swab sample is collected. On day 14, the enumerators return to the households to collect the diarrhoea calendar and the last infant stool sample, and to provide caregivers with the compensation gift.
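The visit schedule described above can be encoded as a simple lookup table, for example to generate field calendars. The sketch below is one possible representation; the activity labels, function names and example enrolment date are illustrative and not taken from the study's data systems.

```python
from datetime import date, timedelta
from typing import Dict, List

# Study day -> activities performed by field staff, following the schedule in the text.
VISIT_SCHEDULE: Dict[int, List[str]] = {
    1: ["collect infant faeces", "household survey", "hand out diarrhoea calendar"],
    2: ["structured observation", "place geotrackers",
        "collect animal faeces", "collect household environmental samples"],
    3: ["structured observation", "collect infant faeces",
        "anthropometry", "retrieve geotrackers"],
    5: ["collect infant faeces"],
    7: ["collect infant faeces", "anthropometry", "collect second household soil/swab sample"],
    14: ["collect infant faeces", "collect diarrhoea calendar", "provide compensation"],
}


def days_with(activity: str) -> List[int]:
    """Study days on which a given activity is scheduled."""
    return sorted(day for day, tasks in VISIT_SCHEDULE.items() if activity in tasks)


def visit_dates(day_1: date) -> Dict[date, List[str]]:
    """Map the schedule onto calendar dates, given the date of study day 1."""
    return {day_1 + timedelta(days=day - 1): tasks for day, tasks in VISIT_SCHEDULE.items()}


# Example with a hypothetical day 1 date.
calendar = visit_dates(date(2022, 1, 3))
```

Here `days_with("collect infant faeces")` recovers the five faeces collection days, and `calendar` places study day 14 thirteen days after day 1.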

Household survey

On day 1, caregivers of infants are asked to respond to a 45-min survey that documents demographics of persons living in the household, number and type of caregivers, wealth assets and income, household and compound infrastructure including access to drinking water sources and latrines, ownership and management of domestic animals, breastfeeding and feeding practices, method of handling infant and animal faeces, handwashing practices, 7-day history of diarrhoea and other symptoms in the infant and other household members and related healthcare utilisation or self-administered treatments (eg, antibiotics). Developmental indicators (eg, type of wealth assets and income, household flooring, formal community stormwater drains) were chosen for their ability to serve as culturally relevant and discriminatory measures of upward socioeconomic mobility in Kenya.

Prospective 14-day diarrhoea calendar

The caregiver is given a 14-day pictorial diary with images of a sick and healthy infant and asked to mark any days in the next 2 weeks when the infant experiences 3 or more loose, watery or bloody faeces in a 24-hour period. Enumerators and CHVs review the calendar on scheduled visits and prompt the caregiver to complete the record. On day 14, the calendar is replaced with a gift of a 12-month calendar.

Infant faeces collection

Five faeces samples are collected per subject (days 1, 3, 5, 7 and 14), yielding a total of 1240 infant faeces samples (≥248 infants × five faeces samples). If staff return as scheduled but no faeces is available, they are allowed to return anytime in the next 24 hours (eg, day 4 for day 3) to fulfil the scheduled faeces sample plan. Diapers are used to collect infant faeces to protect the faeces from contamination by soil (eg, scooped from ground) or faeces of other children (eg, potties in use by multiple children). If the infant defecates, the caregiver places the soiled diaper in a provided sterile Ziploc bag and stores it in a safe, shaded place. The field team collects any diapers with faeces during visits in a cooler with ice packs and transfers them to a supervisor for transport to a central lab for analysis (APHRC lab in Nairobi and Maseno University lab in Kisumu).

Infant and domestic animal spatial geotracking

Data on the spatial–temporal movement of infants and domestic animals between household and public areas, and on the frequency and duration of colocalisation that could lead to infant–animal interactions, are collected by placing GPS mobile data loggers on infants and animals for a 24-hour period.60 61 The detailed development and validation of the geotracking protocol for infants and animals will be summarised in a separate manuscript. In brief, on day 2 enumerators place a small (28×11 inch, 1.23 ounce) GPS Tractive Pet Tracker (Tractive Co., Austria) linked to an online real-time logging account on the infant. Enumerators work jointly with caregivers to identify an unobtrusive place for attaching the geotracker to infants, typically a snug child-sized wrist or ankle band or in a pocket of the infant’s clothing. Caregivers are assured that the devices do not record audio or visual information.62 They are asked to move the location of the geotracker as necessary to ensure child comfort, to accommodate changes of clothes, and to minimise concerns about theft, visibility or other issues, while keeping the tracker within arm’s reach of the infant.63 64 This tracker is placed prior to structured observation to allow enumerators to monitor infant discomfort or reactivity over the 5-hour observation window and assist with placement fit strategies if necessary. This also provides parallel observation and satellite-based location data for examining the validity of methods for recording human behaviour. The geotrackers remain placed for 24 hours, although if the caregiver chooses to remove it from the infant for sleeping, she is asked to place the device next to, but out of reach of, the infant and then replace the device when the infant wakes up.

An additional one or two geotrackers are attached to harnesses (chicken/duck, cat) and collars (dogs, goats, sheep, cow) of domestic animals belonging to the enrolled household. Enumerators monitor the animal for species-specific signs of distress for 15 min, and then turn the device on, which begins real-time upload of date, time and longitude and latitude data to the web. If the animal does not adjust to the harness, then the harness is removed, and another animal, if available, is selected for observation. If the household owns multiple animals, staff prioritise placing geotrackers on two different species that most commonly come into contact with the infant over two animals of the same species or animals kept apart from infants and household areas. Geotrackers are set to transmit location data to satellites in real-time at 5-min intervals. The geotrackers are recovered after 24 hours on day 3 for cleaning, recharging and reuse. Data are extracted from the Tractive server via an API and merged by geotracker ID with other household data for cleaning and summarisation of movement and location of the wearer. A subset of caregivers will be selected from the study sites for in-depth interviews about their experiences with the geotracking exercise. These caregivers will be randomly selected from households where tracking has been done for both animals and infants. The interview tool will contain questions about their experiences with the tracking, experiences of the infants and animals, placement of the tracker, their fears/concerns and recommendations for improvement.
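The colocalisation summary that these paired tracks support can be sketched as follows. This is a minimal illustration and not the study's processing pipeline: the record layout, the `colocalised_minutes` helper and the 10 m proximity threshold are assumptions made for the example, and infant and animal fixes are assumed to share timestamps on the 5-min transmission interval.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def colocalised_minutes(infant_track, animal_track, threshold_m=10.0, interval_min=5):
    """Minutes in which fixes sharing a timestamp lie within threshold_m of each other.

    threshold_m is a hypothetical proximity cut-off, not a value from the protocol.
    """
    animal_by_time = {t: (lat, lon) for t, lat, lon in animal_track}
    total = 0
    for t, lat, lon in infant_track:
        if t in animal_by_time:
            a_lat, a_lon = animal_by_time[t]
            if haversine_m(lat, lon, a_lat, a_lon) <= threshold_m:
                total += interval_min
    return total

# Hypothetical fixes: (timestamp, latitude, longitude)
infant = [("09:00", -1.28330, 36.81670), ("09:05", -1.28330, 36.81670), ("09:10", -1.28330, 36.81670)]
animal = [("09:00", -1.28331, 36.81671), ("09:05", -1.28400, 36.81800), ("09:10", -1.28330, 36.81672)]
coloc = colocalised_minutes(infant, animal)
```

In practice the proximity threshold would need to reflect the positional error of the devices, which this sketch does not model.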

Structured observation of infant and caregiver behaviours

After placing the geotrackers, enumerators begin 5 hours of structured observation of the enrolled child and their caregiver using a prepiloted tool developed by our team using the Livetrak (developed by Stanford University) app for tablet data collection.39 62 65 Tools document where the child is taken (public and household settings), what the infant is doing (eg, sleeping, crawling, standing), and the rate of the child touching and mouthing soil, objects, food, drinking water, surface water, animals or their faeces and hands or other body parts of other children and caregivers. Tools also document the frequency of mitigating behaviours performed for infants, such as cleaning after defecation and infant faeces disposal, cleaning face or hands, and hand washing.65 In cases of multiple caregivers for children, the caregiver who is interacting with the child at any given time is recorded to capture differences in behaviour by caregiver.45 Data will be summarised by frequency of different observations as well as the sequence of behaviours.

Domestic animal faeces collection

Up to two specimens of animal faeces representing different animal species are collected in each household, equating to 496 animal faeces samples (248 households × 2 animal faeces per household). Animal faeces could include faeces observed in animal enclosures, if from household-owned animals, or faeces on the ground that may have come from household-owned or neighbour-owned animals. Enumerators will attempt to sample faeces on day 2, at the same time as observations, geotracking and environmental sample collection. However, if no faeces are observed on day 2, enumerators can inspect the household on subsequent days when they return to collect infant faeces and then collect samples. If no animal faeces are observed over the 14-day period, then the household will be considered uncontaminated by domestic animal faeces. Faeces specimens are collected into sterile WhirlPak bags, from the centre of the pile to avoid contamination by soil, surface water, or other animal faeces. These samples are stored in a cooler on ice packs and transported, as stated above, to the central lab for analysis.

Household environmental samples

In each participating household, we collect two soil samples (days 2 and 7), one caregiver hand rinse, one toy rinse, and in Kisumu only, one drinking water sample. Soil samples are about a 30 g composite of soil from different locations in the household or a swab of a 30×30 cm area of floor space.66 Toy rinses aim to measure the amount of microbial contamination that can be transferred onto household fomites in 24 hours. This is measured by providing the caregiver an alcohol-sterilised infant toy, such as a teething ring or rattle, on day 1 for the infant to play with for 24 hours.67 This infant toy is retrieved 24 hours later on day 2 into a sterile barcode-labelled bag for transport to the lab for microbial testing. An identical replacement toy is given to the infant for permanent keeping. At the end of behavioural observation on day 2, one caregiver hand rinse is collected by submerging each hand into a barcode-labelled sterile 1-L WhirlPak bag with phosphate-buffered saline and massaging all surfaces of the hand for 30 s before switching to the next hand. In total, we expect to collect 496 household soils, 248 toy rinses and 248 hand rinses across all neighbourhoods and cities, plus up to 124 drinking water samples (when water is consumed directly or indirectly via food preparation) in Kisumu. All samples are stored on ice packs and transported to the APHRC (Nairobi) or Maseno University (Kisumu) lab for processing and microbial analysis on the day of collection.
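The expected household-level sample totals quoted in this section follow from simple multiplication, which the short sketch below reproduces. The dictionary names are illustrative only; the per-household counts and the figure of 248 households are taken from the text.

```python
HOUSEHOLDS = 248        # minimum number of enrolled infants/households stated in the protocol
KISUMU_WATER_MAX = 124  # drinking water is sampled in Kisumu only

# Planned samples per household, as described in the text
per_household = {
    "infant faeces": 5,   # days 1, 3, 5, 7 and 14
    "animal faeces": 2,   # up to two species per household
    "household soil": 2,  # days 2 and 7
    "toy rinse": 1,
    "hand rinse": 1,
}

expected = {name: HOUSEHOLDS * n for name, n in per_household.items()}
expected["drinking water (Kisumu)"] = KISUMU_WATER_MAX
```

The animal faeces and drinking water figures are upper bounds, since both depend on what is available in each household.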

Infant birth status and anthropometry data

Preterm birth status is recorded by extracting gestational week of age from the infant’s national birth registration booklet if possible, using caregiver self-report if the booklet is not available. Infant weight at the point of enrolment is measured using a digital scale. Infant length is measured using length boards per standard WHO guidelines.57 These measurements are transformed into Z scores, and infants are categorised as wasted or stunted if their Z score is more than 2 SD below the international standard for weight or length, respectively, for an infant of the same gender and weeks of age. If the infant’s booklet contains a record of infant weight and length at birth, this information is also recorded and used to calculate the absolute difference in weight and length from birth to enrolment age per week for comparison of average growth rates by city, neighbourhood status and other factors.
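The two derived quantities in this paragraph can be sketched as below. This is only an illustration under stated assumptions: the Z scores are taken as already computed against the international reference (the reference tables are not reproduced here), the function names are hypothetical, and the example values are invented.

```python
def below_cutoff(z_score, cutoff=-2.0):
    """True if a Z score is more than 2 SD below the reference median."""
    return z_score < cutoff

def weekly_gain(birth_value, enrolment_value, age_weeks):
    """Absolute change from birth to enrolment, per week of age."""
    if age_weeks <= 0:
        raise ValueError("age_weeks must be positive")
    return (enrolment_value - birth_value) / age_weeks

# Hypothetical infant: 3.2 kg at birth, 6.8 kg at enrolment aged 24 weeks
gain_kg_per_week = weekly_gain(3.2, 6.8, 24)
flag_low_weight = below_cutoff(-2.4)   # hypothetical weight Z score
flag_low_length = below_cutoff(-1.1)   # hypothetical length Z score
```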

Neighbourhood data collection

Environmental hygiene observations and soil and water collection

One neighbourhood location is identified every 4 weeks in each study neighbourhood using feedback from field staff as to locations where infants enrolled in the past 4-week cycle were observed outside of their private households. If no infants left their homes during structured observation in the last 4 weeks, we consult the geotracker data to identify locations where the tracker placed on the infant was found outside of household property. Repeating this process each month (12 months) in two neighbourhoods in two cities results in 48 neighbourhood areas where enrolled infants may have been exposed to neighbourhood soil and water across a seasonal year. The designated locations are characterised by recording the types of urban infrastructure present, the presence of observed human waste disposal (diapers, flying toilets, individual piles, emptied buckets), and the presence of animal faeces. Infrastructure includes small to large wastewater drains, paved roads, residences, electricity lines, community parks, businesses and waste dumps. During neighbourhood environmental observations, eight soil samples and up to four surface water samples are collected at each site to assess for pathogen contamination. The eight soil samples and four water samples are collected approximately 2 m apart as previously described.55

Seasonality and weather data

Climate events like precipitation can cause small to large-scale flooding that carries pathogens across the environment and contributes to their persistence in the environment. Middle class neighbourhoods may have better infrastructure for preventing and managing run-off, so land-based precipitation estimates are likely inaccurate for measuring differences in urban flood exposure. Our initial analysis will use field observations of flooding as well as survey information reporting household flooding from neighbourhood areas, including drains, as a climate-based exposure indicator. A supplemental grant to the parent grant is supporting the development of strategies to generate neighbourhood-specific satellite data on cumulative precipitation and flooding from multiple satellite-based hydrometeorological online platforms over the 12-month time of data collection in each city for assessment of seasonal contributions to pathogen detection patterns. Quantitative measures of satellite-based flooding events will be reported in a separate manuscript and added to PATHOME models at a later date if we can validate that satellite predictions discriminate between neighbourhoods in ways that match ground-based observations.

Sample processing and microbial assays

The overall sample processing strategy is shown below and in figure 4 with detailed methods in online supplemental materials.

Figure 4

Flow diagram of the strategy for recovery, detection, quantification and confirmation of enteric pathogens in human and animal faeces and environmental fomites. Buffered Peptone water (BPW); Rappaport Vassiliadis Soya Peptone Broth (RVS); Half Fraser Broth (HFB); Bolton Broth (BB); modified Xylose Lysine Deoxycholate (mXLD); quantitative reverse transcription polymerase chain reaction (qRT-PCR); multiplex PCR (mPCR). Image of child licensed through Vectorstock Standard License. The image of a chicken is Public Domain (CC0). Soil, water, and house images created by authors.

Faeces processing

In a biosafety cabinet, a 300 mg portion of each faeces specimen is transferred from the diaper into DNA/RNA Shield Collection Tubes containing ceramic beads and Shield cell lysis buffer (Zymo Research Corp, California, USA), spiked with a 10^5 concentration of an extrinsic control for monitoring extraction efficiency, and extracted using the ZymoBIOMICS Quick DNA/RNA kit (Zymo Research Corp, California, USA). An extrinsic negative control tube of Shield buffer is processed twice a week to monitor for potential laboratory contamination.

Quantitative molecular detection of enteric pathogens

DNA and RNA from infant and animal faeces are analysed as previously described using microfluidic TaqMan Array cards containing primers and probes for Norovirus GI and GII, adenovirus 40/41, Sapovirus, Enterovirus, Rotavirus, Salmonella enterica, Shigella/EIEC backbone, Shigella/EIEC plasmid, Campylobacter jejuni/coli, Listeria monocytogenes, Clostridium difficile, Helicobacter pylori, enterotoxigenic Escherichia coli LT/ST, enteropathogenic E. coli, enteroaggregative E. coli and Shiga-like toxin-producing E. coli stx1/stx2, Enterocytozoon bieneusi, Giardia, Cryptosporidium spp., Entamoeba histolytica and the MS2 extrinsic control. Most of these pathogens were selected based on prior evidence that they are commonly transmitted in Kenya, with a few like H. pylori and L. monocytogenes being selected based on potential of transmission.68–70 The TaqMan assays are run on a QuantStudio 12K Flex Real-Time PCR System (ThermoFisher, Chicago, Illinois).

Prior to using the TaqMan Array cards, the assays are validated using 10-fold serial dilutions (10^0–10^5 genes per sample) of positive controls. Bacterial positive controls are prepared by culture of bacterial strains maintained at the University of Iowa. Other positive controls were prepared using gBlocks gene fragments (IDT, Coralville, Iowa, USA) with pathogen-specific sequence inserts targeted by the primers and probes. One molecular water PCR control for every box of 25 cards and 2 negative extraction controls per 100 samples are included to verify that no contamination is introduced during sample preparation. Two field process negative controls per week are collected on days when samples are placed into Zymo tubes. These results are examined to verify that MS2 extrinsic controls are positive but other genes are negative. Any faeces samples processed on the same day as controls with false positive amplification are repeated by reisolating and extracting faeces from the biorepository at the central labs. Inhibition is assessed by comparing the cycle threshold (Ct) of MS2 to that expected for the known concentration spiked into faeces prior to extraction, plus calculating 260:280 and 230:260 ratios on a Nanodrop. The recovery efficiency is calculated by dividing the quantity of MS2 recovered by the quantity of MS2 spiked into the sample prior to concentration.
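The arithmetic behind a dilution-series validation and the MS2 recovery calculation can be sketched as follows. This is a generic illustration of standard-curve quantification, not the study's analysis code: the function names and all Ct values are hypothetical, and the amplification efficiency formula E = 10^(−1/slope) − 1 is the conventional one rather than something stated in the protocol.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Ordinary least-squares fit of Ct against log10(gene copies)."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amplification_efficiency(slope):
    """Conventional qPCR efficiency; 1.0 corresponds to doubling every cycle."""
    return 10 ** (-1 / slope) - 1

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate gene copies from a Ct value."""
    return 10 ** ((ct - intercept) / slope)

def recovery_efficiency(recovered_copies, spiked_copies):
    """Fraction of the spiked extrinsic control that was recovered."""
    return recovered_copies / spiked_copies

# Hypothetical 10-fold dilution series, 10^0 to 10^5 copies per sample
log10_copies = [0, 1, 2, 3, 4, 5]
ct_values = [38.00, 34.68, 31.36, 28.04, 24.72, 21.40]
slope, intercept = fit_standard_curve(log10_copies, ct_values)
efficiency = amplification_efficiency(slope)

# Hypothetical MS2 Ct in an extracted sample that was spiked with 10^5 copies
ms2_recovered = copies_from_ct(22.40, slope, intercept)
ms2_recovery = recovery_efficiency(ms2_recovered, 1e5)
```

With these invented values the fitted slope is close to −3.32, which corresponds to an efficiency near 100%, and about half of the spiked MS2 is recovered.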

Selective detection and quantification of environmental bacteria

Presence and concentration of Salmonella, Shigella, Campylobacter, Listeria and E. coli spp. in household soils, hand rinses, toy rinses, drinking water (Kisumu only), and neighbourhood soil and water is measured using methods adapted from ISO and Food and Drug Administration food microbiology guidelines and relevant literature, and a most probable number (MPN) method that involves primary and secondary pre-enrichment and selective culture of serially diluted sample volumes.

First, three different volumes of soil, hand rinse, toy rinse and water are measured into Buffered peptone water pre-enrichment broth (BPW, Himedia, India). Except for public surface water samples, all samples are subjected to 1:10, 1:100 and 1:1000 dilutions of the original sample by measuring 25, 2.5 and 0.25 g or mL of sample into 225, 250 and 250 mL BPW, respectively. Public surface water samples are assumed likely to harbour a diverse and high bacterial load, so a 10-fold dilution (1 mL sample in 9 mL BPW) is made first, and the 1:100 and 1:1000 dilutions are then prepared by serially transferring 1 mL from the preceding dilution into tubes containing 9 mL of BPW. All three dilutions are incubated at 37°C for 24 hours to recover sublethally injured bacteria.

Second, a 100 µL aliquot from each of the three BPW primary enrichment cultures is transferred into 10 mL of selective secondary enrichment media using Rappaport Vassiliadis Soya Peptone Broth (RVS, Himedia, India) for Salmonella and Shigella spp., E. coli broth (Oxoid, UK) for E. coli spp., Bolton Broth (BB, Oxoid, UK) for Campylobacter spp., and Half Fraser Broth (HFB, Oxoid, UK) for Listeria monocytogenes. These are incubated at a growth temperature recommended for each pathogen (eg, Salmonella spp. at 41.5°C for 24 hours).

Third, a loopful of each 10-fold sample dilution in secondary enrichment broth culture is streaked on selective agar and grown overnight to isolate specific strains, including streaking RVS on modified XLD (mXLD, Oxoid, UK) for Salmonella and Shigella spp., E. coli broth on Eosin methylene blue or MacConkey agar (Oxoid, UK) for E. coli, BB on mCCDA (Oxoid, UK) for Campylobacter spp., and HFB on ALOA (Oxoid, UK) for Listeria monocytogenes. These are incubated at 37°C for 24 hours. Incubation of Campylobacter spp. occurs in anaerobic chambers with gas packs (Mitsubishi Gas Chemical America, Japan) except for the primary enrichment.

Fourth, up to 10 representative colonies of each distinct phenotype are collected from each plate, subcultured into tryptic soya broth (Himedia, India) and incubated at 37°C for 24 hours and then preserved as glycerol stocks at 1:2 ratio for future reisolation and analysis. For DNA extraction, 5–10 representative colonies are boiled in 100 µL molecular grade water. DNA samples from presumptive positive bacteria are stored at −20°C until pathogen status can be confirmed by qRT-PCR. The concentration of each presumptive bacterial species is determined by the MPN method,71 based on which of the cultures across the three tested volumes of environmental sample yielded presumptive positive colonies under each bacterial protocol.
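For readers unfamiliar with MPN estimation, the sketch below shows one common way to turn a pattern of positive and negative cultures into a concentration: the maximum-likelihood estimate under a Poisson assumption, solved by bisection. It is not necessarily the calculation used in this study (published MPN tables or other software may be used); the function name is hypothetical, and one culture per volume is assumed from the description of three tested volumes.

```python
import math

def mpn_estimate(volumes, positives, tubes=None, iterations=200):
    """Maximum-likelihood MPN (organisms per g or mL of original sample).

    volumes   -- amount of original sample in each culture (g or mL)
    positives -- number of positive cultures at each volume
    tubes     -- number of cultures at each volume (defaults to one each)
    """
    if tubes is None:
        tubes = [1] * len(volumes)
    if sum(positives) == 0:
        return 0.0        # nothing detected: estimate is zero (below detection limit)
    if all(p == t for p, t in zip(positives, tubes)):
        return math.inf   # every culture positive: no finite upper estimate

    total_volume = sum(t * v for t, v in zip(tubes, volumes))

    def score(lam):
        # Likelihood equation: sum_j p_j v_j / (1 - exp(-lam v_j)) = sum_j t_j v_j
        lhs = sum(p * v / -math.expm1(-lam * v) for p, v in zip(positives, volumes) if p)
        return lhs - total_volume

    lo, hi = 1e-9, 1e9    # score(lo) > 0 and score(hi) < 0 for mixed patterns
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)   # bisect on a log scale
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Sample amounts from the text: 25, 2.5 and 0.25 g or mL; hypothetical result pattern
mpn = mpn_estimate([25, 2.5, 0.25], [1, 1, 0])
```

With a single culture per volume the estimate is coarse, and confidence limits, which this sketch omits, would be wide.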

qRT-PCR verification of environmental bacteria species

Pathogen status of presumptive Salmonella, Shigella, Campylobacter, Listeria and E. coli spp. isolated from household soil, hands, toys and neighbourhood soil and water is tested by qRT-PCR of DNA isolated from bacterial colonies growing on tryptic soya agar/selective agar. The qRT-PCR primers and probes and the PCR cycling conditions are identical to those used on the TaqMan Array cards (TAC) to ensure comparability of detection information.


A biorepository generated through this study is preserving all human and animal faeces specimens, bacterial isolates from selective culture of environmental samples, and the non-selective buffered peptone broth primary enrichment cultures of those environmental samples for validation of results as well as for future research. Primary stool specimens and bacterial isolates from the environment are stored in an ultralow freezer (−76°C) at the stated central labs and DNA/RNA/cDNA isolated from all specimens is stored in an ultralow freezer (−76°C) at the University of Iowa.


Data summaries and visualisations will be constructed for our socioeconomic, infrastructural, behavioural, spatial, environmental, zoonotic and human data per our study design: by city and low-income versus middle-income neighbourhood type and by age group. The form of these summaries and data visualisations will be determined by data type and distribution.

We will perform latent class analysis to cluster households into homogeneous groups based on infrastructural, behavioural and environmental variables in order to discover groups with similar exposure profiles. We will compute estimates of diarrhoea incidence and 2-week prevalence and perform a binary regression using the complementary log–log link function with an offset of time between observations to determine the association of diarrhoea incidence and the presence of various pathogens.
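To make the role of the time offset concrete, the sketch below evaluates the inverse of a complementary log–log link, assuming that is the link intended here, since it is the usual choice when the log of time between observations enters as an offset. Under that model the probability of an event over an interval of length t is 1 − exp(−exp(η + log t)), where η is the linear predictor. The function name and the numerical values are hypothetical.

```python
import math

def event_probability(linear_predictor, interval_days):
    """Inverse complementary log-log link with log(time) as an offset."""
    eta = linear_predictor + math.log(interval_days)
    return -math.expm1(-math.exp(eta))   # 1 - exp(-exp(eta)), computed stably

# Hypothetical baseline hazard of 0.05 episodes per day, observed over a 2-day gap
p_baseline = event_probability(math.log(0.05), 2)

# Hypothetical coefficient for presence of a pathogen; exp(beta) is a hazard ratio
beta_pathogen = math.log(2.0)
p_with_pathogen = event_probability(math.log(0.05) + beta_pathogen, 2)
```

One consequence of this form is that the same coefficients apply whether consecutive samples are 2 days or 7 days apart, because the interval length is absorbed by the offset.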

We will use the Livetrak structured observation data to provide modellers realistic distributions of a wide range of infant and caregiver behaviours. For each behaviour of interest, we will fit Poisson, negative binomial (NB), zero-inflated Poisson, and zero-inflated NB distributions to the data disaggregated by age, neighbourhood, both, or neither65; that is, we will use the observed count of the behaviour as the outcome using the log of the time of the observation period as an offset and using as covariates either age, neighbourhood, both or neither (intercept only). We will then use the Deviance Information Criterion, which aims to select the candidate model closest to the true data generating process, to choose for each behaviour the best fit among these 16 (4 distributions × 4 sets of covariates) models. While we will fit distributions to each behaviour, we also acknowledge that certain behaviours lend themselves to joint consideration. For example, when the aim is to understand how fomites lead to infection, it is more important to know how many times in a day a child touches the fomite and then, without washing hands, touches her/his mouth; hence, count outcomes related to fomites will include the ordered sequence of ‘touch object’ followed by ‘touch mouth’ while we will not count the ordered sequence of ‘touch object’ followed by ‘wash hands’ followed by ‘touch mouth’.
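The ordered-sequence count described for fomites can be sketched as a single pass over a time-ordered event list. The event labels and the function name are hypothetical, and one counting rule is an assumption made for this example: after a counted mouth touch, a new object touch is required before another sequence can be counted.

```python
def count_unwashed_object_to_mouth(events):
    """Count 'touch object' -> 'touch mouth' sequences with no 'wash hands' in between."""
    pending = False   # True after an object touch that has not yet been washed off or counted
    count = 0
    for event in events:
        if event == "touch object":
            pending = True
        elif event == "wash hands":
            pending = False
        elif event == "touch mouth" and pending:
            count += 1
            pending = False   # assumption: each object touch contributes at most one sequence
    return count

# Hypothetical observation record for one infant
observed = [
    "touch object", "touch mouth",                # counted
    "touch object", "wash hands", "touch mouth",  # not counted: hands washed in between
    "touch object", "sleep", "touch mouth",       # counted: unrelated events do not reset
]
n_sequences = count_unwashed_object_to_mouth(observed)
```

Dividing such a count by the length of the observation window gives a rate that can serve as the outcome in the count models described above.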

For animal faeces, household environmental, and public domain soil and water samples, we will compute descriptive statistics on point prevalence and diversity. For infant faeces samples, we will additionally compute descriptive statistics on incidence and 2-week prevalence. For each source of pathogen data, we will construct co-occurrence networks and fit latent space network models that adjust for overall pathogen-specific prevalence/sparsity, in order to discover patterns of pathogens which tend to co-occur more often than expected by random chance.72
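The first step, building a co-occurrence network from per-sample detections, can be sketched as below. This covers only the raw edge weights; the latent space network model and its adjustment for prevalence are not attempted here. The function name and the example detections are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(samples):
    """Count, for each pathogen pair, the number of samples in which both were detected."""
    counts = Counter()
    for detected in samples:
        # Sort so that each unordered pair always maps to the same key
        for pair in combinations(sorted(set(detected)), 2):
            counts[pair] += 1
    return counts

# Hypothetical detections, one set of pathogens per sample
samples = [
    {"Giardia", "Campylobacter"},
    {"Giardia", "Campylobacter", "Rotavirus"},
    {"Rotavirus"},
    set(),
]
edges = cooccurrence_counts(samples)
```

Raw counts like these favour common pathogens, which is why the text describes adjusting for pathogen-specific prevalence before interpreting co-occurrence patterns.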

As part of this study, we are developing novel statistical methods to perform longitudinal analysis of multipathogen data in order to better understand risk factors on incidence and prevalence of pathogens and how pathogens interact with each other in the environment and within hosts. In particular, we will explore how infections in children are associated with neighbourhood income level, environmental risk factors obtained from the household survey, spatial range obtained through the Tractive GPS devices, behavioural rates obtained through the Livetrak data, and domestic animal pathogen infection.

We will build an agent-based model (ABM) designed to explore the many pathways in which children are infected with one or more enteric pathogens. This ABM will compare the distribution of daily pathogen dose to infants from food, water, caregiver hands, household objects, household and public domain soil, surface water and flooding. Additionally, we will use this ABM to evaluate potential interventions which will be determined in response to our findings on relative importance of exposure pathways. These potential interventions will include many counterfactuals which may not be possible to see directly from the data due to lack of variability in observed conditions in these specific settings. While the use of a detailed ABM (where we carefully model the mechanisms behind each pathway) ameliorates this issue, it will remain a limitation of our study. The ABM will be supported by the survey, infant and animal faeces samples, structured observations (which provide important information on exposures both within the household and at the neighbourhood level), GPS-measured spatial ranges, and household and public domain environmental samples. Remaining data gaps will be filled from either prior studies by this team73 or from the broader extant literature. Estimation and inference of the daily dose and potential intervention effects will be done in the Bayesian paradigm where it is relatively easy to carry forward uncertainty from the submodels (eg, contamination of milk) into the final posterior distribution describing the plausible dose or intervention effects.

Patient and public involvement

Community and stakeholder engagement by the research team occurs prior to study onset in each city to gather feedback from community leaders, health workers, policy-makers and other academic and non-governmental stakeholders in each city. Suggestions for improvement or concerns about risks to participants are used to improve study protocols. Feedback also contributes to a sense of ownership of the study by local policy makers and practitioners, ensuring that results will be received and potentially implemented and sustained after the study to address local health challenges. Community and stakeholder engagement is repeated at the end of the study to share initial learning from the study, and a final meeting is held at the end of the statistical modelling and virtual lab design stage to disseminate final recommendations. To ensure research is quickly and effectively translated into policy, the research team is conducting a landscaping review of Government of Kenya policies across sectors, including urban housing and development, health and agriculture, to identify opportunities for improving existing policies, implementing policies that address gaps, or linking policies between sectors for improved planning and monitoring.

What this study will contribute

This study will describe the probability that different faecal sources and exposure pathways contribute to enteric pathogen transmission in urban low-income and middle-income settings in Kenya. Based on the new learning, we will be able to recommend new or modified strategies for combining interventions to expedite enteric disease control for settings like urban Kenya. Unique elements of the study as described in our Rationale include a focus on pathomicrobial community (PATHOME) dynamics for examining pathogen transmission patterns across exposure pathways and rigorous microbiological tools for characterising the pathome. We recruit middle class households with high basic standards of development and use a 14-day repeat sampling cohort design with integrated household and neighbourhood microbial exposure assessment. Our combined behavioural observation and newly developed geotracking methods provide rigorous new evidence about spatial dimensions of exposure, in terms of where infants experience environmental exposures and frequency and duration of localised interaction with domestic animals. Our One Health integrated approach includes documentation of urban socioeconomic development as well as domestic animal health and their potential to serve as zoonotic vectors through direct or indirect interaction with humans. The data generated through this cohort study will be used to construct new statistical and agent-based models to identify spatial–temporal dynamics in enteric pathogen transmission in LMICs and the role of societal development in preventing transmission. An interactive virtual laboratory will be developed to predict how different improvements in household and neighbourhood development alone and in combination could influence endemic enteric disease burden in Kenya or other urban settings.
The overarching goal of this virtual laboratory is to generate targeted recommendations for improving policy and practice, especially in terms of preventive options, and provide a system for refining those interventions over time in the context of ongoing societal changes. We will publicly share the large datasets we collect, thereby providing an invaluable asset to other modelling groups working on reducing incidence rates of enteric pathogen infections in children in LMICs.

Ethics and dissemination

This study conducts research on both human and animal health outcomes. The protocols for human subjects’ research for the PATHOME study were approved by Institutional Review Boards at the University of Iowa (ID—202004606), AMREF Health Africa (ID—ESRC P887/2020), and a permit obtained from the Kenyan National Commission for Science Technology and Innovation (ID# P/21/8441). The study is also registered on (Identifier: NCT05322655). Protocols for research on canines, felines, avians and ruminants were approved by the UI Animal Care and Use Committee (ID 0042302) in accordance with the Guide for the Care and Use of Laboratory Animals, NIH Publication No. 85-23 (2010). Details on human and animal research ethical considerations and protocols are reported in online supplemental material. The entire research team took refresher training on ethical principles on protecting human research participants.

  • Contributors KKB and DS conceived the study and secured funding. KKB drafted the epidemiological study design, with input from DS, SS, AZ and BM. SS and PB designed field protocols and survey, behavioral, diarrhoea calendar and anthropometry tools, with support from AZ, AO and KKB. KKB and FDG designed field sampling and laboratory protocols, with input from BO, JA, CSA, AGK, and CO. BO, JA, CSA and CO implemented culture-based laboratory protocols, and FDG and AGK implemented molecular and genomic protocols. PB, AO, SS, DS, and KKB designed geotracking protocols. SS, IKT and PB supervised data collection. PB, SS, SG and DS oversaw data management and quality control. AZ, SS and CO facilitated human subjects’ and animal research approvals and stakeholder engagement in Kenya. KKB oversaw human subjects and animal research protocol approvals in Iowa. BM conducted policy landscape analysis. KKB and FDG oversaw enteric pathogen analysis approach. DS oversaw the statistical and agent-based analysis of the epidemiological data. KKB wrote the manuscript, with input from all authors.

  • Funding The PATHOME study is funded by the National Institutes of Health Fogarty Institute Grant Number 01TW011795 to University of Iowa. The first draft of the proposal was submitted to the National Science Foundation Evolution and Ecology of Infectious Disease mechanism in November 2019 with notice of award and study launch occurring in July 2020.

  • Disclaimer The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.
