

Stanford's Outcomes Research in Kids (STORK): a prospective study of healthy pregnant women and their babies in Northern California
  1. Catherine Ley1,
  2. Maria de la Luz Sanchez1,
  3. Ankur Mathur1,
  4. Shufang Yang1,
  5. Vandana Sundaram2,
  6. Julie Parsonnet1,3
  1. 1Division of Infectious Diseases and Geographic Medicine, Stanford School of Medicine, Stanford, California, USA
  2. 2Quantitative Sciences Unit, Stanford School of Medicine, Stanford, California, USA
  3. 3Division of Health Research and Policy, Department of Medicine, Stanford School of Medicine, Stanford, California, USA
  1. Correspondence to Dr Catherine Ley; cley{at}


Purpose Stanford's Outcomes Research in Kids (STORK) is an ongoing prospective cohort of healthy pregnant women and their babies established to determine the effect of infectious diseases on weight, linear growth and immune system development during childhood. Additionally, a nested randomised intervention of household and personal cleaning products tests the effects of the microbicides triclosan and triclocarban on these outcomes and incidence of infection.

Participants Healthy pregnant women were identified and enrolled primarily at public clinics; their babies, enrolled shortly after birth, are followed to age 36 months. Automated weekly surveys assess daily health status, infectious disease symptoms, healthcare provider visits and antibiotic use, in the mother during pregnancy and the baby once born. At 4-monthly household visits, information and samples are collected from the mother (urine, stool, saliva, skin swab), the baby (blood by heel/toe stick, urine, stool, saliva, skin swab) and the household (environmental swabs). Annual blood samples are obtained by venipuncture (mother and baby). Medical charts are abstracted for allergy and infectious illness in the mother during pregnancy and the baby.

Findings to date From 7/2011 to 2/2015, 158 mothers were enrolled at approximately 20 weeks gestation; 127 babies were enrolled. Two-thirds of mothers are Hispanic, one-third are non-US born and one-third speak primarily Spanish; mean education is 13 (SD 6.2) years. Households have on average 4.5 residents. Most households (97%) were randomised to participate in the intervention. Completion of weekly surveys (86%) and follow-up (75% after 14 months) is excellent in this young, mobile population; collection of samples is ongoing with thousands of specimens stored.

Future plans Enrolled babies will be followed until age 36 months (last anticipated visit: 07/2018) with medical chart review completed soon thereafter. All epidemiological information and samples will be available for collaborative hypothesis testing.

Trial registration number NCT01442701; Pre-results.


Strengths and limitations of this study

  • This birth cohort is unique: the population is ethnically and socioeconomically quite diverse; mothers are involved early in pregnancy; and extremely granular data are collected, including daily health information, medical chart details and a wide variety of frequently obtained biomaterials from the child, mother and household.

  • Although the sample is small (127 babies), data collection is intensive. Because of the small sample size, we will be unable to independently assess the influence of attributes that occur at low frequency (eg, black race). However, because many data elements are collected frequently (eg, health data, anthropometrics, immunological and microbiologic data) we have ample sample size to understand complex longitudinal biological processes.

  • This cohort is an excellent resource for exploring early childhood growth, physical, metabolic and immunological development, and infection from in utero to age 3 years, and provides a collaborative framework not only for epidemiological studies but also for mechanistic studies on infection and its longer term consequences.


Globally, the prevalence of both obesity (a body mass index (BMI) greater than 30 kg/m²) and overweight (25≤BMI<30 kg/m²) has increased dramatically over the past half-century. In the USA, obesity in adults has nearly tripled, from 13.4% in 1962 to 35.3% in 2012.1 Childhood overweight (obesity has only recently been defined for children) has increased at an even higher rate over the same period, from approximately 4% in the 1960s to almost 17% currently.2,3 This rapid rise in obesity has prompted much speculation about its causes, with research focusing primarily on the obvious: either increased caloric intake or decreased energy expenditure. The majority of weight increase in the USA, however, occurred starting in the 1980s, when increased intake and decreased expenditure were already very prevalent. Additional hypotheses are required to explain this phenomenal weight gain.4

Historically, rapid changes in population health have been due to the elimination or introduction of infectious pathogens.5 Interestingly, in the past three decades, many infectious diseases have been disappearing from Western society, due to vaccines and improved vaccination practices, widespread antibiotic use, expanded medical care access, and, potentially, ubiquitous microbicide utilisation in household and personal cleaning products (HPCPs), particularly triclosan and triclocarban (together, TCs).6 Infections cause alterations in inflammatory cytokines and adipocytokines, raise metabolic rate and may influence the colonising microbiota that vary with weight.7,8 Childhood infections, or their absence, may thus have an influence on long-term weight. A decline in infectious disease incidence has also been posited to result in delayed immune system maturation and a rise in paediatric allergy, a phenomenon coined the ‘hygiene hypothesis’.9

To determine the effect of infectious disease burden on growth (weight gain in particular) and immunological development during infancy and early childhood, we established a rolling prospective cohort of healthy pregnant women and their babies, followed until 36 months of age, called Stanford's Outcomes Research in Kids (STORK). Additionally, a randomised intervention of TC-containing HPCPs was nested within this cohort to examine the effect of TCs on childhood infection, growth and allergy.

Cohort description


STORK participants were recruited through their attendance at obstetric clinics or by self-referral. Principal clinics were the Lucile Packard Children's Hospital (LPCH; Stanford, California, USA) and the Valley Health Center at Tully (Santa Clara Valley Medical Center (SCVMC), San Jose, California, USA); these clinics serve a high proportion of Hispanic families and families of lower socioeconomic status (SES). Printed information about the study was available in clinic waiting rooms and on flyers posted on the Stanford University campus. If a woman expressed interest, STORK study personnel screened her for eligibility and invited her to participate in the cohort.

An interested participant was eligible if she spoke and read either English or Spanish, was aged 18–42 years, was before 36 weeks gestation of a low-risk pregnancy with a single fetus, was not known to have non-gestational diabetes or other endocrine conditions, was willing to provide biological samples and access to medical records for herself and her infant, and was intending to remain in the San Francisco Bay Area for the next 15 months. At enrolment, a comprehension questionnaire and consent were administered at a baseline home visit scheduled at the participating mother's convenience.

Enrolled mothers were also invited to participate in a nested randomised trial of the effect of TC-containing HPCPs on infectious disease incidence in the index baby. Participants were provided commercially available HPCPs (liquid and bar soap, toothpaste, dishwashing liquid), all of which either did or did not contain TCs, as needed for 1 year. Assignment to study arm followed a blocked randomisation using the biased coin method,10 so that the intervention study arms contained an equal proportion of first-born infants from each enrolment site. After the baby's first birthday, these products were no longer supplied.
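As an illustration of how a biased coin allocation can keep arms balanced within strata, the following is a minimal sketch, not the study's actual procedure. It assumes strata defined by enrolment site and first-born status, two arms, and a bias of 2/3 towards the lagging arm (the value in Efron's original description); the function and variable names are hypothetical.

```python
import random
from collections import defaultdict

ARMS = ("TC", "non-TC")


def make_allocator(bias=2 / 3, seed=None):
    """Return (allocate, counts) for a stratified biased coin design.

    bias is the probability of assigning the arm that currently has
    fewer participants in the stratum (an assumed value, not taken
    from the study protocol).
    """
    rng = random.Random(seed)
    # counts[(site, first_born)] -> {"TC": n, "non-TC": n}
    counts = defaultdict(lambda: {arm: 0 for arm in ARMS})

    def allocate(site, first_born):
        stratum = counts[(site, first_born)]
        a, b = ARMS
        if stratum[a] == stratum[b]:
            p_a = 0.5          # balanced stratum: fair coin
        elif stratum[a] < stratum[b]:
            p_a = bias         # arm a is lagging: favour it
        else:
            p_a = 1 - bias     # arm a is ahead: favour arm b
        arm = a if rng.random() < p_a else b
        stratum[arm] += 1
        return arm

    return allocate, counts


if __name__ == "__main__":
    allocate, counts = make_allocator(seed=2011)
    for _ in range(20):
        allocate("LPCH", True)
    print(dict(counts))
```

Setting `bias` closer to 1 forces tighter balance at the cost of more predictable assignments; a value of 0.5 reduces to simple randomisation.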

Fathers were also invited to participate in the cohort, with consent provided at any household follow-up visit. Consent for the baby's enrolment was provided at the first household visit after delivery.

For every household visit, each participant receives a $10 gift card to one of several local stores. Participants additionally receive a $25 gift card for every 16 weekly surveys completed and a $20 gift card at the annual blood draw (see below). For those enrolled in the intervention, an additional $20 is provided per visit after the fourth baby visit, when HPCPs are no longer provided.

Data collection

Household questionnaires

Information collected from mothers at the baseline household visit included demographic characteristics of the father and of household members; household characteristics including presence, type and number of pets, and type of vermin; the mother's use of cleaning products or chemicals at her work outside of the home; and maternal bathing habits (figure 1, see online appendix table 1). The interviewer additionally performed an objective assessment of household cleanliness using a modified version of the Environmental Cleanliness and Clutter Scale.11

Figure 1

Schedule of assessments.

Information ascertained at each subsequent four-monthly household visit includes updates to household members or the residence (see online appendix table 1). Once the baby is born, questions include types of foods given to the baby in the prior 7 days, number of visible teeth, co-sleeping and childcare arrangements, and any clinical diagnoses made by a healthcare provider, including allergy information.

All household questionnaire information is collected using an electronic tablet (iPad; Apple, Cupertino, California, USA) with data entry forms constructed in iFormBuilder (Herndon, Virginia, USA). Data are synced with iFormBuilder when back online at the Stanford University campus and then transferred in a weekly batch process using SAS V.9.4 (Cary, North Carolina, USA) into Stanford University's instance of REDCap12 for cleaning, storage and management.

Weekly surveys

A short health status survey (see online appendix table 2) is administered weekly by automated telephone or email survey, as selected by the participant. Information ascertained by self-report from the mother during pregnancy included any infectious disease symptoms in the prior 7 days, the duration in days of these symptoms and antibiotic use. If no infectious disease symptoms were reported, questions instead asked about recent behaviours (amount of sleep, stress and physical activity) to ensure that participants who reported being ill spent equal time answering the survey as those who reported being healthy.

Once the baby is born, starting in the third week after birth, the weekly automated survey pertains to the health status of the baby as assessed by the mother (see online appendix table 2). For sick babies, follow-up questions ask about infectious disease symptoms, their durations in days, and use of healthcare services and antibiotics; for healthy babies, questions are about dietary intake and hours of sleep in the prior 24 h. When a mother reports antibiotic use, a confirmatory call is conducted.

The weekly automated surveys are administered using Precision Polling (Survey Monkey, Palo Alto, California, USA) if by telephone and Qualtrics (Provo, Utah, USA) if by email. Each survey is available in English and Spanish, with the participants receiving the survey in their preferred language; surveys are additionally age-specific for the baby (0–4, 5–12 and 13–36 months). Automated telephone calls are placed every 2 h both Tuesday from 9:30 to 17:30 and Wednesday from 11:30 to 19:30 until answered. For those participants who prefer to be contacted by email, the appropriate survey is sent every Tuesday, Wednesday and Thursday at 7:30. Non-respondents are called personally by STORK staff from Thursday until the following Monday. As above, data are transferred from the Precision Polling or Qualtrics databases into REDCap using SAS.
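The telephone contact schedule described above can be sketched as a small generator of attempt times. This is only an illustration: it assumes that an attempt is placed at both the start and the end of each window, and the names `WINDOWS` and `call_attempts` are hypothetical rather than part of any survey platform.

```python
from datetime import date, datetime, time, timedelta

# Calling windows keyed by day offset from Monday (Tuesday=1, Wednesday=2).
WINDOWS = {
    1: (time(9, 30), time(17, 30)),
    2: (time(11, 30), time(19, 30)),
}


def call_attempts(monday, step=timedelta(hours=2)):
    """List the automated call times for the week starting on `monday`.

    In practice calling stops as soon as the survey is answered; this
    function lists every possible attempt.
    """
    if monday.weekday() != 0:
        raise ValueError("week must start on a Monday")
    attempts = []
    for offset in sorted(WINDOWS):
        start, end = WINDOWS[offset]
        day = monday + timedelta(days=offset)
        current = datetime.combine(day, start)
        last = datetime.combine(day, end)
        while current <= last:
            attempts.append(current)
            current += step
    return attempts


if __name__ == "__main__":
    for attempt in call_attempts(date(2014, 1, 6)):
        print(attempt.strftime("%A %H:%M"))
```

Under these assumptions each participant receives at most five automated attempts on each of the two days before staff begin personal follow-up calls.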

A validation study was performed during the first year of data collection to verify that information acquired by automated survey was comparable to that acquired by an interviewer. A subsample of respondents was contacted within 24 h of the automated survey, with responses compared.

Chart review

A retrospective chart abstraction from the mother's visits to the clinic during pregnancy is performed to ascertain any diagnoses of infectious conditions, gestational diabetes or hypertension, weight gain during pregnancy, and any medication use including antibiotics. Once the baby is born, chart reviews from the hospital record and from the baby's visits to medical care providers, either well-baby or other, are performed to ascertain: birth weight, length and Apgar scores and any antibiotic use during or after delivery; all weight and length/height measurements, vaccinations provided, any diagnoses of infectious or other conditions, and medication use including antibiotics. Chart abstraction data are entered into Medrio (San Francisco, California, USA) using double data entry.


At all household visits, environmental samples are collected from the residence, and biological samples are collected from the mother and the baby once born. From the residence, samples include kitchen counter and drain swabs, a piece of kitchen sponge, and a swab from under the baby's sleeping location. From the mother, samples include blood via venipuncture (at baseline and annually; 5 cc), skin swabs of the arm and behind the ear, saliva (up to 2 mL), urine (10 mL) and stool (10–25 g); a vaginal swab is collected at the participant's 36 weeks gestation appointment at the clinic, when the mother undergoes routine testing for group B Streptococcus. Cord blood (5 cc) is collected at delivery when possible. From the baby, samples include blood via heel stick (or big toe once walking), skin swabs from the arm and behind the ear, saliva (1 swab), urine (5–15 mL), and stool (10–25 g). Urine is collected via a sterile collection bag. A blood draw (5 cc) is performed by a paediatric phlebotomist annually. Supplemental monthly stool samples are sent via FedEx.

All specimens are either aliquoted (urine, blood) and stored or simply stored (stool, skin swabs, saliva, environmental samples and swabs) at −80°C until processed.

Statistical methods

The primary objectives for STORK are to determine: (1) the association between infectious disease frequency as measured by reported sick days from in utero to age 36 months, and growth, as measured by weight-for-age and weight-for-height Z scores; and (2) whether exposure to TCs decreases infectious disease incidence. Secondary objectives are to assess the associations between infection, microbial diversity, allergy and the developing immune system response.
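Weight-for-age and weight-for-height Z scores are commonly derived with the LMS method, in which a growth reference supplies a Box-Cox power (L), median (M) and coefficient of variation (S) for each age or height. The text does not state which growth reference or software STORK uses, so the sketch below is an assumption about the general technique only; the L, M and S values in the example are made-up placeholders, not real reference values.

```python
import math


def lms_zscore(measurement, L, M, S):
    """Z score of a measurement under the LMS method.

    Z = ((x / M) ** L - 1) / (L * S)   when L != 0
    Z = ln(x / M) / S                  when L == 0
    """
    if measurement <= 0 or M <= 0 or S <= 0:
        raise ValueError("measurement, M and S must be positive")
    if L == 0:
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1) / (L * S)


if __name__ == "__main__":
    # Placeholder parameters for illustration only.
    print(lms_zscore(measurement=9.2, L=0.2, M=8.6, S=0.11))
```

A measurement equal to the reference median M always yields a Z score of 0, regardless of L and S, which is a convenient sanity check when wiring in a real reference table.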

Analyses will be performed in SAS V.9.4 (Cary, North Carolina, USA) or R (R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, 2008. ISBN: 3-900051-07-0).

Participant disposition

From 7/2011 to 2/2015, a total of 1675 women were approached at the participating clinics or called in to the study office and were invited to participate in the cohort with screening at a time of their choosing (figure 2). Of these, 155 (9.3%) declined the invitation to hear about the study: these women were more likely to speak English or a language other than Spanish (p<0.001). A total of 619 of the approached women (37.0%) did not meet screening criteria and were considered ineligible for the study. More English than Spanish speakers did not meet screening criteria (64.4% vs 58.9%, p=0.05). Common reasons for not meeting screening criteria included, either alone or in combination with other factors, a high risk pregnancy (28%), a history of thyroid disease (11%) or diabetes (10%), or plans to move out of the area (8%); additionally, a high proportion of women were lost to subsequent follow-up (10%). The remaining 902 (53.9%) women met screening criteria; of these, 124 (13.7%) declined further contact and 778 (86.3%) agreed to be contacted for eligibility screening. Of these women screened for eligibility, 364 (46.8%) were excluded, primarily due to gestational age greater than 36 weeks by the time they were contacted (14%) or loss to follow-up (80%). Of the 414 remaining eligible women, 256 (61.8%) declined participation, primarily because of the time commitment (31%), home visits (16%) and baby blood draws (12%), and 158 (38.2%) were enrolled in the cohort, for an overall recruitment rate of 9.4%.

Figure 2

Participant disposition.
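The proportions at each step of the recruitment flow can be recomputed directly from the reported counts. The following is a minimal sketch for readers who wish to check the arithmetic; the counts are those given in the text, while the helper `pct` and the step labels are our own.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


approached = 1675
met_screening = 902
agreed_to_contact = 778
eligible = 414
enrolled = 158

# (label, numerator, denominator) for each step of the flow.
steps = [
    ("declined to hear about the study", 155, approached),
    ("did not meet screening criteria", 619, approached),
    ("met screening criteria", met_screening, approached),
    ("declined further contact", 124, met_screening),
    ("agreed to eligibility screening", agreed_to_contact, met_screening),
    ("excluded at eligibility screening", 364, agreed_to_contact),
    ("declined participation", eligible - enrolled, eligible),
    ("enrolled among eligible", enrolled, eligible),
    ("enrolled among approached", enrolled, approached),
    ("enrolled among those meeting screening criteria", enrolled, met_screening),
]

if __name__ == "__main__":
    for label, part, whole in steps:
        print(f"{label}: {part}/{whole} = {pct(part, whole)}%")
```

Keeping the denominator explicit for each step makes it clear which percentages are relative to all women approached and which are relative to the preceding step.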

Findings to date

The 158 enrolled mothers were on average 30 years of age (SD 6.2 years; range 18–42) and at approximately 20 weeks gestation at their baseline interview (table 1). A third of enrolled mothers preferred to speak Spanish, two-thirds identified as Hispanic and just over one-third were born outside of the USA. Over half of mothers had completed high school or equivalent; most lived with their child's father. Households had on average 4.5 residents and were approximately four rooms in size; approximately one-third contained pets (primarily a dog) and most were kept very clean (table 2). Baseline samples were collected from 100% of enrolled participants and their households.

Table 1

Baseline characteristics of enrolled mothers and fathers (median (Q1–Q3) or N (%))

Table 2

Baseline characteristics of enrolled households (N=158; N (%) or median (Q1–Q3))

Of the 158 enrolled cohort members, 154 (97.5%) were also randomised to the intervention, with 4 choosing not to be randomised (figure 2). The reason for not enrolling in the intervention was the desire to continue use of current HPCPs (100%). Distribution of first-time mothers by recruitment site was similar across the three study arms (33%, 30% and 25% for the TC, the non-TC and the non-randomised arms, respectively, p=0.89).

Results of the validation study comparing automated to interviewer-collected responses to the weekly survey showed that, of 24 mothers and 2 babies, within the prior week, 31% had been sick (7 mothers, 1 baby). Reported health status was 100% concordant between automated surveys and in-person calls. A total of 12 questions overall (5%) had discordant responses; of these, 9 were from 7 healthy mothers (4 in current stress level, 4 in hours slept the prior 24 h and 1 in physical activity level), 1 was from a sick mother (in number of days of cold/runny nose), 1 was from a healthy baby (in hours slept in the prior 24 h) and 1 was from a sick baby (in days of cold/runny nose).

A total of 72 fathers consented to provide a weight measurement. To date, 127 babies have been born and enrolled in the cohort. Reasons for withdrawal provided by the 31 (19.6%) mothers who chose not to enrol their baby included moving out of area (16%), too great a time commitment (16%), unwillingness to provide blood (7%), no reason (10%), insufficient compensation (3%), a miscarriage (3%), illness (3%) and loss to contact (42%). A total of 96 (76%) babies have either completed the first year of follow-up or are on schedule to do so, with a median of 86% (Q1–Q3: 71–92) of weekly surveys completed per enrolled baby. Currently, 39 babies have reached age 36 months and completed the study.

Strengths and limitations

Although both acute and chronic infections and the timing of their occurrence have been linked to presence or absence of chronic disease later in life,8 the epidemiology of their acquisition in young children remains largely unknown. Furthermore, the extraordinary change in infection acquisition over the past 50 years may well have fundamentally altered how children grow and develop. Designed to explore these questions, the STORK study is an epidemiologically well-characterised and data-intensive cohort of 158 healthy pregnant women and 127 paired babies followed up to age 36 months in the San Francisco Bay Area. It includes stable families from diverse racial and ethnic groups across a wide socioeconomic spectrum, with overselection of Spanish-speaking families and the underserved; it collects detailed follow-up data on illnesses and other behaviours over 3 years using weekly automated surveys, four-monthly household visits and six-monthly medical chart reviews. In addition to health data, the study collects thousands of biological samples from mothers and babies, and environmental samples from households over these same extended time periods. Because of this plethora of data, our cohort has extraordinary depth and will be a unique resource for exploring early childhood growth and development and infection.

None of the infectious disease birth cohorts currently listed at explore a wide range of infectious conditions in such a diverse population. One cohort study in Brisbane, Australia, includes 128 children and collects upper respiratory and fecal samples, as well as daily symptoms, to investigate viral causes of upper respiratory infection (URI) and diarrhoea in children in the first 2 years of life.13 Follow-up in that study lasts only 2 years, the population lacks socioeconomic and racial/ethnic diversity, and outcomes are restricted to description of the infectious diseases themselves, rather than growth or chronic disease. A European birth cohort registry lists 87 large birth cohorts worldwide, encompassing close to half a million children; of these, only 9 include infectious diseases as important exposures. Most recently, a feasibility study describes a potential German cohort that will examine reported infections with associated specimen collection.14 None of these studies combine as frequent and as varied specimen collection, as extensive clinical information, and as diverse a racial and ethnic population as STORK.

Participation of households in our optional randomised intervention was extremely high, at a rate of 97%. We had chosen to nest this trial within our prospective cohort in part due to current and local interest in TC exposure. With this study, we will be able to explore the efficacy of TC-containing compared with plain soaps in preventing reported illness; this analysis will be particularly timely in the light of current interest in TCs by the United States Environmental Protection Agency (EPA) and the United States Food and Drug Administration (FDA;

Our use of a weekly survey to report daily health and illness status over years provides a unique view of each baby's infancy, with no other studies providing such detailed information to date. Studies that use medical records alone to explore childhood conditions such as URI or diarrhoea tend to underestimate their prevalence, as many episodes are considered insufficiently severe to warrant medical attention. Our ability to compare accuracy of survey findings against medical records also provides a unique opportunity for automated survey validation: few such studies are currently in the literature15 and none validate reporting accuracy across time.

Limitations of this cohort include its breadth: because of the small sample size, we can neither provide comprehensive coverage of racial/ethnic differences in the USA nor delineate the full diversity of infectious diseases. We will have sufficient sample size to examine infection rate differences between non-Hispanic whites, Hispanics and Asians, between SES groups and between native and foreign-born mothers and their children; for other groups, however, such as blacks, we may not have sufficient sample size to make valid comparisons. Direct measures of SES are not assessed; indirect measures, however, such as maternal education level and household crowding (which both vary widely), will be used to explore how SES impacts incidence and disease burden in infants. With respect to infections, sample testing will be able to define which are most likely to have long-term consequences on babies' growth, metabolism and immunity. Many infections, though, will be too infrequent to justify testing samples from all children, and still others remain unknown. Fortunately, as data accumulate, we anticipate that, through assessments of microbiome and virome, the development of new technologies, and judicious management of specimens, we will be able to develop a much broader picture of children's microbial exposures in early life. A final limitation of this cohort is the low recruitment rate, with only 18% of those meeting screening criteria enrolled. Given the frequent home assessments, multiple sample collections and long follow-up, however, it is a testament to the study staff that retention has been excellent.

In summary, the STORK cohort is a priceless, in-depth data set covering both in utero and infancy that can provide a framework for collaboration not only for epidemiological studies on infection and growth, but for a wide spectrum of mechanistic studies on infection and the longer term consequences of childhood infection.


Collaborations

We intend to make data and specimens broadly available to other investigators, so they can be maximally used. We will adhere to the National Institutes of Health (NIH) Grants Policy on Availability of Research Results, Publications, Intellectual Property Rights, and Sharing Biomedical Research Resources. The data will be retained securely in a repository at the Stanford Data Center and will be made available to any investigator working at an institution under a Federal Wide Assurance.

Owing to wide interest in these data, we are currently seeking additional funding to expand this cohort. Our ultimate goal is to have data and specimens from 200 children followed for at least 3 years with their mothers. The final data set will include self-reported demographic and behavioural data from interviews with the participants across time, information from medical charts and laboratory results from blood, urine, saliva and stool specimens provided throughout the study period. We will maintain in the repository a list of archived specimens that can be made available. Specimens will be provided to investigators after receipt of an application containing information on study hypothesis, methods, and number and type of specimens requested (because specimens are irreplaceable, we will have two outside consultants familiar with the hypotheses being tested review the applications for merit).


The authors thank Dr Natali Aziz and Dr Jenny Biller for facilitating the study at the clinics, Ting Ma and Mu Shan for their data management expertise, and Thomas Haggerty, Jeannette Noveras and Amanda Merlino for assistance in data collection.




  • Contributors JP initiated and designed the study, obtained approvals, supervised data collection, undertook the analysis and interpretation. CL wrote the first draft of the paper, contributed to the development of the protocol, design, data collection tools, analysis and interpretation. MdlLS, AM and SY assisted in the design of the study, and collection and interpretation of data. VS assisted in the analysis of the data.

  • Funding The study was funded by National Institutes of Health (NIH) grant R01 5R01HD063142-02.

  • Competing interests None declared.

  • Patient consent Obtained.

  • Ethics approval The study was approved by the Institutional Review Boards of Stanford University and the Santa Clara Valley Medical Center.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The original data and specimens will be made broadly available to investigators working under a Federal Wide Assurance (see Collaborations).
