
Original research
Geographically skewed recruitment and COVID-19 seroprevalence estimates: a cross-sectional serosurveillance study and mathematical modelling analysis
  1. Tyler Brown1,2,3,
  2. Pablo Martinez de Salazar Munoz2,
  3. Abhishek Bhatia4,
  4. Bridget Bunda1,
  5. Ellen K Williams5,
  6. David Bor3,6,
  7. James S Miller7,
  8. Amir Mohareb1,3,
  9. Julia Thierauf3,8,
  10. Wenxin Yang8,
  11. Julian Villalba1,3,8,
  12. Vivek Naranbai3,9,
  13. Wilfredo Garcia Beltran3,8,
  14. Tyler E Miller3,8,
  15. Doug Kress10,
  16. Kristen Stelljes10,
  17. Keith Johnson10,
  18. Dan Larremore11,
  19. Jochen Lennerz3,8,
  20. A John Iafrate3,8,
  21. Satchit Balsari3,4,
  22. Caroline Buckee2,
  23. Yonatan Grad2,3
  1. Infectious Diseases Division, Massachusetts General Hospital, Boston, Massachusetts, USA
  2. Center for Communicable Disease Dynamics, Harvard University T H Chan School of Public Health, Boston, Massachusetts, USA
  3. Harvard Medical School, Boston, Massachusetts, USA
  4. François-Xavier Bagnoud Center for Health and Human Rights, Harvard University, Boston, Massachusetts, USA
  5. Massachusetts General Hospital, Boston, Massachusetts, USA
  6. Department of Medicine, Cambridge Health Alliance, Cambridge, Massachusetts, USA
  7. Global Medicine Program, Massachusetts General Hospital, Boston, Massachusetts, USA
  8. Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA
  9. Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
  10. City of Somerville, Somerville, Massachusetts, USA
  11. BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado, USA
  1. Correspondence to Tyler Brown; tsbrown{at}


Objectives Convenience sampling is an imperfect but important tool for seroprevalence studies. For COVID-19, local geographic variation in cases or vaccination can confound studies that rely on the geographically skewed recruitment inherent to convenience sampling. The objectives of this study were: (1) quantifying how geographically skewed recruitment influences SARS-CoV-2 seroprevalence estimates obtained via convenience sampling and (2) developing new methods that employ Global Positioning System (GPS)-derived foot traffic data to measure and minimise bias and uncertainty due to geographically skewed recruitment.

Design We used data from a local convenience-sampled seroprevalence study to map the geographic distribution of study participants’ reported home locations and compared this to the geographic distribution of reported COVID-19 cases across the study catchment area. Using a numerical simulation, we quantified bias and uncertainty in SARS-CoV-2 seroprevalence estimates obtained using different geographically skewed recruitment scenarios. We employed GPS-derived foot traffic data to estimate the geographic distribution of participants for different recruitment locations and used this data to identify recruitment locations that minimise bias and uncertainty in resulting seroprevalence estimates.
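As a rough illustration of the site-selection idea in the Design, the sketch below compares the home-neighbourhood mix of visitors at candidate recruitment sites with the population mix of the catchment area. Everything in it is an assumption made for illustration: the neighbourhood labels, the site names, the shares, and the use of total variation distance as the selection criterion. The study's own criterion is bias and uncertainty in the resulting seroprevalence estimates, which this simple proxy does not compute.

```python
# Illustrative only: hypothetical neighbourhoods, sites and shares (not study data).

# Share of the catchment population living in each neighbourhood.
population_share = {"A": 0.1, "B": 0.3, "C": 0.6}

# Share of visitors to each candidate site by home neighbourhood,
# as might be estimated from GPS-derived foot traffic data.
visitor_home_share = {
    "site_1": {"A": 0.6, "B": 0.3, "C": 0.1},
    "site_2": {"A": 0.2, "B": 0.3, "C": 0.5},
}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def best_site(sites, target):
    """Return the site whose visitor mix is closest to the target mix."""
    return min(sites, key=lambda s: total_variation(sites[s], target))


if __name__ == "__main__":
    for name, shares in visitor_home_share.items():
        print(name, round(total_variation(shares, population_share), 3))
    print("closest to population mix:", best_site(visitor_home_share, population_share))
```

A site whose visitors mirror the population mix needs less reweighting afterwards, which is one plausible reason to prefer it under these assumptions.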

Results The geographic distribution of participants in convenience-sampled seroprevalence surveys can be strongly skewed towards individuals living near the study recruitment location. Uncertainty in seroprevalence estimates increased when neighbourhoods with higher disease burden or larger populations were undersampled. Failure to account for undersampling or oversampling across neighbourhoods also resulted in biased seroprevalence estimates. GPS-derived foot traffic data correlated with the geographic distribution of serosurveillance study participants.
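The two effects described in the Results can be seen in a small worked example. The sketch below is not the authors' simulation. It uses made-up counts for three hypothetical neighbourhoods and a standard population-weighted (poststratified) estimator with a simple binomial standard error, all of which are assumptions chosen for illustration.

```python
import math

# Illustrative only: hypothetical neighbourhoods and counts (not study data).
population = {"A": 10_000, "B": 30_000, "C": 60_000}  # residents
sampled = {"A": 600, "B": 300, "C": 100}              # participants by home neighbourhood
positives = {"A": 12, "B": 15, "C": 12}               # seropositive participants


def crude_estimate(sampled, positives):
    """Pooled proportion positive, ignoring where participants live."""
    return sum(positives.values()) / sum(sampled.values())


def weighted_estimate(population, sampled, positives):
    """Population-weighted average of neighbourhood-level proportions."""
    total = sum(population.values())
    return sum(population[n] / total * positives[n] / sampled[n] for n in population)


def weighted_standard_error(population, sampled, positives):
    """Standard error of the weighted estimate under independent binomial sampling."""
    total = sum(population.values())
    variance = 0.0
    for n in population:
        w = population[n] / total
        p = positives[n] / sampled[n]
        variance += w ** 2 * p * (1 - p) / sampled[n]
    return math.sqrt(variance)


if __name__ == "__main__":
    print("crude:", round(crude_estimate(sampled, positives), 3))
    print("weighted:", round(weighted_estimate(population, sampled, positives), 3))
    print("weighted SE:", round(weighted_standard_error(population, sampled, positives), 4))
```

With these numbers the pooled estimate is 3.9%, while the population-weighted estimate is 8.9%, because the largest and most affected neighbourhood (C) supplies only 100 of the 1000 participants. That same small sample from C contributes most of the variance of the weighted estimate, which shows how undersampling a large, high-burden neighbourhood inflates uncertainty.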

Conclusions Local geographic variation in seropositivity is an important concern in SARS-CoV-2 serosurveillance studies that rely on geographically skewed recruitment strategies. Using GPS-derived foot traffic data to select recruitment sites and recording participants’ home locations can improve study design and interpretation.

  • COVID-19
  • epidemiology
  • public health
  • statistics & research methods
  • information technology

Data availability statement

Data collected in this study are available upon reasonable request from the authors.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • CB and YG are joint senior authors.

  • Twitter @abhibhatia08, @yhgrad

  • Contributors TB designed, planned and implemented the study; conducted data analysis and wrote and revised the manuscript; and is responsible for the overall content of the manuscript as guarantor. YG, CB, AJI and SB designed, planned and implemented the study; supervised data analysis, and wrote and revised the manuscript; PMSM designed, planned and implemented the study and wrote and revised the manuscript. AB, BB, EKW, DK, KS, KJ, JL and DB designed and planned the study; JSM, AM, JT, WY, JV, VN, WGB and TEM implemented the study. DL conducted data analysis and wrote and revised the manuscript.

  • Funding This work was supported by the Andrew and Corey Morris-Singer Foundation, National Cancer Institute at the National Institutes of Health (U01CA261277) and the National Institute of Allergy and Infectious Diseases at the National Institutes of Health (T32AI007061 to TB and T32AI007433 to AM). This project has been funded in part by contract 200-2016-91779 with the Centers for Disease Control and Prevention.

  • Map disclaimer The inclusion of any map (including the depiction of any boundaries therein), or of any geographic or locational reference, does not imply the expression of any opinion whatsoever on the part of BMJ concerning the legal status of any country, territory, jurisdiction or area or of its authorities. Any such expression remains solely that of the relevant source and is not endorsed by BMJ. Maps are provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.