Abstract
Objective Since rapid population growth challenges longitudinal population-based HIV cohorts in Africa to maintain coverage of their target populations, this study evaluated whether the exclusion of some residents due to growing population size biases key HIV metrics like prevalence and population-level viremia.
Design, setting and participants Data were obtained from the Rakai Community Cohort Study (RCCS) in south central Uganda, an open population-based cohort which began excluding some residents of newly constructed household structures within its surveillance boundaries in 2008. The study includes adults aged 15–49 years who were censused from 2019 to 2020.
Measures We fit ensemble machine learning models to RCCS census and survey data to predict HIV seroprevalence and viremia (prevalence of those with viral load >1000 copies/mL) in the excluded population and evaluated whether their inclusion would change overall estimates.
Results Of the 24 729 census-eligible residents, 2920 (12%) residents were excluded from the RCCS because they were living in new households. The predicted seroprevalence for these excluded residents was 10.8% (95% CI: 9.6% to 11.8%)—somewhat lower than 11.7% (95% CI: 11.2% to 12.3%) in the observed sample. Predicted seroprevalence for younger excluded residents aged 15–24 years was 4.9% (95% CI: 3.6% to 6.1%)—significantly higher than that in the observed sample for the same age group (2.6% (95% CI: 2.2% to 3.1%)), while predicted seroprevalence for older excluded residents aged 25–49 years was 15.0% (95% CI: 13.3% to 16.4%)—significantly lower than their counterparts in the observed sample (17.2% (95% CI: 16.4% to 18.1%)). Over all ages, the predicted prevalence of viremia in excluded residents (3.7% (95% CI: 3.0% to 4.5%)) was significantly higher than that in the observed sample (1.7% (95% CI: 1.5% to 1.9%)), resulting in a higher overall population-level viremia estimate of 2.1% (95% CI: 1.8% to 2.4%).
Conclusions Exclusion of residents in new households may modestly bias HIV viremia estimates and some age-specific seroprevalence estimates in the RCCS. Overall, HIV seroprevalence estimates were not significantly affected.
- HIV & AIDS
- demography
- statistics & research methods
- epidemiologic studies
- epidemiology
- public health
Data availability statement
Data are available upon reasonable request to the Rakai Health Sciences Program (RHSP) (email: datarequests@rhsp.org). Code will be made available upon request to the corresponding author.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
- This study leveraged data from a large sample of individuals participating in a population-based cohort study.
- Using an ensemble model to predict an outcome maximises efficiency and minimises predictive error.
- Prediction using ensemble models may yield moderate model performance for rare outcomes.
- The methods used in this research assume that predictors of HIV seropositivity and viremia are the same for the included and excluded survey population.
Introduction
Over the last 30 years, longitudinal population-based cohorts in sub-Saharan Africa (SSA) have provided critical surveillance data on the determinants of HIV acquisition and transmission, and evolving HIV epidemic trends.1 An important challenge these studies face is rapid population growth2–4 often within the context of fixed budgets and resource constraints.5–7 For example, the Rakai Community Cohort Study (RCCS)—established in 1994 in south-central Uganda—has seen a more than 200% increase in population size within its surveillance boundaries over the last 20 years.8 Most of this population increase in the RCCS catchment area is attributable to Uganda’s reduction in all-cause mortality and high fertility rate,2 but some growth is also due to in-migration.9 This rapid population growth may prevent ongoing open cohort studies like the RCCS from representatively sampling the growing target population.
In resource-limited settings, studies may address this challenge in two ways: they may prioritise longitudinal follow-up of existing residents by excluding new residents or they may scale down longitudinal follow-up efforts in order to expand the sampling frame to one that includes new residents. Both options attempt to minimise total survey error,10 but which one is more fitting depends on local context and research priorities. For example, one demographic surveillance site in Burkina Faso prioritised enrolling new in-migrants over the longitudinal follow-up of out-migrants.11 In contrast, a cohort study in Malawi prioritised the follow-up of out-migrants instead of enrolling new in-migrants.12 In the case of the RCCS, rapid population growth and resource constraints led to the decision in 2008 to begin excluding residents of new households in newly built physical structures predominately comprised of in-migrating families.
Exclusion of new households from open population-based cohorts like the RCCS may bias estimates of community-level HIV seroprevalence if in-migrating members of new households have higher or lower rates of HIV than local residents13 (the latter because of a ‘healthy migrant effect’).14–16 Similarly, population-level estimates of viremia may be biased by the exclusion of new households, and most evidence in SSA suggests worse HIV outcomes for migrants living with HIV compared with non-migrants living with HIV due to gaps in the continuity of care.17–19 Since HIV serostatus and viremia can be in part predicted from sociodemographic data like age, gender, socioeconomic status (SES) and other individual or household-level characteristics,20 21 machine learning can be used to predict the prevalence of these outcomes among excluded groups for which we have some information. However, relying on a single learner can give unreliable results if that learner is weak. Ensemble machine learning algorithms combine multiple learners, weighting the best-performing learner(s) most heavily, which tends to yield a more accurate prediction when little is known about which learner would perform best for a given outcome.22 23 Ensemble models have been used for prediction across a range of disciplines, including public health.24–28
This study aimed to assess and quantify the potential bias in population-level HIV seroprevalence and viremia estimates, caused by exclusion of in-migrating residents arriving into new households in the RCCS surveillance area. To achieve this aim, the first objective was to describe characteristics of residents in new household structures (the excluded population), comparing them to residents in existing household structures (the observed sample). The second objective was to predict HIV seroprevalence and viremia in the excluded population to assess whether their inclusion in the observed sample might alter currently observed values.
Methods
The RCCS
The RCCS is an open population-based HIV surveillance cohort study in the Rakai region of south-central Uganda, established in 1994.29 At approximately 18-month intervals, the RCCS conducts a census of all residents, whether permanent or transient and regardless of age, in every household (old and new) within cohort community boundaries. The census collects sociodemographic data on each household member, including age, gender, marital status, residence status, each member’s relationship to the household head and household assets.
Approximately 2 weeks after the census, the RCCS administers a survey interview to each consenting resident aged 15–49 years residing in old households. The survey collects sociodemographic, behavioural, health and HIV service utilisation data and a blood sample for the assessment of HIV prevalence, incidence and HIV viremia among persons with HIV. Approximately 95% of eligible persons present in the household agree to both the interview and the blood draw.20 All participants are offered HIV results and post-test counselling on site. Enrolled participants are eligible for follow-up in future RCCS surveys if they continue to meet study eligibility criteria.
For this analysis, we used data from 33 inland rural and peri-urban communities surveyed in RCCS round 19, conducted between June 2018 and November 2020.
Patient and public involvement
Patients and the public were not involved in the design, conduct, reporting, or dissemination planning of this research.
Household classification
The RCCS defines a household as ‘an individual or group of individuals who eat their primary meals together and live together’.20 Households exist within ‘structures’—the physical building in which members of the household reside. During each census, the RCCS identifies individuals living in newly identified structures who are members of entirely new households not previously censused in the RCCS. Since 2008, these individuals are included in the RCCS household census but are not eligible to participate in the survey.
Based on the household census, residents in the RCCS surveillance area were categorised according to whether they resided in a new or old physical structure and whether they were members of a newly established or old household. Old structures and old households were defined as being recorded on the census since any year prior to 2008. New structures and new households were defined as being recorded for the first time in year 2008 or after. Using new and old categorisations for both household and structure, the four categories of residents in the RCCS surveillance area are as follows:
- Residents of old households in old structures: household censused prior to 2008 and members eligible to participate in the survey.
- Residents of new households in old structures: household censused in 2008 or after and members eligible to participate in the survey.
- Residents of old households in new structures: household censused prior to 2008 and members eligible to participate in the survey.
- Residents of new households in new structures: household censused in 2008 or after and members not eligible to participate in the survey.
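The four categories above can be sketched as a small classification rule. This is an illustrative sketch only: the `Residence` type, its field names and the helper functions are hypothetical and are not taken from the RCCS data system. It covers only the household-level exclusion rule, not the other survey eligibility criteria (age, consent, presence).

```python
from dataclasses import dataclass

CUTOFF_YEAR = 2008  # first recorded in 2008 or later counts as "new" (per the text)


@dataclass(frozen=True)
class Residence:
    # Hypothetical field names, chosen for illustration.
    household_first_censused: int  # year the household first appeared on the census
    structure_first_censused: int  # year the physical structure first appeared


def is_new(first_censused_year: int) -> bool:
    """New means recorded for the first time in 2008 or after."""
    return first_censused_year >= CUTOFF_YEAR


def category(res: Residence) -> str:
    """Label a residence with one of the four household/structure categories."""
    household = "new" if is_new(res.household_first_censused) else "old"
    structure = "new" if is_new(res.structure_first_censused) else "old"
    return f"{household} household in {structure} structure"


def survey_eligible(res: Residence) -> bool:
    """Only residents of new households in new structures are excluded."""
    return not (
        is_new(res.household_first_censused)
        and is_new(res.structure_first_censused)
    )


if __name__ == "__main__":
    for res in (Residence(1999, 2003), Residence(2012, 2003),
                Residence(2003, 2012), Residence(2012, 2012)):
        print(category(res), "->", "eligible" if survey_eligible(res) else "excluded")
```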
Since residents of new households in new structures (herein referred to as ‘new household residents’ or ‘residents of new households’) have been excluded since 2008, we sought to compare them with the observed sample of residents in old structures (herein referred to as ‘old household residents’ or ‘residents of old households’).
Residents of old households in new structures were excluded from the analysis because only 339 (56%) of the 601 residents were eligible, present and surveyed (others were absent at the time of survey, had out-migrated, had already been seen at another residence or were ineligible for other reasons at the time of survey, such as being unable to provide informed consent due to illness). The 339 participating residents comprised 1% of the analytic sample and were similar to other residents of old households with regard to seroprevalence and viremia, so their exclusion was not expected to influence the results.
Primary outcomes
Primary outcomes included population-level HIV seroprevalence and viremia. HIV seropositivity among residents in old households was assessed using a validated three-rapid-test algorithm,30 while the probability of seropositivity was predicted, as described below, among residents in new households. HIV seroprevalence in the observed sample was calculated as the proportion of the total population who were HIV positive, and for residents of new households, as the mean of predicted probabilities of HIV seropositivity. Individual viremia was defined as having an HIV viral load of 1000 copies/mL or more (ascertained via viral load testing on stored serum/plasma among residents in old households who were HIV seropositive). We assigned a viral load of zero to all residents in old households who tested HIV seronegative. Among residents in new households, the probability of having an HIV viral load of 1000 copies/mL or more was predicted using the model described below. Population-level prevalence of HIV viremia among residents of old households was calculated as the proportion of the total population who were viremic. Among residents of new households, viremia was calculated as the mean of predicted probabilities of being viremic. All outcomes were calculated or predicted by gender (men and women) and age group (15–24 and 25–49 years).
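The outcome definitions above can be written out as a minimal sketch. The function names are hypothetical, and the threshold follows the Methods wording (1000 copies/mL or more). Nothing here reproduces the authors' code.

```python
from typing import Optional, Sequence

VIREMIA_THRESHOLD = 1000.0  # copies/mL, as defined in the text


def is_viremic(hiv_positive: bool, viral_load: Optional[float]) -> bool:
    """Individual viremia: viral load of 1000 copies/mL or more."""
    if not hiv_positive:
        viral_load = 0.0  # seronegative residents are assigned a viral load of zero
    if viral_load is None:
        raise ValueError("viral load is required for seropositive residents")
    return viral_load >= VIREMIA_THRESHOLD


def observed_prevalence(flags: Sequence[bool]) -> float:
    """Observed sample: proportion of residents who have the outcome."""
    return sum(flags) / len(flags)


def predicted_prevalence(probabilities: Sequence[float]) -> float:
    """Excluded residents: mean of the model-predicted probabilities."""
    return sum(probabilities) / len(probabilities)


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    flags = [is_viremic(True, 25000.0), is_viremic(True, 40.0),
             is_viremic(False, None), is_viremic(False, None)]
    print(observed_prevalence(flags))
    print(predicted_prevalence([0.02, 0.10, 0.01]))
```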
Predictors
Census data were used to assess potential predictors of HIV seroprevalence and viremia. Individual predictors included gender, 5-year age group (starting at 15–19), marital status, relationship to the household head, residence status (permanent, >6 months spent at household/year or transient, <6 months spent at household/year) and new in-migration (migrated into the study area between the last and current survey round). Household-level predictors included region of residence, SES, land ownership, household size, gender of the household head and youth dependency ratio. SES was measured using a weighted nine-item asset-based index based on the availability of household-level assets like radio, electricity and latrines.20 Youth dependency ratio was defined as the number of household members under age 18 divided by the number of members aged 18 and above.31
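As a worked illustration of the youth dependency ratio defined above, the following sketch computes it from a list of household member ages. The function name is hypothetical. The source does not say how households with no adult members were handled, so returning `None` in that case is an assumption.

```python
from typing import Optional, Sequence


def youth_dependency_ratio(ages: Sequence[int]) -> Optional[float]:
    """Members under 18 divided by members aged 18 and above."""
    youth = sum(1 for age in ages if age < 18)
    adults = sum(1 for age in ages if age >= 18)
    if adults == 0:
        # Assumption: the ratio is treated as undefined without adult members.
        return None
    return youth / adults


if __name__ == "__main__":
    # Two adults and three children give a ratio of 3 / 2.
    print(youth_dependency_ratio([34, 31, 12, 9, 4]))  # 1.5
```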
Descriptive analysis
We first measured the number of residents in new households and the number of new households in each survey round, excluding those who had out-migrated or died by round 19. We then characterised residents in new and old households using the predictors described above, with differences assessed using χ2 tests. Analyses were additionally stratified by age (15–24 and 25–49) and by round in which new household was identified on the census (pre-2018 census or during the 2018 census) (reported in online supplemental tables 1 and 2). Among residents in old households who participated in the survey, each predictor was assessed for its bivariate association with each outcome using a χ2 test of significance (reported in online supplemental table 3).
To assess whether new in-migrants in new households were different from new in-migrants in old households, the analysis was repeated among new in-migrants only. New in-migrants were individuals who have moved into the RCCS study area between the last survey round (August 2016–May 2018) and this survey round (May 2018–June 2020).
Prediction model
To predict seroprevalence and viremia among residents in new households, an ensemble prediction model was built for each outcome. Ensemble models often employ bagging, boosting or stacking approaches to combine multiple machine learners, maximising performance of the prediction model when little is known about which single learner best predicts a given outcome.32 Super Learner, a stacking-type ensemble prediction model, has been used in a variety of health outcome prediction models and is generally robust to model misspecification.33–36 The performance and architecture of Super Learner have been described elsewhere.37 38 Super Learner uses cross-validation to evaluate and weight the performance of multiple machine learning algorithms to build one ensemble model, thus maximising efficiency while minimising predictive error.39 In the first step, 10-fold cross-validation was used within each Super Learner to weight seven candidate algorithms according to their prediction performance, defined by the area under the receiver operating characteristic curve (AUC) (weights and descriptions of each algorithm are in online supplemental table 4). The candidate algorithms, implemented in R, were gam,40 biglasso,41 xgboost,42 glm and glm.interaction,43 bayesglm44 and ranger.45 We used the default parameters in R for each candidate algorithm. A second 10-fold cross-validation was then used to evaluate the performance of the overall ensemble model. Individuals sharing the same household ID were grouped in each ensemble model to account for clustering of observations.
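One common way to respect household clustering during cross-validation is to assign whole households, rather than individuals, to folds. The sketch below illustrates that idea in plain Python. It is not the authors' R implementation, and the function name, the round-robin assignment and the seed are assumptions made for illustration.

```python
import random
from typing import Hashable, List, Sequence


def grouped_folds(household_ids: Sequence[Hashable],
                  n_folds: int = 10,
                  seed: int = 0) -> List[int]:
    """Return one fold index per individual.

    Members of the same household always share a fold, so no household
    contributes to both the training and validation side of a split.
    """
    households = sorted(set(household_ids), key=str)
    rng = random.Random(seed)
    rng.shuffle(households)
    # Deal the shuffled households into folds round-robin.
    fold_of_household = {hh: i % n_folds for i, hh in enumerate(households)}
    return [fold_of_household[hh] for hh in household_ids]


if __name__ == "__main__":
    ids = ["H1", "H1", "H2", "H3", "H3", "H3", "H4"]
    print(grouped_folds(ids, n_folds=2))
```

Splitting at the household level avoids a situation in which correlated household members appear in both the training and validation data, which could make cross-validated performance look better than it is.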
The model was used to predict seroprevalence and viremia probabilities for all residents in new households, and then for men, women, young people aged 15–24 and adults aged 25–49. The predicted prevalence in each of these groups was compared with the empirical prevalence in the observed sample. The predicted prevalence among residents of new households was multiplied by the size of the group to obtain the predicted number of residents who were HIV positive or viremic. These numbers were combined with the number HIV positive or viremic in the observed sample to calculate the total seroprevalence and viremia prevalence for residents in both new and old households.
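The pooling step described above amounts to a case-weighted average. The sketch below applies it to the rounded viremia figures reported in this article. Two assumptions are made: that the observed-sample denominator is the 11 942 residents present at the time of the survey (the text does not state the denominator explicitly), and that rounded percentages stand in for the underlying case counts. The function name is hypothetical.

```python
def pooled_prevalence(observed_n: int, observed_prev: float,
                      excluded_n: int, excluded_pred_prev: float) -> float:
    """Combine observed and predicted cases into one prevalence estimate."""
    observed_cases = observed_n * observed_prev          # cases in the observed sample
    predicted_cases = excluded_n * excluded_pred_prev    # predicted cases among excluded
    return (observed_cases + predicted_cases) / (observed_n + excluded_n)


if __name__ == "__main__":
    # Assumed denominator: 11 942 observed residents; 2920 excluded residents.
    total_viremia = pooled_prevalence(11_942, 0.017, 2_920, 0.037)
    print(f"{total_viremia:.1%}")  # 2.1%
```

Under these assumptions the result agrees with the reported overall viremia estimate of 2.1%. Small discrepancies for other outcomes would be expected because the inputs here are rounded.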
To account for uncertainty in each individual’s probability assignment for being HIV positive or viremic, each resident’s unknown outcome was determined by simulating a Bernoulli trial with probability equal to the model’s predicted probability of the outcome. To account for uncertainty due to both sampling and individual-level outcome assignment, the analysis was repeated on 100 bootstrapped samples of residents in new households. We used the resulting prediction intervals for the excluded sample to represent uncertainty in predicted seroprevalence and viremia estimates. We additionally estimated 95% CIs of the observed seroprevalence and viremia values using binomial probabilities of seropositivity and viremia among 100 bootstrapped samples of residents in old households to represent seroprevalence and viremia in the underlying target population of residents in old households.
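The simulation described above can be sketched as follows: each bootstrap replicate resamples residents with replacement and then draws a Bernoulli outcome for each resampled resident. This is an illustration under stated assumptions, not the authors' code. In particular, the source does not say how the interval was derived from the 100 replicates, so the simple percentile rule used here is an assumption, as are the function name and the seed.

```python
import random
from typing import List, Sequence, Tuple


def simulated_interval(predicted_probs: Sequence[float],
                       n_boot: int = 100,
                       seed: int = 0) -> Tuple[float, float]:
    """Approximate 95% interval for predicted prevalence."""
    rng = random.Random(seed)
    n = len(predicted_probs)
    estimates: List[float] = []
    for _ in range(n_boot):
        # Sampling uncertainty: resample residents with replacement.
        sample = [predicted_probs[rng.randrange(n)] for _ in range(n)]
        # Assignment uncertainty: one Bernoulli trial per resampled resident.
        cases = sum(1 for p in sample if rng.random() < p)
        estimates.append(cases / n)
    estimates.sort()
    # Assumed percentile rule over the sorted replicate estimates.
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(0.975 * (n_boot - 1))]
    return lower, upper


if __name__ == "__main__":
    # Toy predicted probabilities, invented for illustration only.
    probs = [0.02, 0.05, 0.30, 0.10, 0.01] * 100
    print(simulated_interval(probs))
```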
Results
There were 24 729 census-eligible adults, of whom 2920 (12%) were new household residents and were thus excluded from the RCCS survey (figure 1); 21 208 (86%) of censused individuals were old household residents, of whom 17 339 (82%) were eligible to participate in the survey; among them, 11 942 (69%) were present at the time of the survey. Residents who were not eligible to participate were either minors without parental consent, incapacitated or had already been seen in another household. Of residents in new households, 1358 (47%) were new in-migrants compared with 4139 (20%) of those in old households.
Since 2009, the number of residents in new households increased from 90 to 2920, with the steepest increase occurring between 2017 and 2020 (online supplemental figure 1). As of 2019 (survey round 19), 2108/12 644 (17%) households and 2920/24 729 (12%) residents were excluded from the RCCS survey.
When comparing residents in new households to residents in old households, the two groups differed with regard to all variables except for gender (table 1). On average, residents in new households were more likely than those in old households to be middle-aged, married, the head of household, middle-SES, renters versus owners and from small households with fewer youth dependents.
When comparing new in-migrants in new households to new in-migrants in old households, the two groups differed with regard to all variables except for marital status and gender of the household head (table 2). New in-migrants in new households were more likely to be men (48.1%) compared with those in old households (41.3%). Compared with those in old households, the new in-migrants in new households were also more likely to be aged 20–29, the household head, renters versus owners and of smaller households (1–2 members) with no dependents.
Predicted HIV seroprevalence and viremia among residents in new households
HIV seroprevalence
The AUC of the seroprevalence prediction model was 0.78 (0.73 to 0.81) (online supplemental table 4). Overall, predicted seroprevalence for new household residents was 10.8% (95% CI: 9.6% to 11.8%) compared with 11.7% (95% CI: 11.2% to 12.3%) in the observed sample (figure 2; data in online supplemental table 5). Inclusion of new household residents in the total estimate for seroprevalence did not significantly change seroprevalence estimates for the total population (11.6% (95% CI: 10.9% to 12.2%)). However, among young people aged 15–24, new household residents had a significantly higher seroprevalence (4.9% (95% CI: 3.6% to 6.1%)) compared with old household residents (2.6% (95% CI: 2.2% to 3.1%)). The opposite was true for older adults aged 25–49, in which new household residents had a significantly lower seroprevalence (15.0% (95% CI: 13.3% to 16.4%)) compared with old household residents (17.2% (95% CI: 16.4% to 18.1%)). While seroprevalence estimates were significantly different between new and old household residents in both age groups, the inclusion of those in new households in total seroprevalence estimates did not significantly alter current estimates.
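For readers less familiar with the AUC reported above, it can be read as the probability that a randomly chosen resident with the outcome receives a higher predicted probability than a randomly chosen resident without it, with ties counted as one half. A minimal pairwise sketch follows. The function name and toy data are illustrative, and this is not the evaluation code used in the study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Three of the four positive/negative pairs are ordered correctly.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking.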
Viremia
The AUC for the viremia prediction model was low relative to that of the seroprevalence prediction model at 0.71 (0.65 to 0.75) (online supplemental table 4). Inclusion of new household residents, with a predicted viremia prevalence of 3.7% (95% CI: 3.0% to 4.5%), resulted in a 24% higher total viremia prevalence (2.1% (95% CI: 1.8% to 2.4%)) for both new and old household residents compared with the observed sample (1.7% (95% CI: 1.5% to 1.9%)) (figure 3; data in online supplemental table 5). Though viremia among residents of new and old households was significantly different, this difference did not significantly alter current viremia estimates. Similar findings were observed for all subgroups by age and gender, with the comparatively highest viremia estimate predicted for young people aged 15–24 (1.2% (95% CI: 0.9% to 1.7%)) in the total population—50% higher than the viremia prevalence of 0.8% (0.6% to 1.1%) among residents in old households.
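The relative differences quoted above follow from the rounded prevalence values, as this short check shows. The helper name is hypothetical, and results computed from unrounded estimates could differ slightly.

```python
def relative_increase(new: float, reference: float) -> float:
    """Relative change of `new` over `reference`, as a fraction."""
    return (new - reference) / reference


if __name__ == "__main__":
    # All ages: 2.1% in the total population vs 1.7% in the observed sample.
    print(round(100 * relative_increase(2.1, 1.7)))  # 24
    # Ages 15-24: 1.2% in the total population vs 0.8% in old households.
    print(round(100 * relative_increase(1.2, 0.8)))  # 50
```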
Discussion
Longitudinal population-based cohorts have contributed immensely to the HIV response in Africa, but rapid population growth poses significant challenges to open cohorts like the RCCS. While there were few new households in 2008 when they were first excluded, the population in new households has expanded with time. In assessing the potential bias introduced by excluding residents of new households, this study showed that the population in new households substantively differs from the included surveillance sample on various sociodemographic characteristics. These differences did not significantly change overall population-level HIV seroprevalence or viremia estimates, suggesting minimal bias at a population level. However, significant differences in viremia and age-specific seroprevalence estimates between new and old household residents warrant further examination.
Currently, the exclusion of residents in new households in the RCCS does not impact overall distributions of sociodemographic characteristics, but the growing number of residents in new households could significantly change the underlying population structure of the RCCS in future years. If the population in new households continues to grow, and this growing population maintains a similar demographic profile of being majority married, middle-aged and with small household sizes, the survey-eligible population may become increasingly unrepresentative of the target population. This ‘new’ population may represent an important demographic group in Uganda: young middle-class married couples or families starting up new households in new geographic areas. Even if overall HIV estimates remain representative of the target population, the decision to expand the sampling frame could be made on the basis of demographic representation.
Considering that all residents in new households were at one point in-migrants (the majority of whom in-migrated to the study area after 2016), biases in the age-specific seroprevalence estimates may illustrate differences among migrant subgroups. Different types of migrants have different levels of HIV risk. Among 15–24 year-olds, there may be a migrant health penalty for those arriving in new households because their seroprevalence was higher than that of the included sample. Young people in Rakai who migrate have been shown to engage in higher HIV risk behaviours compared with their non-migrant counterparts.46 These behaviours often coincide with the transition from school to work, when many mobile young people take up occupations associated with HIV risk, such as boda-boda driving, truck driving, and bar and restaurant work.47 Among 25–49 year-olds, the opposite seems to be true: there may be a healthy migrant effect for those arriving in new households because their estimated seroprevalence was lower than that of the included sample. Since many adults aged 25–49 in new households were married men and the head of household, it may be that this group of in-migrants travelled for family or greater economic opportunity and not due to circumstances that independently drive HIV risk.15 16 48
Unlike seroprevalence, viremia rates are projected to be consistently higher across all subgroups in new households, compared with old households, thereby modestly, though not significantly, increasing the rate of viremia in the total population. This finding is consistent with that of other cohorts in SSA,49 and provides evidence for the migrant health penalty, which posits that individuals who migrate experience intrinsic risks that heighten their risk for poor HIV outcomes.17 Mobile populations may face structural barriers to care that lead to treatment interruptions,18 50 which increases risk of viremia and other poor treatment outcomes.51 52 Furthermore, studies suggest that areas with higher-than-average population mobility also demonstrate high community-level viremia, for example, among both residents and non-residents in east African border towns53 and along transport corridors in Namibia.54 However, the impact of viremia among in-migrants on viremia in the community at large should be studied using longitudinal data because high rates of viremia among in-migrants may cancel out with equally high rates of viremia among out-migrants.49 Thus, it may be important to include in-migrants in the sampling frame of ongoing longitudinal cohorts: if in-migrants are systematically excluded from cohorts and they are less likely to be virally suppressed, local HIV prevention programmes may miss an important population whose linkage to care would help to stem onward HIV transmission. To assess bias in the indicators that monitor progress towards 95–95–95 targets, such as viral load suppression,55 future research could assess not just population-level viremia among populations excluded from surveys but also viremia among excluded people living with HIV, if serostatus among individuals in the excluded population is known.
Furthermore, excluding certain in-migrants from surveillance programmes could mean that the wrong inferences are made about the health state and risks for migrant populations. For example, cohorts that only include in-migrants with stronger social ties (eg, new members of existing households) might produce evidence for the migrant experience that does not represent transient, young adult, or working-class migrants. In the case of the RCCS, the cohort may underrepresent migrant men, given that migrant men were more likely to be living in new household structures than old household structures. Since migrant men are more likely to be viremic,56 it is possible that viremia among migrants in the RCCS is higher than would be estimated with current data.
Our study had several limitations. First, the prediction model performance was moderate for seroprevalence and relatively weaker for viremia, which has been observed in other HIV ensemble modelling studies.57 This may be due to the relative lack of data on new residents: for example, sexual behaviours like age-disparate sexual partnerships predict seroprevalence,58 and health behaviours like alcohol use predict viremia,59 and these data were not available for residents of new households. Nonetheless, many of the included predictors have been demonstrated to reliably predict HIV outcomes,56 60 61 and a sub-analysis of RCCS survey data confirmed this (online supplemental table 3). Second, this analysis assumes that the risk factors for HIV seroprevalence and viremia are the same for residents of new households as for residents of old households, but this assumption is difficult to prove. Third, the observed sample may be a biased representation of the target population because some eligible participants were away at the time of the survey. While it was not the aim of this paper to estimate seroprevalence and viremia among the full target population (only to describe the currently observed sample), this is another potentially important source of bias that should be evaluated in future studies. Finally, it is possible that we underestimated or overestimated seroprevalence or viremia among residents in new households because we do not know who would have been present and survey-eligible had the household been included in the survey. In the survey sample, residents in old households are sometimes excluded from the survey because they have died, are incapacitated at the time of the survey, or refuse to respond. If HIV outcomes differ for eligible and ineligible participants, then this could partially explain results.
Conclusions
While we did not find substantial potential biases from excluding new household residents in the RCCS, our results show this population is growing and so observed differences may become more problematic over time. To address potential future biases, longitudinal surveys may choose to better reflect the target population either by expanding the sampling frame or re-weighting the survey-eligible population.62 63 The latter option may be more cost-efficient, but it assumes that the demographic group included in the survey shares the same risk of HIV acquisition and viremia as their excluded counterparts. A substudy with a sample of the excluded in-migrants could be conducted to test this assumption.64
In the context of rapid population growth, open cohorts in SSA may seek to ensure that sampling frames reflect the growing target population. However, the resources required for any changes may not be met by fixed or decreasing budgets. In the context of HIV funding shortfalls,65 HIV-focused cohorts must make difficult decisions. While expanding the sampling frame to cover more of the target population can improve representation of in-migrants, such an effort may divert resources away from effective follow-up of previously enrolled individuals, which can worsen representation of those who are more likely to be lost to follow-up, like out-migrants. These risks must be carefully considered before opting for a survey redesign. Longitudinal population-based cohorts provide critical evidence for the HIV response, and so longitudinal sampling efforts must be preserved. Cohorts should routinely monitor demographic changes in the target population and be transparent about any potential biases introduced by these changes or resulting modifications to study design. Lastly, results from studies that prioritise representativeness should be triangulated with those from studies that maintain longitudinal follow-up to improve overall understanding of the African HIV epidemic.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by the Uganda National Council for Science and Technology (approval number HS 540), the Uganda Virus Research Institute Research and Ethics Committee (approval number GC/127/08/12/137), the Johns Hopkins Institutional Review Board (approval number IRB-00217467) and the Columbia University Institutional Review Board (approval number IRB-AAAR5428). Participants gave informed consent to participate in the study before taking part.
Footnotes
Contributors All authors conceptualised the paper, contributed to the investigation and methodology, and reviewed and edited the manuscript. AK led the formal analysis, the visualisation and writing of the original draft. JL, JS, MKG and LWC supported the formal analysis and JL, MKG and LWC supported the writing of the original draft. RS, MW, JSS, SH, FN, TL, AN, JS, GK, JK, LWC and MKG contributed equally to data curation and project administration. As guarantor, AK is responsible for the overall content and conduct of the study. AK had access to the data and controlled the decision to publish.
Funding This work was supported by the National Institute of Allergy and Infectious Diseases (R01AI143333 and R01AI155080) and the National Institute of Mental Health (R01MH115799). The findings and conclusions in this article are those of the authors and do not necessarily represent the official position of the funding agencies. Research by Aleya Khalifa reported in this publication was supported by the National Institute of Allergy and Infectious Diseases (T32AI114398). Larry Chang was supported by the National Heart, Lung, and Blood Institute (R01HL152813), Fogarty International Center (D43TW010557) and the Johns Hopkins University Center for AIDS Research (P30AI094189). Susie Hoffman and John Santelli were supported by the U.S. National Institute of Child Health and Human Development (NICHD) (R01HD091003; Santelli, PI). Susie Hoffman was also supported by the National Institute of Mental Health (P30-MH43520; Remien, PI).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.