Geospatial patterns of human papillomavirus vaccine uptake in Minnesota
  1. Erik J Nelson1,
  2. John Hughes2,
  3. J Michael Oakes3,
  4. James S Pankow3,
  5. Shalini L Kulasingam3
  1. 1Department of Epidemiology, College for Public Health and Social Justice, Saint Louis University, St. Louis, Missouri, USA
  2. 2Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
  3. 3Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
  1. Correspondence to Dr Erik J Nelson; nelsonej{at}


Objectives To identify factors associated with human papillomavirus (HPV) vaccination and to determine the geographic distribution of vaccine uptake while accounting for spatial autocorrelation.

Design This study is cross-sectional in design using data collected via the Internet from the Survey of Minnesotans About Screening and HPV study.

Setting and participants The sample consists of 760 individuals aged 18–30 years nested within 99 ZIP codes surrounding the downtown area of Minneapolis, Minnesota.

Results In all, 46.2% of participants had received ≥1 dose of HPV vaccine (67.7% of women and 13.0% of men). Prevalence of HPV vaccination was found to exhibit strong spatial dependence across ZIP codes. Accounting for spatial dependence, age (OR=0.76, 95% CI 0.70 to 0.83) and male gender (OR=0.04, 95% CI 0.03 to 0.07) were negatively associated with vaccination, while liberal political preferences (OR=4.31, 95% CI 2.32 to 8.01) and college education (OR=2.58, 95% CI 1.14 to 5.83) were found to be positively associated with HPV vaccination.

Conclusions Strong spatial dependence and heterogeneity of HPV vaccination prevalence were found across ZIP codes, indicating that spatial statistical models are needed to accurately identify and estimate factors associated with vaccine uptake across geographic units. This study also underscores the need for more detailed data collected at local levels (eg, ZIP code), as patterns of HPV vaccine receipt were found to differ significantly from aggregated state and national patterns. Future work is needed to further pinpoint areas with the greatest disparities in HPV vaccination and how to then access these populations to improve vaccine uptake.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Strengths and limitations of this study

  • This is the first study to identify factors associated with human papillomavirus (HPV) vaccination at the ZIP code level using statistical models that account for spatial dependence.

  • Study strengths include the large representative sample of 18–30-year olds in Minneapolis, Minnesota, adjustment for factors known to be associated with HPV vaccination, and the use of robust spatial statistical models.

  • This study reveals a gap between local estimation of HPV vaccination and estimates from large national surveillance programmes.

  • Potential limitations include the reliance on self-reported data collected via the Internet, selection bias and the absence of information regarding study participants’ age at vaccination and income.


Human papillomavirus (HPV) is the most common sexually transmitted infection in the USA,1 and is the necessary cause of cervical cancer.2 HPV infections are also associated with other cancers (eg, anogenital and oropharyngeal) as well as genital warts.3 ,4 Since mid-2006, the Advisory Committee on Immunisation Practices (ACIP) has recommended routine vaccination of adolescent girls aged 11 or 12 years with the three-dose HPV vaccine series.5 In October 2011, the ACIP extended its recommendation of the quadrivalent vaccine to include boys aged 11 or 12 years.6 ,7 The ACIP also recommends catch-up vaccination for those aged 13–26 years. However, HPV vaccination uptake has been far lower than expected, with only 57.3% of girls and 34.6% of boys aged 13–17 years and 36.9% of women and 5.9% of men aged 19–26 years receiving at least one dose of the vaccine as of 2013.8 ,9 Despite lower than anticipated vaccine uptake, recently published HPV vaccine serosurvey results show significant reductions in HPV prevalence,10–12 and reductions in HPV-associated cancer incidence of approximately 70% are predicted in the coming decades.13 ,14

Initiation of the HPV vaccine (ie, receiving at least one dose) has been shown to be higher among minority adolescent girls; however, completion of the three-dose series is substantially lower among African-Americans and Hispanics compared to Caucasians.15 Although male vaccination data are very limited, racial and income differences have also been observed among adolescent boys.16 Disparities in receipt of the HPV vaccine have also been found to be associated with insurance covering the costs of the vaccine, clinical provider characteristics (eg, age of physician, paediatricians and physicians with a private medical practice), poverty and parental perceptions of the HPV vaccine.16–22

Research on the geographic variability of HPV vaccination is limited, and has relied on data collected from large national surveillance programmes to estimate uptake at the state or county levels.23–25 These national data on geographic variation in HPV vaccine uptake may mask a considerable amount of variability at the local (eg, county, census tract, or ZIP code) level. Further, a major limitation of these geographic studies is that they do not account for the areal units from which geographically-defined data are collected, commonly referred to as the spatial structure of the data. Data collected in this manner typically exhibit spatial dependence (also referred to as spatial autocorrelation), with observations from areal units close together tending to have similar values.26 Although a proportion of spatial dependence may be modelled by including known covariate risk factors (eg, age, race and sex) in a traditional (non-spatial) regression model, spatial structure commonly remains in the residuals even after accounting for these covariate effects.26 For example, one study noted several individual factors that were associated with receipt of HPV vaccination, including geographic region of residence; however, the authors used only a categorical variable to account for geographic differences in uptake.27 Another study that analysed geographic variation in HPV vaccine uptake used a weighting scheme to account for dependence between study participants, but ignored the spatial dependence of respondents in neighbouring geographic regions.23 Thus, these studies inherently assume that factors associated with HPV vaccine uptake are homogeneous across areal units such as states or counties.
Documenting geographic variation in vaccine disparities at local levels may help to identify specific areas with the largest disparities in HPV vaccine uptake (after accounting for spatial dependence) thereby informing outreach efforts, and may also provide new hypotheses regarding the underlying determinants of geographic patterns in uptake.

The objective of this study was to use HPV vaccination data measured at the ZIP code level to identify geographic variation in vaccine uptake, and to identify factors associated with the receipt of HPV vaccination while accounting for spatial dependence.



This study utilised data collected on 1003 participants from the Survey of Minnesotans About Screening and HPV (SMASH) study, which is a cross-sectional study of English-speaking men and women aged 18–30 years from the Twin Cities Metropolitan Area of Minnesota, and has been described elsewhere.28 Briefly, from November 2012 to January 2013, targeted advertisements were displayed on the social networking site Facebook to men and women who met the study eligibility criteria (as specified in their user profiles). Men and women who clicked on a study advertisement were redirected to the secured SMASH study website and invited to participate in an online survey. After providing consent, participants were asked to answer questions regarding HPV vaccination, cancer screening, and barriers/intentions regarding receipt of either.

The response to the question ‘Have you ever received an HPV vaccine?’ was used as the current study's outcome variable for HPV vaccination status. Individuals (n=128) who responded ‘don't know’, who refused, or who did not respond to this question were excluded from the study. Similarly, individuals who did not report their ZIP code (n=3), or who reported a ZIP code outside of the predetermined 25-mile radius of downtown Minneapolis, Minnesota (n=112) were excluded from the study in order to focus on this diverse metropolitan population. The resulting study sample consisted of 760 (75.8% of total enrolled) men and women nested within 99 ZIP codes surrounding downtown Minneapolis (see figure 1).

Figure 1

The spatial distribution of 760 survey responses across the Twin Cities Metropolitan Area of Minnesota.

Spatial data analysis

We tested for spatial autocorrelation in the crude HPV vaccination uptake rates using choropleth maps and Moran's I.29 Positive (negative) values of I indicate positive (negative) spatial correlation, meaning that nearby ZIP codes tend to exhibit similar (dissimilar) HPV vaccine uptake rates. The spatial adjacency of the data was defined in three different ways: rook contiguity, queen contiguity and using the five nearest neighbours. Model results did not vary substantially by the neighbourhood definition; therefore the queen contiguity structure was selected for the subsequent analyses.
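As an illustration of the statistic described above, the following is a minimal sketch of Moran's I under a binary queen contiguity structure. It assumes a toy regular grid in place of real ZIP code polygons; the function names `queen_weights` and `morans_i` and the synthetic uptake values are hypothetical and are not taken from the study.

```python
import numpy as np


def queen_weights(n_rows, n_cols):
    """Binary queen-contiguity matrix for a regular grid (toy stand-in for ZIP codes).

    Two cells are neighbours if they share an edge or a corner.
    """
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if (dr, dc) != (0, 0) and inside:
                        w[i, rr * n_cols + cc] = 1.0
    return w


def morans_i(values, w):
    """Moran's I = (n / sum(w)) * (z' W z) / (z' z), with z the centred values."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)


# Synthetic uptake proportions that increase from west to east on a 4 x 4 grid.
w = queen_weights(4, 4)
uptake = np.tile(np.linspace(0.2, 0.8, 4), 4)
print(morans_i(uptake, w))  # positive: neighbouring cells have similar uptake
```

With real data, the weights matrix would be built from ZIP code boundaries, and a permutation test would normally accompany the point estimate.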

Spatially dependent data violate the independence assumption required for generalised linear models. As such, ignoring the dependence of spatial data can lead to an underestimation of SEs, resulting in overly narrow CI estimates and, consequently, incorrect statistical inference.30 To account for residual dependence the linear predictor can be augmented with a spatial random effect, as part of a Bayesian hierarchical model.31 These random effects typically take the form of a conditional autoregression (CAR), which introduces spatial dependence through the adjacency structure of areal units.31 CAR models are generally applied in a Bayesian setting, where inference is based on Markov-chain Monte Carlo (MCMC) simulation.32
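To make the role of the adjacency structure concrete, the sketch below builds the precision matrix of a proper CAR random effect under one common parameterisation, Q = tau (D - rho W), where W is the binary adjacency matrix and D holds each area's neighbour count. This parameterisation, the names `car_precision` and `conditional_moments`, and the four-area toy map are assumptions made for illustration; the authors' exact specification may differ.

```python
import numpy as np


def car_precision(w, rho, tau):
    """Precision matrix tau * (D - rho * W) of a proper CAR random effect (assumed form)."""
    d = np.diag(w.sum(axis=1))  # D: number of neighbours of each area
    return tau * (d - rho * w)


def conditional_moments(s, w, j, rho, tau):
    """Mean and variance of s[j] given all other areas under the assumed CAR prior."""
    m_j = w[j].sum()                  # number of neighbours of area j
    mean = rho * (w[j] @ s) / m_j     # rho times the average of the neighbours
    var = 1.0 / (tau * m_j)           # more neighbours imply a tighter prior
    return mean, var


# Toy map: four areas arranged in a line (1-2-3-4).
w = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
q = car_precision(w, rho=0.9, tau=2.0)
s = np.array([0.5, -0.2, 0.1, 0.3])
print(conditional_moments(s, w, j=1, rho=0.9, tau=2.0))
```

For |rho| < 1 and every area having at least one neighbour, this matrix is positive definite, which is what makes the prior "proper".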

To accommodate the potential spatial dependence of HPV vaccination, we implemented a spatial logistic regression model using ZIP code as the areal unit of analysis. To accomplish this, assume $Y_j$ is the number of respondents who were vaccinated against HPV out of the total $N_j$ sampled in each ZIP code $j$. The outcome can be modelled as a binomial response $Y_{ij} \sim \mathrm{Bin}(p_{ij}, N_{ij})$ such that $p_{ij}$ is the true vaccine uptake proportion of individual $i$ within a selected ZIP code $j$. The proportions were smoothed using the following model,

$$\operatorname{logit}(p_{ij}) = \alpha + X_{ij}\beta + s_j \qquad (1)$$

where $\alpha$ is an intercept, which is interpreted as an overall log-odds coverage for all areas; $\beta$ are the effects of the covariates $X_{ij}$ in the model; and the $s_j$ are spatially dependent random effects, such that neighbouring areas have a similar vaccine uptake proportion. The parameter $\rho$ (Rho) reflects the spatial dependence inherent in the data, measuring the average influence of a given ZIP code on neighbouring ZIP codes.31 ,33 ,34 Borrowing information from neighbouring ZIP codes to inform the estimate for each ZIP code creates sufficient statistical power to generate reliable estimates, even when the sample size within a ZIP code is small.35 This is achieved by assuming a proper CAR prior, defined as $N(\bar{s}_{j|k}, 1/(\tau_s m_j))$, where $\bar{s}_{j|k}$ is the pooled mean of area $j$, based on the adjacent areas $k$, $m_j$ is the number of ZIP codes neighbouring $j$, and $\tau_s$ is the precision that controls the amount of smoothing.36 ,37 By convention, the intercept and regression coefficients were assigned a conservative normal prior with a mean of 0 and a SD of 1 000 000. Estimation of the model parameters was carried out with MCMC simulation techniques that were implemented in R V.3.0.1 (R Development Core Team, 2014). Model convergence was monitored using a Monte Carlo SE threshold of 0.1.38 For this analysis, a total of 1 000 000 posterior samples were generated.
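The convergence criterion above relies on the Monte Carlo SE of a posterior mean. One widely used estimator is the method of batch means, sketched below. The source does not state which estimator was used, so this choice, the function name `batch_means_mcse`, and the synthetic autocorrelated chain are all assumptions for illustration.

```python
import numpy as np


def batch_means_mcse(chain, n_batches=None):
    """Monte Carlo SE of the chain mean, estimated by non-overlapping batch means."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    if n_batches is None:
        n_batches = int(np.floor(np.sqrt(n)))  # common default: about sqrt(n) batches
    batch_size = n // n_batches
    trimmed = x[: n_batches * batch_size]       # drop leftover draws at the end
    means = trimmed.reshape(n_batches, batch_size).mean(axis=1)
    # The variance of the batch means, divided by the number of batches,
    # estimates the variance of the overall mean.
    return np.sqrt(means.var(ddof=1) / n_batches)


# Synthetic AR(1) chain standing in for MCMC output of one parameter.
rng = np.random.default_rng(0)
n_draws, phi = 10_000, 0.9
chain = np.empty(n_draws)
chain[0] = 0.0
for t in range(1, n_draws):
    chain[t] = phi * chain[t - 1] + rng.normal()

mcse = batch_means_mcse(chain)
naive_se = chain.std(ddof=1) / np.sqrt(n_draws)  # ignores autocorrelation
print(mcse, naive_se, mcse < 0.1)
```

Because successive MCMC draws are correlated, the batch-means estimate is typically larger than the naive SE that treats the draws as independent. Comparing it with a fixed threshold (0.1 in this study) is one way to judge whether enough samples were generated.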

All statistical models included a priori factors potentially associated with HPV vaccine uptake, including sex (categorised as male or female), age (mean-centered), race (categorised as Caucasian, African-American, American Indian/Alaska native, Asian or other), ethnicity (categorised as Hispanic or non-Hispanic), educational attainment (categorised as some high school, high school graduate, some college or technical school, college graduate or graduate school), sexual orientation (categorised as heterosexual, homosexual/gay/lesbian or bisexual), and political views (categorised as very conservative or conservative, moderate, liberal or very liberal). Initially, the model was fit maintaining all the variables. The final model retained all covariates that were statistically significant at p<0.05. ORs and the associated 95% credible intervals are presented. The random effect terms can be interpreted as the effect of ZIP code on HPV vaccination uptake for each individual.
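In a Bayesian logistic model, an OR and its 95% credible interval can be obtained by exponentiating quantiles of the posterior draws of a coefficient. The sketch below shows that step. It assumes the posterior median is used as the point estimate (the source does not say whether the median or mean was reported), and the function name `odds_ratio_summary` and the synthetic draws are hypothetical, not the study's posterior.

```python
import numpy as np


def odds_ratio_summary(beta_samples, level=0.95):
    """Return (OR, lower, upper) from posterior draws of a log-odds coefficient."""
    b = np.asarray(beta_samples, dtype=float)
    tail = (1.0 - level) / 2.0
    lo, med, hi = np.quantile(b, [tail, 0.5, 1.0 - tail])
    # Quantiles are preserved under the monotone exp transform, so the
    # equal-tailed interval on the log-odds scale maps directly to the OR scale.
    return np.exp(med), np.exp(lo), np.exp(hi)


# Synthetic posterior draws centred on log(2), for illustration only.
rng = np.random.default_rng(42)
draws = rng.normal(loc=np.log(2.0), scale=0.4, size=20_000)
or_hat, or_lo, or_hi = odds_ratio_summary(draws)
print(or_hat, or_lo, or_hi)
```

An interval that excludes 1 corresponds to the informal "statistically significant" reading used in the text.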


Characteristics of the study sample are presented in table 1. In all, 46.2% of participants had received at least one dose of HPV vaccine, with 67.7% of women reporting having been vaccinated compared to 13.0% of men. Of those who initiated the vaccine series, 71.1% completed the entire three-dose series (79.6% of women and 26.3% of men). Participants who had been vaccinated against HPV (ie, received ≥1 dose of the vaccine) were younger (30.1% of those ≥25 years were vaccinated compared to 69.9% of those <25 years). Vaccine receipt was lower among those who identified themselves as politically ‘conservative’ or ‘very conservative’ as opposed to politically ‘liberal’ (24.6% compared to 53.3%).

Table 1

Characteristics of study participants by HPV vaccination status

HPV vaccination was found to exhibit strong spatial dependence. The CAR model also successfully converged, as the maximum Monte Carlo SE was 0.028 (which was below our threshold of 0.1), indicating that a sufficient number of posterior samples were generated for the estimates to stabilise. Estimates for the best-fitting CAR model are shown in table 2. After accounting for spatial dependence using the CAR model, age, sex, education, and political preferences remained significantly associated with HPV vaccine receipt. Specifically, older age (OR=0.77 per year, 95% CI 0.72 to 0.83) and being male (OR=0.03, 95% CI 0.02 to 0.06) were associated with a decreased odds of HPV vaccine receipt. Higher educational attainment (referent to receiving some high school or high school graduates) was associated with an increased odds of HPV vaccine receipt (some college OR=2.58, 95% CI 1.14 to 5.83; college graduate OR=3.93, 95% CI 1.66 to 9.30; graduate degree OR=4.74, 95% CI 1.71 to 13.17). Moderate and liberal political preferences (referent to very conservative and conservative preferences) were also associated with an increased odds of HPV vaccine receipt (moderate OR=3.24, 95% CI 1.62 to 6.49; liberal OR=5.32, 95% CI 2.68 to 10.58). Race was not found to be significantly associated with HPV vaccine uptake. For comparison, ORs (and corresponding 95% CIs) from a traditional logistic regression model that does not account for spatial dependence were also estimated and are also presented in table 2. Compared to the traditional logistic model, estimates from the CAR model were greater in magnitude for all covariates. Of note, in the traditional logistic regression model, having received some college education was not a statistically significant factor, but became significant in the CAR model (traditional OR=1.88, 95% CI 0.90 to 3.93; spatial CAR OR=2.58, 95% CI 1.14 to 5.83).

Table 2

OR estimates for factors associated with HPV vaccination from traditional logistic regression and spatial CAR models

Figure 2 shows a choropleth map of HPV vaccine uptake attributable to the CAR random effects in the CAR model. These values represent the spatial heterogeneity of HPV vaccine uptake conditional on population size and the factors included in the model. Heterogeneous HPV vaccine uptake is evident, in that a cluster of ZIP codes with lower uptake is concentrated in the downtown area (shown in light blue), with uptake increasing as distance from city centre increases (dark blue ZIP codes).

Figure 2

Uptake of the human papillomavirus vaccine that is attributable to the conditional autoregressive random effects in the spatial CAR model. CAR, conditional autoregression.


In this study, HPV vaccination was found to exhibit strong spatial dependence, indicating that spatial statistical models are needed to accurately identify and estimate factors associated with HPV vaccine uptake. The spatial analysis also revealed that ZIP codes tend to have HPV vaccine uptake rates that were similar to their neighbours. Ignoring this spatial dependence can lead to biased point estimates and overly narrow credible intervals. Consistent with other studies, younger age, female gender, higher education, and political views were found to be significantly associated with HPV vaccination (after accounting for spatial dependence).21 ,27 ,39 ,40 The associations of age and sex with HPV vaccine receipt can be attributed, in part, to the evolving ACIP recommendations, as they were first recommended for use in young girls and were later expanded to include young boys. Conservative political views have also been found to be associated with decreased knowledge of HPV, lower perceived risk of infection with HPV and stronger views against premarital sex.41

However, contrary to other studies that have not accounted for spatial dependence, this study found that race was not significantly associated with HPV vaccination.21 ,27 ,39 ,40 ,42 ,43 Racial disparities (and other disparities) have been shown to be pronounced in some areas, while less evident (or absent) in other areas.44–46 Although the existence of these disparities is well documented, the overall average effects (ie, national level data) can mask variation across local areas.47 ,48 For example, in a traditional regression analysis where minority girls live in regions with systematically different rates of HPV vaccine uptake, and the region is not controlled for, one could erroneously conclude that racial ‘disparities’ exist when in fact where people live (eg, the context of their neighbourhood) is the significant factor associated with vaccination. Thus, ignoring geography (ie, the spatial dependence of the data) may lead to incorrect inference. Previous studies that have attempted to describe geographic variation in HPV vaccine uptake have either ignored spatial dependence completely or have not correctly accounted for it using spatial statistical models.24 ,49 These studies may have incorrectly concluded that covariates such as race are significantly associated with HPV vaccine receipt when, in fact, these conclusions are likely to be erroneous because they are based on models that did not account for spatial dependence. As our analysis shows, using models that account for spatial dependence may greatly improve the identification of independent factors that are truly associated with HPV vaccination (as opposed to spatially confounded covariates), particularly when analysing data from varying geographic locations.

Previous studies have shown that HPV vaccination uptake exhibits significant geographic variability.23–25 ,27 HPV vaccine policies, availability, costs, poverty, financial assistance, and availability of education materials to promote uptake collectively contribute to this variability, as they vary widely across and within states.18 ,50 As a result, variation at state levels may not reflect the variation in HPV vaccine uptake occurring at a more local level. However, a more refined level of analysis was not possible in these studies because of the sparseness of data at the county and ZIP code level, which is in part attributable to national surveys aggregating or suppressing responses due to participant identification concerns. One strength of this study is that ZIP code level data were available to conduct a more detailed spatial analysis.

The proportion of all adults in this study who had been vaccinated against HPV (ie, received at least one dose of the HPV vaccine) was 46.2% (67.7% for women and 13.0% for men). These estimates are much higher than the HPV vaccine coverage estimates from the 2012 National Health Interview Survey (NHIS) for women (34.5%) and men (2.3%) aged 19–26 years.38 Although the results for women are more similar to those obtained from the National Immunisation Survey-Teen (NIS-Teen) for girls (53.8%), the estimate for men is much lower than the NIS-Teen estimate for boys (20.8%) aged 13–17 years who received at least one dose of HPV vaccine in 2012.39 Although the differences in the observed rates may be partially explained by the sampling frame, response rates, or the small number of eligible respondents who received the HPV vaccine question series in the national surveys, the estimates of HPV vaccine uptake are noticeably different from the current study.

There are several limitations to this study. First, all study measures were self-reported by persons over the Internet and may be subject to under-reporting or over-reporting. However, recent studies have shown recall of HPV vaccination status to be accurate.51 In addition, Internet-based studies have shown increased self-disclosure and reporting with online surveys, which may reduce potential response biases (eg, interviewer bias or social desirability).52 ,53 Second, analyses by race may have been underpowered due to small numbers; however, the distribution of racial groups was proportionate to estimates from the US Census for the study area.28 Similarly, we cannot rule out selection bias, although several procedures were utilised to obtain a representative sample.28 Third, this study used the age of participants at the time of the survey, not the age of participants at the time of vaccination, to assess differences by age. It should be noted that our objective was to estimate factors associated with the overall prevalence of vaccine uptake among young adults, not to estimate prevalence by age. Fourth, the spatial analyses were conducted at the ZIP code level and assume a common ZIP code level effect, so within-ZIP code differences may be masked. However, to the best of our knowledge, this is only the second study to examine HPV vaccination at such a small areal unit.48 Another limitation is that this study did not directly adjust for the income of participants, as this information was not available. However, accounting for spatial dependence in this study sample likely incorporates some of the variability for unmeasured factors such as income.54 Finally, this study utilises cross-sectional data and temporal effects cannot be established.

In conclusion, the results from this study demonstrate that more detailed and local assessments of HPV vaccine uptake that account for spatial dependence are necessary as ZIP code level patterns differ significantly from aggregated state and national patterns. Future work is needed to further pinpoint areas with the greatest disparities and how to then access these populations to improve vaccine uptake.



  • Contributors SLK and EJN conceived and designed the study. EJN also conducted the data collection for this study and drafted the manuscript. JH, JSP, and JMO assisted in the survey design, supervised the statistical analysis, and assisted in reviewing/revising the manuscript. SLK also provided contributions to the concept and analytic approach for the article, and oversaw the analysis, interpretation, and reviewing/revising of manuscript. All authors read and approved the final manuscript.

  • Funding This study was supported by the J.B. Hawley Student Research Award from the University of Minnesota School of Public Health and by the Minnesota Medical Foundation through Grant 4120-9227-12.

  • Competing interests None declared.

  • Ethics approval University of Minnesota Institutional Review Board.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Only SMASH study team members have full access to the raw data. Researchers interested in using SMASH study data may request permission directly from the authors and will be considered on a case-by-case basis.
