Objectives To evaluate associations of community types and features with new onset type 2 diabetes in diverse communities. Understanding the location and scale of geographic disparities can lead to community-level interventions.
Design Nested case–control study within the open dynamic cohort of health system patients.
Setting Large, integrated health system in 37 counties in central and northeastern Pennsylvania, USA.
Participants and analysis We used electronic health records to identify persons with new-onset type 2 diabetes from 2008 to 2016 (n=15 888). Persons with diabetes were age, sex and year matched (1:5) to persons without diabetes (n=79 435). We used generalised estimating equations to control for individual-level confounding variables, accounting for clustering of persons within communities. Communities were defined as (1) townships, boroughs and city census tracts; (2) urbanised area (large metro), urban cluster (small cities and towns) and rural; (3) combination of the first two; and (4) county. Community socioeconomic deprivation and greenness were evaluated alone and in models stratified by community types.
Results Borough and city census tract residence (vs townships) were associated (OR (95% CI)) with higher odds of type 2 diabetes (1.10 (1.04 to 1.16) and 1.34 (1.25 to 1.44), respectively). Urbanised areas (vs rural) also had increased odds of type 2 diabetes (1.14 (1.08 to 1.21)). In the combined definition, the strongest associations (vs townships in rural areas) were city census tracts in urban clusters (1.41 (1.22 to 1.62)) and city census tracts in urbanised areas (1.33 (1.22 to 1.45)). Higher community socioeconomic deprivation and lower greenness were each associated with increased odds.
Conclusions Urban residence was associated with higher odds of type 2 diabetes than residence in other areas. Higher community socioeconomic deprivation in city census tracts and lower greenness in all community types were also associated with type 2 diabetes.
- diabetes & endocrinology
- public health
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
In a large sample, type 2 diabetes was objectively documented and verified or excluded using extensive biomarker and medical data.
Temporality was appropriate for all independent variables.
We studied several approaches to community characterisation at more relevant contextual scales than many prior studies in a range of communities from urban to rural.
We did not measure behavioural mediators of the community definitions and features, such as physical activity or dietary intake.
We could not account for residential selection bias, but the residential stability and general population representativeness of our study population may mitigate these concerns.
Diabetes is a common and costly chronic disease; in the USA in 2018, over 34 million individuals had diabetes, with annual spending exceeding $320 billion.1 Diabetes occurrence varies by race/ethnicity and also evidences geographic disparities2 3; prevalence by county in the USA varies over a sevenfold range.4 Studies report that diabetes is 17% more prevalent in rural than urban areas,5 consistent with rural health disparities for other chronic conditions,6 7 attributed to sociodemographic factors (eg, higher poverty, older populations) and barriers to healthcare access.8 9
Community characteristics that may underlie observed geographic disparities in type 2 diabetes include land use (eg, walkable vs automobile dependent), fitness, food and social (eg, deprivation, disorganisation) environments; greenspace (ie, natural environments); and air pollution. Some of these are diabetogenic and others protective.10–12 Community characteristics co-occur in patterns that differ by community type (eg, higher population density co-occurs with higher deprivation and food availability and lower automobile dependence and greenness). Simultaneous evaluation and control of these domains across community types can be problematic due to limited and non-overlapping distributions that make independent attribution of disease risk to specific domains difficult.13 An alternative is to use carefully defined community types to first identify the location and geographic scale of type 2 diabetes risk.14–17 These community types should reduce within community variation and maximise between community differences. Subsequent analyses can then stratify by community type and evaluate well-characterised community features in relation to type 2 diabetes risk.
Residential development patterns reflect a continuum from rural to urban with variation by many community features.18 The US Census Bureau defines urbanised areas as dense settlements with 50 000 or more residents, urban clusters as areas with 2500–50 000 residents, and all others as rural.19 In Pennsylvania, communities are defined administratively as townships, boroughs, and cities using census minor civil division boundaries.20 In combination, these two definitions provide an opportunity to evaluate experientially and behaviourally relevant geographies as well as to further subdivide the broad category of ‘rural’, which includes a range of communities that vary in their associations with health outcomes.21 22
We evaluated four definitions of community across a range of community types from rural to urban in a 37-county region of Pennsylvania, in relation to type 2 diabetes onset to inform more robust study of the community-level features that may underlie type 2 diabetes risk. Next, because higher community socioeconomic deprivation and lower greenness have been consistently associated with higher risk of type 2 diabetes,23 24 we evaluated associations with these features overall and within community types.
Study population and design
This study was conducted by Geisinger-Johns Hopkins Bloomberg School of Public Health, one of four academic research centres in the Diabetes LEAD (Location, Environmental Attributes, and Disparities) Network (http://diabetesleadnetwork.org/), a collaboration funded by the Centers for Disease Control and Prevention dedicated to providing scientific evidence to develop targeted interventions and policies to prevent type 2 diabetes and related health outcomes across the USA.
Using previously reported methods,20 we used Geisinger electronic health record (EHR) data from 1.6 million individuals to identify new onset type 2 diabetes from 2008 to 2016. Individuals represent the general population in the region with high residential stability.25 The study area included 37 counties in Pennsylvania (figure 1). These data were used in a nested case–control study.
Patient and public involvement
Patients and public representatives were not involved in the development of the study. Study results will be disseminated through Geisinger’s Environmental Health Institute on its website (https://www.geisinger.edu/research/departments-and-centers/environmental-health-institute) and through communications to Geisinger patients and the public.
Identification of new onset type 2 diabetes cases and controls
Persons with type 2 diabetes (n=15 888) were identified using diabetes encounter diagnoses, medication orders and laboratory test results (online supplemental table S1). EHR algorithms can identify diabetes with high sensitivity, specificity and positive predictive value.26 27 Controls (n=79 435, with 65 084 unique persons), defined as persons who never met any of the diabetes criteria used for cases, were randomly selected with replacement and frequency-matched to cases (5:1) on age, sex and year of encounter. To ensure that we could identify diabetes if present, we required at least two prior encounters on different days with a primary care provider. To ensure diabetes was new onset, persons had to have at least one encounter with the health system at least 2 years prior without evidence of diabetes.
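The control selection described above can be illustrated with a minimal sketch. It assumes each person is a record with hypothetical keys 'id', 'age', 'sex' and 'year', and it draws controls per case within exact strata; the study's actual matching procedure and stratum definitions are not given in this text, so this is one plausible reading rather than the authors' code. Because sampling is with replacement, the number of control rows can exceed the number of unique control persons, which is the pattern reported (79 435 rows, 65 084 unique persons).

```python
import random
from collections import defaultdict


def frequency_match(cases, pool, ratio=5, seed=0):
    """Draw `ratio` controls per case from the same (age, sex, year) stratum.

    Sampling is with replacement, so one person can be selected as a
    control more than once. Record keys are hypothetical.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for person in pool:
        by_stratum[(person["age"], person["sex"], person["year"])].append(person)

    matched = []
    for case in cases:
        candidates = by_stratum.get((case["age"], case["sex"], case["year"]), [])
        if candidates:  # strata with no eligible controls contribute none
            matched.extend(rng.choices(candidates, k=ratio))
    return matched


# Toy data (invented): two cases, seven eligible controls in two strata.
cases = [
    {"id": "c1", "age": 54, "sex": "F", "year": 2010},
    {"id": "c2", "age": 61, "sex": "M", "year": 2012},
]
pool = [{"id": f"p{i}", "age": 54, "sex": "F", "year": 2010} for i in range(3)]
pool += [{"id": f"q{i}", "age": 61, "sex": "M", "year": 2012} for i in range(4)]

controls = frequency_match(cases, pool)
unique_controls = {c["id"] for c in controls}
print(len(controls), "control rows,", len(unique_controls), "unique persons")
```

With only seven eligible persons in the toy pool, the ten control rows necessarily include repeats.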
Community types and community features
Addresses at last contact with the health system were geocoded using ArcGIS V.10.4 (ESRI, Redlands, California, USA). We used four definitions of community, defined as administrative community type, urban/rural status, combined community type and county, to evaluate different spatial scales and a range of characterisations of the size and urbanicity of these areas (figure 2). First, using minor civil divisions and census tract boundaries, we categorised study communities into townships, boroughs and city census tracts, as previously reported,28 referred to as administrative community type. Townships range from agriculturally focused rural areas to low density suburbs; boroughs are walkable small towns of 5000–10 000 persons with a core area of gridded streets; and cities are medium-sized urban areas (largest is Scranton-Wilkes-Barre-Hazleton Metropolitan Statistical Area, 97th in USA by population). Second, we used US Census Bureau’s urbanised areas and urban clusters to define residential addresses as ‘major urban’, ‘smaller urban’ and ‘rural’,19 referred to as urban/rural status. Third, to evaluate community at a more granular level, we combined the first and second categorisations, referred to as combined community type. This resulted in eight groups (city census tract/rural had few residences so were combined with borough/rural; township/rural was the reference group). Fourth, because most prior research of geographic disparities in diabetes evaluated counties, which are much larger geographies, we evaluated counties alone and after stratification by administrative community type.
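As a sketch of how the combined community type could be derived, the function below crosses the three administrative types with the three urban/rural categories and folds city census tract/rural into borough/rural, which leaves eight groups. The label strings and function name are illustrative assumptions, not identifiers from the study.

```python
ADMIN_TYPES = ("township", "borough", "city census tract")
URBAN_RURAL = ("rural", "smaller urban", "major urban")


def combined_type(admin, status):
    """Return the combined community type for one residence."""
    if admin not in ADMIN_TYPES or status not in URBAN_RURAL:
        raise ValueError(f"unknown category: {admin!r}, {status!r}")
    # Few residences fell in city census tract/rural, so that cell is
    # merged with borough/rural, as described in the text.
    if admin == "city census tract" and status == "rural":
        admin = "borough"
    return f"{admin} / {status}"


groups = sorted({combined_type(a, s) for a in ADMIN_TYPES for s in URBAN_RURAL})
REFERENCE = combined_type("township", "rural")  # reference group in the models

print(len(groups))
for g in groups:
    print(g)
```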
We evaluated two time-varying community features. Peak (16-day composite in early July of each year) normalised difference vegetation index (referred to as greenness) was evaluated in a 1250 m × 1250 m area around residences in the prior year.29 We measured community socioeconomic deprivation using a previously described scale,30 the sum of z-transformed values of six indicators identified from a factor analysis (proportion unemployed, less than a high school education, below poverty level, on public assistance, not in the workforce and without a car), using data from the Decennial Census (2000 only) and American Community Survey (2006–2010, 2011–2015). The scale was assigned as the closest measure prior to the year of onset/encounter.
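The deprivation scale is described as a sum of z-transformed indicators, which can be sketched as follows. The indicator key names and the toy proportions are invented for illustration, and the use of the population standard deviation is an assumption; the cited scale may standardise differently.

```python
from statistics import mean, pstdev

# Hypothetical key names for the six indicators named in the text.
INDICATORS = (
    "unemployed",
    "less_than_high_school",
    "below_poverty",
    "public_assistance",
    "not_in_workforce",
    "no_car",
)


def deprivation_scores(communities):
    """Sum of z-scores across the six indicators for each community.

    `communities` maps a community name to a dict of indicator proportions.
    Higher scores indicate greater deprivation.
    """
    stats = {}
    for ind in INDICATORS:
        values = [c[ind] for c in communities.values()]
        stats[ind] = (mean(values), pstdev(values))  # assumes non-zero spread
    return {
        name: sum((c[ind] - stats[ind][0]) / stats[ind][1] for ind in INDICATORS)
        for name, c in communities.items()
    }


# Toy proportions (invented), one row per community.
communities = {
    "A": dict(zip(INDICATORS, (0.12, 0.20, 0.25, 0.10, 0.45, 0.18))),
    "B": dict(zip(INDICATORS, (0.04, 0.08, 0.07, 0.02, 0.30, 0.03))),
    "C": dict(zip(INDICATORS, (0.07, 0.12, 0.12, 0.05, 0.36, 0.08))),
}

scores = deprivation_scores(communities)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(name, round(score, 2))
```

Because each indicator is centred before summing, the scores are relative to the set of communities included; quartiles, as used in the models, would then be cut from this distribution.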
The goals of the analysis were: (1) evaluate four definitions of community in relation to odds of type 2 diabetes onset; (2) evaluate two community features, community socioeconomic deprivation and greenness, in relation to type 2 diabetes onset in all communities; and (3) evaluate associations of the two community features after stratification by community type. Analysis controlled for key individual-level confounding variables and accounted for spatial clustering of persons within communities. Statistical analysis was completed using Stata-MP V.15.1.
Logistic regression was used to estimate associations (ORs, 95% CIs) using generalised estimating equations with robust SEs and an exchangeable correlation structure within administrative community types. We adjusted for age (years; linear, quadratic and cubic terms to allow for non-linearity), sex, race (white vs all other races), ethnicity (Hispanic vs non-Hispanic) and per cent of time using Medical Assistance (surrogate for family socioeconomic status (≥50% vs <50%)).31 We did not include body mass index (kg/m²) in models because this is likely a mediator of community associations (inclusion would attenuate or eliminate associations of interest). Models were first evaluated using all persons in all communities. We analysed associations of the four definitions of community, community socioeconomic deprivation (quartiles; fourth quartile (worst deprivation) reference group) and greenness (tertiles) with diabetes status. Due to concerns about non-overlapping distributions resulting in extrapolation rather than adjustment (ie, non-positivity32), we then stratified the community features models by community type.
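For readers less familiar with how ORs and 95% CIs relate to model output, the sketch below converts a log-odds coefficient and its SE into an OR with a Wald-type interval, and back. The normal-approximation interval (z=1.96) is an assumption about how the reported intervals were formed. The coefficient and SE are back-calculated from the city census tract estimate reported in the abstract (1.34 (1.25 to 1.44)) purely for illustration and are not values taken from the model output.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald-type 95% CI from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Recover the SE on the log-odds scale from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Back-calculated from the reported OR and CI; illustrative only.
beta = math.log(1.34)
se = se_from_ci(1.25, 1.44)

or_, lower, upper = odds_ratio_ci(beta, se)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

The interval is symmetric on the log scale, which is why the reported bounds are not equidistant from the OR.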
In sensitivity analyses, to evaluate whether access to care—and thus higher likelihood of diabetes diagnosis—may have accounted for associations between community and diabetes, we examined the number of prior outpatient encounters (linear and quadratic terms) for study individuals by administrative community type and Medical Assistance status and added this variable to regression models.
Description of study population and communities
Individuals were predominantly white and non-Hispanic; the majority had a primary care provider; and most cases were diagnosed with diabetes in an outpatient setting (table 1). Individuals resided in 291 boroughs, 146 city census tracts and 633 townships (online supplemental table S2). Over 40% of persons resided in rural areas (table 1). Most borough residents were divided between urbanised areas and urban clusters. Approximately two-thirds of persons in townships resided in rural areas. A similar proportion of individuals in city census tracts resided in urbanised areas. On average, townships had higher greenness and lower community socioeconomic deprivation compared with boroughs and city census tracts (online supplemental table S2). Average racial and ethnic diversity and use of Medical Assistance for health insurance were highest in city census tracts. The mean total number of encounters with the health system before diabetes onset or the control selection date was high for all individuals, in all community types, regardless of Medical Assistance status (online supplemental table S3). Laboratory data confirmed that the categorisation of diabetes cases and controls was valid (online supplemental table S4).
Associations of communities with type 2 diabetes onset
In the base model, controlling for age and sex, non-white race (vs white), Hispanic ethnicity (vs non-Hispanic) and Medical Assistance status were each associated with increased odds of type 2 diabetes onset. These associations did not substantively change as the community type and community features were added to the model. ORs for non-white race (vs white) ranged from 1.36 to 1.41, for Hispanic ethnicity (vs non-Hispanic) from 1.46 to 1.52 and for Medical Assistance (≥50% of time vs <50%) from 1.71 to 1.74, with all confidence intervals excluding 1.0. First, when administrative community type was added (townships as reference group), residing in boroughs and city census tracts was associated with significantly higher odds (table 2, model 1). Second, urban/rural status was added to the base model and residing in urbanised areas (vs rural areas) was associated with increased odds of diabetes onset (table 2, model 2). Third, the combined definition was added to the base model, and some categories (eg, city census tracts in major urban and smaller urban areas highest, boroughs in these areas intermediate, versus townships in rural areas as reference) were associated with increased odds of new onset diabetes (table 2, model 3). Finally, county was added to the base model, and seven counties were associated with reduced odds and two with increased odds of diabetes (table 2, model 4). We next evaluated community socioeconomic deprivation and greenness. When these community features were added to the base model, lower deprivation (table 2, model 5) and higher greenness (table 2, model 6) were associated with reduced odds of diabetes.
Models were next stratified by community type (only results for administrative community type shown). Race/ethnicity and Medical Assistance status were still associated with type 2 diabetes onset in the stratified models in all administrative community types (online supplemental table S5). Associations of community socioeconomic deprivation with diabetes evidenced decreasing ORs across decreasing deprivation quartiles in all community types, but only crossed an inferential threshold in city census tracts, with approximately 25% lower odds in the first versus fourth quartile. Higher greenness was associated with reduced odds of diabetes in all community types.
Even after stratification by administrative community type and adjustment for community socioeconomic deprivation, several counties were independently associated with increased or reduced odds of diabetes onset (online supplemental table S6). The number of significant associations (n=18, nine each with reduced or increased odds) was somewhat larger than that expected due to chance (108 statistical tests performed), with most associations observed for residing in boroughs. In these models, associations with community socioeconomic deprivation were present in the first quartile (vs fourth) in townships and boroughs and in all quartiles in city census tracts. In all community types, higher greenness was associated with lower odds of diabetes.
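The comparison with chance can be made concrete with a small calculation. It assumes a 0.05 significance threshold per test and independence between the 108 tests; neither is stated in the text, and county estimates within a model are unlikely to be fully independent, so the tail probability is a rough guide only.

```python
from math import comb


def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_tests = 108   # statistical tests reported in the text
alpha = 0.05    # assumed per-test threshold
observed = 18   # significant associations reported in the text

expected = n_tests * alpha
p_tail = binom_tail(n_tests, alpha, observed)

print(f"expected by chance: {expected:.1f}")
print(f"P(at least {observed} by chance): {p_tail:.2e}")
```

Under these assumptions about five significant results would be expected by chance, so 18 is more than would arise from multiple testing alone.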
Addition of total outpatient encounters before diagnosis/control selection date did not substantively change associations in non-stratified or stratified models (results not shown). Community socioeconomic deprivation and greenness were evaluated together in models in boroughs and townships. In boroughs, associations of greenness with type 2 diabetes onset were attenuated by 1%–2% and associations with community socioeconomic deprivation were no longer present. In townships, there was no substantive change in associations or inferences for greenness and associations with community socioeconomic deprivation were no longer present. These variables could not be evaluated together in city census tracts due to insufficient overlap in distributions.
There is great interest in understanding geographic disparities in type 2 diabetes risk. If the primary causes of these differences were community-level factors, community-level interventions could have large impacts on diabetes risk. A strong theoretical basis, and growing empirical evidence, indicates that community features contribute to diabetes risk directly or through increased risk of obesity, such as social, built and natural environments contributing to impacts on physical activity and stress.33–35 The primary goal of this study was to evaluate geographic disparities in type 2 diabetes by evaluating four definitions of community across the full range from rural to urban. We then evaluated associations of community socioeconomic deprivation and greenness overall and in models stratified by community type, the latter greatly reducing the degree to which these associations could be confounded by other community features.
In the study region, the use of combined community type allowed us to carefully identify the location and scale of risk. Risk of new onset type 2 diabetes was highest in cities in smaller urban areas, followed by cities in major urban areas and boroughs in major and smaller urban areas. In addition, even after accounting for community type and features, county was independently associated with diabetes onset. While many prior studies have evaluated county differences in diabetes risk,4 36–38 none have also simultaneously evaluated communities. Our associations suggest that the risk factors that undergird US geographic differences in diabetes likely exist at multiple, nested spatial scales. Some of the county associations were of high magnitude (eg, exceeded 1.5 for protection or risk). Finally, there were consistent associations of higher community socioeconomic deprivation and lower greenness with higher diabetes risk, the former primarily in city census tracts, where average deprivation levels were higher, and the latter in all communities. We do not believe that the apparent lower diabetes risk in rural areas reflected underdiagnosis from lower access to healthcare, since, on average, individuals in the study, regardless of Medical Assistance status and community type, had high contact with the healthcare system.
We found several strong and consistent associations of individual-level characteristics. Non-white race, Hispanic ethnicity and Medical Assistance status (a surrogate for low family socioeconomic status) were consistently associated with 1.3 to 1.7-fold increased odds of type 2 diabetes onset. Overall, the findings suggest that sociodemographic factors (race/ethnicity and individual-level socioeconomic status), urbanicity, higher community socioeconomic deprivation and lower greenness, all of which co-occur in our region, were strong risk factors for type 2 diabetes.
Our findings on elevated risk of type 2 diabetes onset in urban areas are inconsistent with national studies that have reported higher crude prevalence estimates of type 2 diabetes in rural areas.39 However, a study of the Behavioral Risk Factor Surveillance System found that after adjusting for individual-level socioeconomic measures, prevalence was higher in urban areas.40 Geospatial predictors of diabetes risk likely vary by community and region; prior studies have reported, for example, that nine county-level measures of socioeconomic, race/ethnicity and built environmental features explained up to 94% of the variation in type 2 diabetes prevalence in the Midwest, but very little variation in Pennsylvania.36
The associations of greenness with diabetes were consistent with prior studies, but our results are the first to demonstrate robust findings across all types of communities while additionally controlling for county. The measurement of community features across community types may result in measures with different interpretations in different communities and regions; for example, agricultural, coniferous forest and deciduous forest greenness are not evenly distributed and have different impacts on health.22
Most prior studies of geographic disparities in diabetes have been cross-sectional, at the ecological level, relying on self-reported diabetes and focused on prevalent diabetes by county (too large and heterogeneous) or census tract (not experientially and behaviourally relevant). The current study avoided all these limitations. In addition, while many public health services are delivered at the county level, many potential interventions to address diabetes would need to be implemented at smaller scales and would not have county-wide impacts.
The study had some limitations. Although we adjusted for Medical Assistance health insurance as a surrogate for family socioeconomic status, there could still be residual confounding by individual-level income.31 We did not measure behavioural mediators of the community definitions and features, such as physical activity or dietary intake. We could not account for residential selection bias, in which associations are due to reverse causation (if persons with individual-level risk factors for diabetes are more likely to reside in certain areas, by choice or opportunity). This can be a concern in studies of this type; social processes determine residence, so it can be difficult to distinguish individual-level characteristics from features of communities.41 The residential stability and general population representativeness of our study population may mitigate these concerns. Although we used four definitions of community, all used administrative boundaries and thus may not represent how residents view the communities in which they reside and could still present edge and boundary effects and the modifiable areal unit problem.42–44
The study had several strengths. Diabetes was objectively documented and verified with extensive biomarker and medical data. Temporality was appropriate for all independent variables. Study participants resided in a range of communities from urban to rural. We studied several approaches to community characterisation at more relevant contextual scales than many prior studies and showed that smaller community contexts were associated with diabetes onset. Stratifying by community types limited bias from non-positivity.32
The study findings provide important clues about the location (ie, urban) and geographic scale (ie, as localised as a square mile, the average area of boroughs and city census tracts) of geospatial disparities in type 2 diabetes in Pennsylvania. We speculate that, since risk was higher in urban areas, our findings may suggest a smaller role for the positive features of the food and physical activity environments present in these areas (eg, greater access to grocery stores, more walkable neighbourhoods, more commercial physical activity opportunity establishments) and a larger role for individual and community demographic and socioeconomic factors found in the same areas.
Contributors Manuscript authors contributed in the following ways. Conception of work: BSS, MNP, KS, CM, GI, AGH. Obtained funding: BSS, AGH. Study design: BSS, JP, KB-R, AGH. Data management and analysis: JP, KB-R, BSS, MNP, JD, KAM, AGH. Results interpretation: BSS, MNP, KB-R, JD, KM, KS, CM, GI, AGH. Initial manuscript writing: BSS, MNP, KAM, AGH. Critical revision of manuscript, final approval and accountable for their work: BSS, JP, MNP, KB-R, KM, JD, KS, CM, GI, AGH.
Funding This publication was made possible by Cooperative Agreement Number DP006293 funded by the US Centers for Disease Control and Prevention, Division of Diabetes Translation.
Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval The study was approved by the Geisinger Institutional Review Board under waivers of consent and assent to use electronic health record (EHR) data.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available upon reasonable request. Deidentified data are available upon request with IRB approval and a data use agreement.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.