Purpose Globally, the age-standardised prevalence of type 2 diabetes mellitus (T2DM) has nearly doubled from 1980 to 2014, rising from 4.7% to 8.5% with an estimated 422 million adults living with the chronic disease. The MULTI sTUdy Diabetes rEsearch (MULTITUDE) consortium was recently established to harmonise data from 17 independent cohort studies and clinical trials and to facilitate a better understanding of the determinants, risk factors and outcomes associated with T2DM.
Participants Participants range in age from 3 to 88 years at baseline, including both individuals with and without T2DM. MULTITUDE is an individual-level pooled database of demographics, comorbidities, relevant medications, clinical laboratory values, cardiac health measures, and T2DM-associated events and outcomes across 45 US states and the District of Columbia.
Findings to date Among the 135 156 ongoing participants included in the consortium, almost 25% (33 421) were diagnosed with T2DM at baseline. The average age of the participants was 54.3, while the average age of participants with diabetes was 64.2. Men (55.3%) and women (44.6%) were almost equally represented across the consortium. Non-whites accounted for 31.6% of the total participants and 40% of those diagnosed with T2DM. Fewer individuals with diabetes reported being regular smokers than their non-diabetic counterparts (40.3% vs 47.4%). Over 85% of those with diabetes were reported as either overweight or obese at baseline, compared with 60.7% of those without T2DM. We observed differences in all-cause mortality, overall and by T2DM status, between cohorts.
Future plans Given the wide variation in demographics and all-cause mortality in the cohorts, the MULTITUDE consortium will be a unique resource for conducting research to determine: differences in the incidence and progression of T2DM; the sequence of events or biomarkers prior to T2DM diagnosis; disease progression from T2DM to disease-related outcomes, complications and premature mortality; and race/ethnicity differences in the above associations.
- cardiac epidemiology
- preventive medicine
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
The primary strength of the MULTI sTUdy Diabetes rEsearch (MULTITUDE) consortium is its large sample size and generally long follow-up period, which facilitate examination of type 2 diabetes mellitus (T2DM) risk and outcomes across the life course.
Pooling consortium data allows us to provide insights into the evolution of T2DM risk factors and prediabetes in early life with greater statistical power than has been available previously.
Furthermore, data from additional cohorts can be harmonised with the consortium to expand the MULTITUDE consortium to include more representative data or to improve the representation of minorities.
Limitations include apparent heterogeneity of measures across cohorts, including variation in clinical methodology and technology, questionnaire data and diagnostic criteria.
Long-term follow-up studies from our consortium that enrolled minorities only began tracking T2DM and cardiovascular disease events in the 1970s or later.
Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease that can lead to complications in many body systems and increases the overall risk of chronic morbidity and premature death.1 Globally, the age-standardised prevalence of T2DM has nearly doubled from 1980 to 2014, rising from 4.7% to 8.5% with an estimated 422 million adults living with the chronic disease.2 It is currently the seventh leading cause of death in the USA with over 30 million Americans (9.4% of the US population) living with T2DM resulting in a total financial burden of US$245 billion per year.1 In adults, T2DM accounts for about 90%–95% of all diagnosed cases of diabetes and is commonly associated with obesity.1
The risk of T2DM is associated with an interplay of genetic and metabolic factors including: ethnicity, family history of T2DM, previous gestational diabetes, polycystic ovary syndrome (PCOS), older age, overweight and obesity, unhealthy diet, physical inactivity and smoking.3 4 The combination of increasing prevalence of T2DM and increasing lifespan of persons with diabetes may be altering the spectrum of morbidities that accompany T2DM. Complications include cardiovascular events (myocardial infarction (MI), heart failure, stroke), non-alcoholic fatty liver disease, kidney failure, and vision and neurological damage.5 The numerous and severe complications and increased years of life spent with T2DM indicate a need to better assess the trajectory of the disease and impact of various interventions and comorbidities, and the effect of attendant intermediate events on long-term outcomes.
An optimal approach to examining T2DM risk and disease progression involves the longitudinal examination of population-based cohorts. Integrating and harmonising data from multiple population studies yields sample sizes that individual studies cannot attain. Robust sample sizes and diverse cohorts also improve the generalisability of results by increasing the overall representativeness of the combined cohort.6–9 Another advantage of harmonising data across studies to create a single, large database is the facilitation of comparative effectiveness research.6 10
The MULTI sTUdy Diabetes rEsearch (MULTITUDE) Consortium was established in 2017 to harmonise data from 17 cohort studies and clinical trials and to facilitate a better understanding of the determinants, risk factors and outcomes associated with T2DM. The main research objectives of this project are to determine the relationship between the lifetime risk of T2DM and associated major risk factors, the transition-specific risk of adverse outcomes from T2DM diagnosis through intermediate morbidity to eventual mortality and to determine the temporal patterns of T2DM and related morbidity and mortality in the USA. Moreover, the MULTITUDE consortium enables evaluation of gender-specific outcomes in T2DM.
A comparable large-scale data harmonisation effort11 has already been undertaken to better understand the long-term risks for cardiovascular disease (CVD) and to examine patterns of CVD development over the adult life course. However, similar projects focused on T2DM have thus far been limited in their sample size,12 13 participant demographic make-up12–14 or years of follow-up,12 or have focused on improvements in care15 16 rather than on determining risk of diagnosis and adverse outcomes. Additionally, while multiple risk models17–19 aimed at early identification of patients at high risk of developing T2DM are already widely used in clinical settings, these models typically consider only the patient’s current state at the time of assessment, ignoring the complex trajectory of events that leads up to the disease state. The MULTITUDE consortium aims to address these limitations.
Study inclusion and follow-up time
The 17 cohort studies and clinical trials in the MULTITUDE consortium were included based on their availability as open source data from the National Heart, Lung, and Blood Institute (NHLBI) Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC) and their relevance to T2DM risk and outcomes. All studies have been approved for the sharing of their data in this consortium per the NHLBI policy for data sharing from clinical trials and epidemiology studies (http://www.nhlbi.nih.gov/funding/datasharing.htm). These studies vary by study design, inclusion criteria, recruitment site and enrolment age, with recruited participants ranging in age from 3 to 88 years (table 1 and online supplementary table 1). Most studies included individuals with and without T2DM at baseline. All studies except the three cohorts of the Framingham Heart Study (FHS)20 include non-Caucasian patients/participants, and all but two include both men and women. Participants from 45 US states and the District of Columbia are represented across the 17 cohorts/trials making up the consortium (online supplementary table 2).
The participants recruited into the prospective cohorts in the consortium tended to be free of prevalent comorbidities, while patients enrolled in the clinical trials all suffered from either CVD, T2DM, obesity or hypertension. A diagnosis of T2DM was part of the inclusion criteria for the Action to Control Cardiovascular Risk in Diabetes (ACCORD)21 and Bypass Angioplasty Revascularization Investigation in Type 2 Diabetes (BARI 2D)22 clinical trials, while the same diagnosis excluded patients from enrolling in the Systolic Blood Pressure Intervention Trial Primary Outcome Paper (SPRINT-POP)23 trial. The number of enrolled individuals with diabetes within each individual cohort/trial is provided in table 1.
The study duration and follow-up examinations differ by study. Figure 1 demonstrates the calendar years of data collection and the time range in which examinations were performed. The earliest year of data collection was 1948 (inception of the FHS20) and the most recent 2017, with six cohorts still continuing follow-up examinations. The shortest interval of total follow-up, 2 months, comes from the Functional Outcomes in Cardiovascular Patients Undergoing Surgical Hip Fracture Repair (FOCUS)24 trial, while the longest follow-up of 69 years comes from the original cohort of the FHS.20
Data collection methods
We pooled the data using three distinct steps of crosswalk, catalogue and harmonisation. Retrospective harmonisation is a complicated process since few established studies have used identical collection methods and procedures. We determined participating study attributes and type of information collected (eg, diagnoses, clinical laboratory values). As well, we documented information such as study designs, sampling protocols and data access policies in order to evaluate sources of study heterogeneity and feasibility of harmonisation.25 26 To enable harmonisation, we ensured that all the study-specific data items required to generate the target variables (tables 2–4; online supplementary tables 3–5) were available and that the collected information was valid. The approach used to process data under a common format varies depending on the variables to be harmonised, the data collected by each study and the possibility to pool data.26
The first step of harmonising data for the MULTITUDE consortium was the crosswalk of data measures (variables) across all studies. All available variables from individual studies within the consortium were identified and systematically entered in eight sections of (1) demographic data, (2) comorbidities, (3) laboratory values at diagnosis, (4) biomarkers, (5) medications, (6) ECG and echocardiogram (ECHO), (7) complications related to diabetes and (8) events. This crosswalk allows assessment of each variable and, in turn, allows us to determine the level of comparability between studies.
For example, in determining a baseline diagnosis of T2DM, individual studies may have dichotomous ‘yes’/‘no’ data on a history or diagnosis of T2DM. Alternatively, studies may only have data on fasting glucose levels, random plasma glucose levels or haemoglobin A1c levels. At this stage of the process, all relevant variables were identified and collected from each study without alteration of the original data or creation of new MULTITUDE-specific target variables.
Following data element crosswalk, all variables were then catalogued based on their key characteristics and relevance in answering the research questions addressed. Recording of blood pressure may be different across studies: one study may have systolic and diastolic blood pressure measured by the technician and another could have reported values from medical records. Clinical outcomes can also be obtained from different sources such as medical records without independent adjudication, with independent adjudication or via self-report. All variables that are empirically similar or indicate the same measurement are grouped together and named under a common pooled variable. We evaluated which studies could provide data that enabled generation of each of the target variables and we qualitatively assessed the level of similarity between the study-specific and target variables.26 All relevant information describing data elements and collection modes such as data dictionaries, questionnaires and standard operating procedures was used for cataloguing and subsequently to assess the comparability of data collected by individual studies.25
Continuing with the previous example, all variables from each study relevant to a baseline diagnosis of T2DM would then be categorised together. Some variables may be in more than one category: data on fasting glucose levels could be both categorised under a continuous fasting glucose target variable as well as a dichotomous T2DM diagnosis target variable. These inconsistencies or spread of target variables are thus defined and placed under appropriate pooled variable headers.
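The crosswalk and catalogue steps described above can be pictured as a small lookup exercise. The following is a minimal sketch only, not the consortium's actual tooling: the study names come from the consortium, but every variable name, section label and target variable name is hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (study, original variable name, section).
# Variable names are invented for illustration; they are not taken from
# any study's data dictionary.
CROSSWALK = [
    ("ARIC", "diabetes_hx", "comorbidities"),
    ("ARIC", "fast_gluc_mgdl", "laboratory values"),
    ("JHS", "hba1c_pct", "laboratory values"),
]

# Hypothetical catalogue: each original variable feeds one or more pooled
# target variables, so a variable may appear under several headers.
CATALOGUE = {
    "diabetes_hx": ["t2dm_baseline"],
    "fast_gluc_mgdl": ["fasting_glucose", "t2dm_baseline"],
    "hba1c_pct": ["hba1c", "t2dm_baseline"],
}


def group_by_target(crosswalk, catalogue):
    """Return {target variable: [(study, original variable), ...]}."""
    grouped = defaultdict(list)
    for study, variable, _section in crosswalk:
        # Variables absent from the catalogue are simply left ungrouped.
        for target in catalogue.get(variable, []):
            grouped[target].append((study, variable))
    return dict(grouped)


pooled = group_by_target(CROSSWALK, CATALOGUE)
```

In this sketch the fasting glucose variable is listed under both a continuous target and the dichotomous baseline T2DM target, mirroring the point that some variables fall into more than one category.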
We employed several established approaches26 for harmonisation of MULTITUDE data while minimising bias that could be introduced by systematic differences in measurement techniques across cohorts:
A simple calibration model to transform one continuous measure into another continuous measure so that both are expressed in the same unit of measurement (eg, converting weight in kilograms to weight in pounds).
An algorithmic transformation to harmonise continuous and categorical variables, or both, with different but combinable ranges or categories (eg, to identify a baseline diagnosis of T2DM: ‘yes’ on a ‘history or diagnosis of T2DM’, fasting glucose levels ≥126 mg/dL, random plasma glucose levels ≥200 mg/dL or haemoglobin A1c levels ≥6.5%).
A standardisation model that harmonises the same constructs measured using different scales (eg, blood cholesterol concentration). The distribution of the measure is compared across cohorts to assess for differences in accuracy and/or precision in the measure.11
Data are only pooled if these methods are possible. We have also adjusted each analysis for cohort, which may help attenuate any confounding due to measurement differences or varying calendar decades across cohorts.
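The three approaches listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the consortium's implementation: the function names are hypothetical, the handling of missing values (returning None when no criterion is available) is an assumption, and the within-cohort z-score is just one common way to standardise a measure; the text does not specify which standardisation model was used. The diagnostic thresholds are those given in the text.

```python
from statistics import mean, stdev

KG_TO_LB = 2.20462262185  # pounds per kilogram


def kg_to_lb(weight_kg):
    """Calibration: convert a weight from kilograms to pounds."""
    return weight_kg * KG_TO_LB


def baseline_t2dm(history=None, fasting_glucose=None,
                  random_glucose=None, hba1c=None):
    """Algorithmic transformation: derive a yes/no baseline T2DM flag.

    Glucose values are in mg/dL and HbA1c in percent. Returns True if
    any available criterion is met, False if at least one criterion is
    available and none is met, and None if every input is missing
    (this missing-data rule is an assumption of the sketch).
    """
    checks = []
    if history is not None:
        checks.append(bool(history))
    if fasting_glucose is not None:
        checks.append(fasting_glucose >= 126)
    if random_glucose is not None:
        checks.append(random_glucose >= 200)
    if hba1c is not None:
        checks.append(hba1c >= 6.5)
    if not checks:
        return None
    return any(checks)


def standardise(values):
    """Standardisation: z-scores within one cohort (assumed approach)."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation
    return [(value - centre) / spread for value in values]
```

For example, `baseline_t2dm(fasting_glucose=130)` flags a participant as having T2DM at baseline even when no self-reported history is recorded, while `baseline_t2dm(history=False, hba1c=6.4)` does not.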
The baseline demographic and patient/participant information are provided in online supplementary table 3 and table 1. Briefly, age, sex, race and smoking status were reported across all MULTITUDE cohorts. The majority of studies also include information on employment, education, hospitalisations, alcohol intake, dietary intake, physical activity, blood pressure and body mass index. A select number of cohorts provide information on a family history of T2DM or CVD.
The baseline comorbidity information included in the MULTITUDE consortium is shown in table 2. All studies provided baseline information on dyslipidaemia, hypertension and the T2DM status of patients/participants. The majority of cohorts/trials include information on obesity, CVD, pulmonary disease and kidney disease. Information on specific types of CVD is also provided in most studies.
Medications and clinical laboratory values monitored either at baseline or throughout the study follow-up period are detailed in table 3 and online supplementary table 4. All studies tracked whether patients/participants self-reported the use of any dyslipidaemia (including statins) or CVD medication. The majority of cohorts/trials reported the use of any T2DM (insulin or oral) medication, as well as specific CVD medications. Most studies also included information about the fasting glucose and insulin levels, and serum lipids, potassium and haemoglobin A1c.
A select number of studies collected information from ECGs and/or ECHOs over the course of the study, as shown in online supplementary table 5. Many of these studies also included information regarding cardiac dysfunctions such as dysrhythmia, QRS axis deviation, ventricular conduction defects, ST-segment abnormalities and left ventricular hypertrophy.
Collection of data on these intermediate end points and medical interventions across the lifespan enables us to better understand the evolution of T2DM and its interplay with CVD. This allows us to more confidently identify potential causal pathways to T2DM-related and CVD-related events.
Outcomes of interest are shown in table 4. Events were ascertained using each cohort’s specific protocol and procedures. All but three studies within the consortium include data on all-cause mortality and the majority of studies provide information on cause of death. Both fatal and non-fatal events related to T2DM and CVD are tracked by most of the cohorts/trials in the MULTITUDE consortium. Specific types of CVD (angina, coronary artery disease, congestive heart failure, hypertension) as well as CVD-related events (MI, stroke/transient ischaemic event) and interventions requiring hospitalisation (percutaneous coronary intervention/coronary artery bypass grafting) are provided in many cohorts/trials. Initial diagnosis of T2DM as well as advanced stage outcomes of T2DM (renal failure, neuropathy, retinopathy) are also included in several studies.
Findings to date
The baseline characteristics of participants of the MULTITUDE consortium are presented in table 5. Among the 135 156 participants included in the consortium, almost 25% (33 421) were diagnosed with T2DM at baseline. The average age of the participants was 54.3 years, while the average age of participants with diabetes was 64.2 years. Men (55.3%) and women (44.6%) were almost equally represented across the consortium. Non-whites accounted for 31.6% of the total participants but 40% of those diagnosed with T2DM (p<0.0001). Fewer individuals with diabetes reported being regular smokers than their non-diabetic counterparts (40.3% vs 47.4%, p<0.0001). Over 85% of those with diabetes were reported as either overweight or obese at baseline, compared with 60.7% of those without T2DM (p<0.0001).
Figures 2 and 3 show the age-adjusted incidence of all-cause mortality by baseline T2DM status for each study included in the MULTITUDE consortium. Due to the generally longer follow-up periods of prospective cohorts, we observed higher rates of overall mortality among these participants compared with clinical trial patients. We also observed more than two times the risk for mortality among individuals with T2DM in several prospective cohorts (HRs (95% CI): ARIC=2.36 (2.21 to 2.53), FHS cohort=1.28 (1.06 to 1.56), FHS offspring=2.65 (2.06 to 3.41), FHS Gen3=2.83 (1.20 to 6.70), Jackson Heart Study (JHS)=1.78 (1.52 to 2.10)) compared with lower risk among clinical trials (HR (95% CI): AFFIRM=1.84 (1.55 to 2.18), ALLHAT=1.36 (1.29 to 1.43), CORAL=1.28 (0.90 to 1.81), MRFIT=1.12 (0.79 to 1.58), SPRINT-POP=1.47 (0.37 to 5.92)).
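As a reading aid for the HRs reported above, the short sketch below tabulates them and checks which 95% CIs include the null value of 1. The HRs and interval bounds are copied from the text; the helper names are hypothetical, and the standard error calculation assumes the intervals are symmetric on the log scale with a 1.96 multiplier, which is the usual convention but is not stated in the text.

```python
import math

# (HR, lower 95% bound, upper 95% bound), copied from the text above.
HAZARD_RATIOS = {
    "ARIC": (2.36, 2.21, 2.53),
    "FHS cohort": (1.28, 1.06, 1.56),
    "FHS offspring": (2.65, 2.06, 3.41),
    "FHS Gen3": (2.83, 1.20, 6.70),
    "JHS": (1.78, 1.52, 2.10),
    "AFFIRM": (1.84, 1.55, 2.18),
    "ALLHAT": (1.36, 1.29, 1.43),
    "CORAL": (1.28, 0.90, 1.81),
    "MRFIT": (1.12, 0.79, 1.58),
    "SPRINT-POP": (1.47, 0.37, 5.92),
}


def includes_null(lower, upper, null=1.0):
    """True if the confidence interval contains the null value."""
    return lower <= null <= upper


def log_hr_standard_error(lower, upper, z=1.96):
    """Approximate SE of log(HR), assuming a symmetric CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


ci_includes_one = [
    study for study, (_hr, lower, upper) in HAZARD_RATIOS.items()
    if includes_null(lower, upper)
]
```

Under these assumptions, the intervals for CORAL, MRFIT and SPRINT-POP include 1, and the wide SPRINT-POP interval corresponds to a much larger standard error than that of ARIC.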
Using data from 17 harmonised cohort studies and clinical trials, the MULTITUDE consortium is a unique compilation that was established to facilitate a better understanding of the determinants, risk factors and outcomes associated with T2DM. Given the wide variation in demographics and all-cause mortality in the cohorts, the MULTITUDE consortium will be a unique resource for conducting research to determine: (1) age, time period and cohort differences in the incidence and progression of T2DM; (2) the sequence of events or biomarkers prior to T2DM diagnosis; (3) disease progression from T2DM to CVD outcomes, T2DM complications and premature mortality; and (4) race/ethnicity differences in the above associations. Using the same harmonisation principles, this data resource can be extended to include a larger number of studies to provide a more comprehensive data infrastructure as relevant data are added to the BioLINCC repository. Several promising large-scale retrospective data analyses focused on gaining a better understanding of T2DM risk and outcomes are currently under way.27 28
In our preliminary findings, we observed differences in demographics and all-cause mortality by baseline diabetes status. As has been previously shown,29 30 individuals with T2DM were more likely to be older, non-white and more overweight. However, interestingly, people diagnosed with T2DM in the MULTITUDE consortium were less likely to be cigarette smokers, a known risk factor for the disease. This may be explained by the finding that smoking cessation is associated with weight gain and a subsequent increase in risk of diabetes,31 as well as the possibility that health providers and patients increase their efforts at smoking interventions after T2DM diagnosis.32 33 The relatively low HRs seen in the clinical trials compared with prospective cohorts are likely to reflect the comorbidities required by the inclusion criteria of individual studies; for example, all patients enrolled in the SPRINT-POP clinical trial had previously been diagnosed with high blood pressure.
Additionally, we can make preliminary conclusions regarding race differences in all-cause mortality by T2DM status. The three cohorts of FHS consist entirely of Northern whites, while the JHS recruited Southern blacks. The risk ratio of all-cause mortality among JHS participants with T2DM more closely resembles that of the original FHS cohort, recruited in 1948 when CVD risk factors were largely unknown and medical interventions more limited, than the risk ratio from their contemporaries in FHS Gen3. This suggests that either individuals with T2DM are more protected from mortality in the JHS cohort or, perhaps, that all individuals in this cohort are more at risk for mortality compared with FHS Gen3. It is likely that there is a complex interplay between genetics, lifestyle, culture and access to healthcare that remains to be explored further.
Strengths and limitations
The primary strength of the MULTITUDE consortium is its large sample size and generally long follow-up period, which facilitate examination of T2DM risk and outcomes across the life course. Pooling data allows us to provide insights into the evolution of T2DM risk factors and prediabetes in early life with greater statistical power than has been available previously. Using the consortium data, we will be able to understand the variation in risk among different subgroups, including rare populations with T2DM, and to observe the relationship of comorbid CVD and risk of outcomes in T2DM. Furthermore, data from additional cohorts can be harmonised with the consortium to expand MULTITUDE to include more representative data and to improve the representation of minorities.
The consortium also acknowledges a number of limitations. These include apparent heterogeneity of measures across cohorts, including variation in clinical methodology and technology, questionnaire data, and diagnostic criteria. As well, there are inherent differences in study design and methodology between clinical trials and cohort studies, which are combined in this consortium. The MULTITUDE consortium exclusively contains data from North American cohorts which may limit the generalisability of any significant findings to other global populations. We also acknowledge limited statistical power for specific subgroup analysis. Additionally, there is substantial possibility for birth cohort effects due to the trends in risk factors and development of medical therapies for prevention of T2DM and CVD.
While the individuals who have been enrolled in MULTITUDE studies the longest (FHS original cohort) have had their health and lifestyle monitored for almost 70 years and are in their 90s or 100s, this cohort was made up exclusively of Caucasians. Long-term follow-up studies from our consortium that enrolled minorities only began tracking T2DM and CVD events in the 1970s or later. It is likely that full exploration of causal mediators leading to T2DM and its related outcomes among non-whites has only recently become possible now that many participants have reached an age when incident T2DM and CVD events are just beginning to occur. MULTITUDE investigators will continue to update and expand the dataset to increase the representation of minority groups in the consortium.
The authors thank Mrithyunjay A. Vyliparambil of the University of Massachusetts Lowell, Massachusetts, for his contributions to the data crosswalk.
ECP and YZ contributed equally.
Contributors YZ, ECP and BK obtained the data. YZ performed the analysis. ECP drafted the manuscript. ECP, YZ, CMDO, SM, OK, LLM, FL, VSR, BEC and BK approved the manuscript, made significant contributions to the study and have read and approved the final version of the manuscript.
Funding The Evans Research Foundation at Boston University School of Medicine provided funding for the study. BK, ECP and YZ are supported by the Evans Research Foundation. SM is supported by the Reproductive Scientist Development Grant (K12 HD000849); BEC is supported by the Boston Nutrition Obesity Research Center NIH Grant (P30 DK46200). OK is funded by a professorship grant from the Swiss National Science Foundation (no: 163878). FL is supported by the National Natural Science Foundation of China (No. 11501587).
Competing interests None declared.
Patient consent Not required.
Ethics approval Boston University IRB.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Details regarding access to the MULTITUDE consortium data for research purposes are available on our website. We are unable to redistribute the data due to data use agreement restrictions. However, all the individual cohorts are available on BioLINCC. Enquiries regarding use of the MULTITUDE consortium for specific research studies are welcomed in the form of a project request. For further information, please contact Bindu Kalesan at the Center for Clinical Translational Epidemiology and Comparative Effectiveness Research, Boston University School of Medicine (email@example.com).