Studying trajectories of multimorbidity: a systematic scoping review of longitudinal approaches and evidence

Objectives Multimorbidity—the co-occurrence of at least two chronic diseases in an individual—is an important public health challenge in ageing societies. The vast majority of multimorbidity research takes a cross-sectional approach, but longitudinal approaches to understanding multimorbidity are an emerging research area, being encouraged by multiple funders. To support development in this research area, the aim of this study is to scope the methodological approaches and substantive findings of studies that have investigated longitudinal multimorbidity trajectories. Design We conducted a systematic search for relevant studies in four online databases (Medline, Scopus, Web of Science and Embase) in May 2020 using predefined search terms and inclusion and exclusion criteria. The search was complemented by searching reference lists of relevant papers. From the selected studies, we systematically extracted data on study methodology and findings and summarised them in a narrative synthesis. Results We identified 35 studies investigating multimorbidity longitudinally, all published in the last decade, and predominantly in high-income countries from the Global North. Longitudinal approaches employed included constructing change variables, multilevel regression analysis (eg, growth curve modelling), longitudinal group-based methodologies (eg, latent class modelling), analysing disease transitions and visualisation techniques. Commonly identified risk factors for multimorbidity onset and progression were older age, higher socioeconomic and area-level deprivation, overweight and poorer health behaviours. Conclusion The nascent research area employs a diverse range of longitudinal approaches that characterise accumulation and disease combinations and to a lesser extent disease sequencing and progression. 
Gaps include understanding the long-term, life course determinants of different multimorbidity trajectories, and doing so across diverse populations, including those from low-income and middle-income countries. This can provide a detailed picture of morbidity development, with important implications from a clinical and intervention perspective.


INTRODUCTION
The term multimorbidity is used to define the co-occurrence of multiple diseases, specifically two or more chronic conditions within the same individual. 1 2 Multimorbidity represents a huge immediate and future challenge for healthcare systems around the world. It is estimated that 50 million people suffer from multimorbidity in the European Union, and about one in three people globally have multiple conditions. 3 4 The global prevalence of multimorbidity is expected to increase through the 21st century, as a result of increased life expectancy, population ageing and the expansion of morbidity. For example, the prevalence of 'complex multimorbidity', defined as four or more co-occurring chronic conditions, has been projected to increase from about 10% in 2015 to 17% in 2035 in England. 5 The implications of this for individuals and societies are stark: multimorbidity is predictive of poorer quality of life, 6 greater functional decline 7 and increased mortality. 8 Management and treatment of multimorbidity also place a considerable economic and logistical burden on health services, 9 which are not adapted to deal with multimorbidity, being typically organised around the single disease model.

Strengths and limitations of this study
► This is the first systematic review to focus on studies that take a longitudinal, rather than cross-sectional, approach to multimorbidity.
► Systematic searches of online academic databases were performed using predefined search terms, as well as searching of reference lists, and this is reported using Preferred Reporting Items for Systematic Reviews and Meta-Analyses for scoping reviews guidelines.
► For selected papers, data were double extracted using standardised pro formas to aid narrative synthesis.
► Due to the heterogeneity of the studies included, their weaknesses were described in the narrative synthesis, but we did not perform quality assessment using standardised tools.

Open access
In response to this challenge, in the last two decades, there has been an explosion of (predominantly cross-sectional) research that has investigated the risk factors and patterns of multimorbidity. For example, systematic reviews have identified common clusters of diseases, 10-12 which include cardiovascular and metabolic diseases, mental health conditions and musculoskeletal disorders. Common risk factors for multimorbidity include increasing age and low socioeconomic status (SES) 12 13 and poor health behaviours, such as high body mass index and smoking. 14 However, the vast majority of multimorbidity studies apply a cross-sectional approach; longitudinal approaches are scarce. To date, there are more than 70 published systematic reviews about multimorbidity, covering definitions to interventions (eg, refs 2 15), and none of these focuses on longitudinal studies. While 'snap-shot' analyses are useful for understanding prevalence and clustering of diseases, they provide little information on multimorbidity development over time and sequencing of diseases, which have important implications from a clinical and intervention perspective. Recently, there has been a growing orientation towards longitudinal approaches by academic communities and funders such as the UK's Academy of Medical Sciences. 4 Therefore, this paper aims to gain an overview of the longitudinal approaches used in multimorbidity research, to better understand what evidence is generated from these approaches and to identify the associated gaps.
Our research questions are:
1. What type and range of longitudinal methods are used to analyse multimorbidity over time within individuals?
2. What are the risk/protective factors identified to be associated with individual multimorbidity trajectories?
We used a scoping review approach to systematically review the emerging body of literature investigating multimorbidity trajectories. Based on a narrative synthesis focused on commonalities and differences, this review provides a methodological summary and a comprehensive review of the evidence on factors affecting multimorbidity pathways.

METHODS
We review the literature on longitudinal multimorbidity studies via a scoping review approach rather than using a systematic review or meta-analytic approach. 16 Scoping reviews are adopted when the purpose of the review is to scope a nascent body of literature and appraise gaps. 17 18 In reporting, we follow the recently developed Preferred Reporting Items for Systematic Reviews and Meta-Analyses for scoping reviews (PRISMA-ScR) 19 (online supplemental appendix A).

Eligibility criteria
Inclusion and exclusion criteria were defined prior to database searches (table 1). A primary eligibility criterion was to measure multimorbidity longitudinally within the same sample of adults using a quantitative approach, and we excluded cross-sectional or qualitative designs, reviews, meta-analyses and commentaries that did not contain empirical results. Studies had to measure multimorbidity through recognised diseases/conditions or a defined multimorbidity measure such as the Charlson or Elixhauser comorbidity indices 20 21 but not solely a collection of symptoms/states (such as disability or frailty) or disease risk factors (such as obesity). Studies were required to measure change in multimorbidity between distinguishable diseases rather than progression within a single disease category (eg, different types of cancer). We also excluded studies that examined transitions from an index disease into a secondary disease (eg, comorbidities of diabetes). Finally, included studies were focused on  (table 2). The final search was a combination of three search elements: first, the concept of multimorbidity; second, the methodological approach of disease trajectories; and third, longitudinal study design. These search terms initially returned a large number of irrelevant references, focusing on cellular medicine, genetics and COVID-19, so we added an additional condition to exclude these. We also refined the search results to include English language, adult humans and peer-reviewed journal articles only. All searches were conducted in May 2020. The full search syntax is included in online supplemental appendix 1, appendix B. We identified additional relevant papers through recommendations from coauthors and external collaborators. 
The database search results were searched for these additional papers, and if they were not identified in the database searches, they were included as 'identified through other sources' and were subject to the same screening procedure as papers identified through database searches.

Screening and study selection
After deduplication, articles were screened for eligibility by title, abstract and finally full text using Endnote and predefined groups for exclusion reasons and inclusion (work shared between GC, CTM and KK). At abstract and full-text stages, a double screening process was used to minimise evidence selection bias, 22 meaning two coauthors blindly and independently reviewed the study for inclusion. Any disagreements were resolved through discussion and consensus. The reference lists of the selected studies were screened to identify any relevant studies that may have been missed in the main search, and any newly identified articles were subject to the same screening and data extraction processes.

Data extraction and synthesis
Three authors (GC, CTM and KK) extracted and double-extracted information on study and sample characteristics, including the title, authors and publication year, study setting, data source used, information on the study population (eg, inclusion and exclusion criteria, sample size and age) and follow-up duration. We also extracted study objectives, multimorbidity conceptualisation and measures, and methodological and analytical approaches, focusing on those specifically used for the analysis of multimorbidity trajectories. Finally, we extracted the key substantive findings and limitations reported in each study in relation to generalisability, accuracy, comprehensiveness, methodology and interpretation.
To develop the narrative synthesis, we analysed and summarised the patterns in the extracted data, investigated the similarities and differences between studies and examined bias and limitations to identify knowledge gaps and the strengths and weaknesses of methodological approaches.

Ethics approval
This is a review of already published material; therefore, no ethics approval was needed.

Patient and public involvement
No patients were involved.

RESULTS
Figure 1 depicts the study selection process. Database searches returned 11 420 articles and nine additional papers were identified from other sources. Of the combined 11 429 papers, 4705 were duplicate references and were removed. Of the remaining 6724 papers, 6315 were removed during title screening and a further 360 papers during abstract screening. The most common reasons for exclusion were studies that did not focus on multimorbidity longitudinally (eg, trajectories were followed within a single disease) and study design not being longitudinal (eg, cross-sectional analysis). The remaining 49 papers went through full-text screening and 19 were subsequently removed. Searching the reference lists of the remaining 30 papers identified another 11 potentially relevant papers. After screening these 11 papers, six were excluded, leaving five additional papers for inclusion. In total, 35 papers were selected for further data extraction.
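The selection counts reported above can be checked with simple arithmetic. The following is a minimal sketch that only re-derives the figures stated in the text; the variable and function names are illustrative and do not come from the review itself.

```python
# Counts as reported in the study selection paragraph.
DATABASE_HITS = 11_420
OTHER_SOURCES = 9
DUPLICATES = 4_705
REMOVED_AT_TITLE = 6_315
REMOVED_AT_ABSTRACT = 360
REMOVED_AT_FULL_TEXT = 19
FROM_REFERENCE_LISTS = 11
EXCLUDED_FROM_REFERENCE_LISTS = 6


def selection_flow():
    """Re-derive each stage of the selection flow from the reported counts."""
    combined = DATABASE_HITS + OTHER_SOURCES
    after_deduplication = combined - DUPLICATES
    full_text_screened = after_deduplication - REMOVED_AT_TITLE - REMOVED_AT_ABSTRACT
    retained_after_full_text = full_text_screened - REMOVED_AT_FULL_TEXT
    added_from_references = FROM_REFERENCE_LISTS - EXCLUDED_FROM_REFERENCE_LISTS
    return {
        "combined": combined,
        "after_deduplication": after_deduplication,
        "full_text_screened": full_text_screened,
        "retained_after_full_text": retained_after_full_text,
        "added_from_references": added_from_references,
        "included": retained_after_full_text + added_from_references,
    }


if __name__ == "__main__":
    for stage, count in selection_flow().items():
        print(f"{stage}: {count}")
```

Each derived stage matches the corresponding figure in the text (11 429, 6724, 49, 30, five and 35).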
Sample sizes varied widely, ranging from 756 in a survey of older participants 45 to 6.2 million in a nationwide study using Danish register data, 38 and the length of follow-up periods ranged between 2 and 20 years. Most studies had age restrictions, with about half focused on older populations (50 years and over), and one study focused on the very old (80 years and over). 36 Most samples included both males and females, apart from two studies including males only 26 37 and three studies including females only. 42 47 51 Three studies focused on US veterans, a predominantly male population. 39 44 48 The data sources used were a combination of administrative data, including primary or secondary care records, disease registries and health insurance data (24 studies), and survey data (18 studies).

There were common datasets used across studies, such as the Swedish National study on Ageing and Care in Kungsholmen. 27 31-33 In seven studies, survey data were combined with administrative data sources. 27 28 30-33 43 In five survey-based studies, questionnaire data were supplemented by medical examination records or cognitive or laboratory tests. 29 34 45 52 57 Informed consent of participants was mentioned in 10 of the 18 studies using survey data.
Methods of disease and multimorbidity ascertainment
Studies based on administrative data relied on clinician-diagnosed diseases, often using standardised diagnosis codes from the International Classification of Diseases, that is, the 9th revision (ICD-9), its Clinical Modification (ICD-9-CM) or the 10th revision (ICD-10). All survey-based studies used participant self-report for disease identification. Studies that combined survey data with other sources ascertained disease status mostly through clinician diagnosis, 27 28 30-33 43 but some supplemented this with laboratory and cognitive tests. 29 34 45 52 57 The number of diseases that were considered to contribute to the measure of multimorbidity varied widely, ranging from three 30 41 42 to a very large number based on three levels of ICD-10 codes. 24 38 Studies using survey data used a narrower range of diseases than those drawn from administrative data. The precise list of diseases was never uniform between studies (see appendix C for full details), but the rationale for choosing them was usually described. For example, some studies included diseases with high prevalence and risk of disability and mortality 34 45 or diseases that were assessed/validated by clinicians. 27 28 31-33 Some used lists based on the Charlson and Elixhauser multimorbidity indices, 26 35-37 43 46 56 but these were sometimes augmented with extra conditions 37 or reduced due to data sensitivity restrictions. 35

Approaches to the measurement of multimorbidity trajectories
To develop longitudinal measures of multimorbidity, studies tended to take one of two broad approaches. The most common was to take repeated measures of multimorbidity status over time for each individual. This mainly involved constructing unweighted or weighted counts of diseases at regular intervals for each individual, thus conceptualising multimorbidity as a continuum (eg, refs 52 57), although a few still used a binary measure of two or more chronic conditions. 29 40 The second broad approach explored disease transitions. 24
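The count-based measures described above can be illustrated with a minimal sketch. The data, the disease names and the weights below are hypothetical and are not taken from any of the reviewed studies or from a published index; the sketch only shows how an unweighted count, a weighted count and a binary indicator (two or more conditions) could be derived at each measurement wave.

```python
# Hypothetical input: for each person, the set of diseases recorded at each wave.
WAVES = {
    "p1": [
        {"hypertension"},
        {"hypertension", "diabetes"},
        {"hypertension", "diabetes", "heart_failure"},
    ],
    "p2": [set(), {"asthma"}, {"asthma"}],
}

# Illustrative severity weights only (an assumption, not a published index).
WEIGHTS = {"hypertension": 1, "diabetes": 1, "asthma": 1, "heart_failure": 2}


def count_trajectory(disease_sets, weights=None):
    """Return one multimorbidity score per wave.

    Without weights the score is the number of diseases;
    with weights it is the sum of the weights of the diseases present.
    """
    if weights is None:
        return [len(diseases) for diseases in disease_sets]
    return [sum(weights[d] for d in diseases) for diseases in disease_sets]


def multimorbid_by_wave(disease_sets, threshold=2):
    """Binary measure: True at each wave with at least `threshold` diseases."""
    return [len(diseases) >= threshold for diseases in disease_sets]


if __name__ == "__main__":
    for person, sets_by_wave in WAVES.items():
        print(person,
              count_trajectory(sets_by_wave),
              count_trajectory(sets_by_wave, WEIGHTS),
              multimorbid_by_wave(sets_by_wave))
```

The binary indicator discards information that the counts retain: in this toy data, person p2 acquires a condition but never crosses the threshold, so the binary measure shows no change over time.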

Types of methodology
Breaking this down further, we identified five broad analytical approaches: constructed variables of multimorbidity change, multilevel regression modelling, transition and data mining methodologies, visual approaches (articles summarised in table 4) and longitudinal group-based methodologies (articles summarised in table 5). Note that some studies employed more than one type of approach (eg, ref 42). In the first approach, four articles created variables of multimorbidity change. 29 35 53 57 In one study, intraindividual change in Charlson Comorbidity Index (CCI) between baseline and later time points was used, 35 and in another, change was defined as a transition to two or more conditions or the acquisition of additional conditions. 29 Two studies used simple methods to construct morbidity trajectory groups (eg, 'constant high', 'constant medium' and 'constant low') 53 and disease transition stages (eg, 'healthy' and 'healthy to a single chronic disease'). 57 After creating these categorical dependent variables, the authors used them in regression analysis to assess their association with health expenditures, 53 diet 57 and physical activity and functioning. 29 The next approach, employed by 14 studies (table 4), 27 31-36 40 42 43 45 49 50 52 was multilevel regression modelling (variously referred to as random effects models, growth curve models, hierarchical linear models or multilevel models). These studies analysed repeated measures of multimorbidity within each individual, considering this as a 'trajectory' or 'growth curve'. The dependent variable was typically a count of diseases or a multimorbidity index measured repeatedly, and the coefficients assessed change in this over time, with many models including random effects for both the intercept and slope. 
One study used the regression estimates (ie, intercept and slope coefficients for multimorbidity) to create categories capturing the pace of multimorbidity, for example, 'rapidly accumulating' and 'slowly accumulating', which were used for further modelling. 31 Some of these studies also investigated whether certain covariates such as biomarkers, 27 34 sociodemographics and life experiences 33 affected the pace of change in multimorbidity by including an interaction term between time and the respective covariates.
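As a point of reference, a generic linear growth curve model with a random intercept, a random slope and a time-by-covariate interaction can be written as below. This is a textbook formulation given under stated assumptions, not the specification of any particular reviewed study: the notation ($Y_{ti}$ for the multimorbidity score of individual $i$ at occasion $t$, $x_i$ for a baseline covariate) is introduced here for illustration, and studies modelling disease counts may instead have used a count-data link function.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{time}_{ti} + \varepsilon_{ti},
  \qquad \varepsilon_{ti} \sim N(0, \sigma^{2}) \\
\text{Level 2:}\quad & \beta_{0i} = \gamma_{00} + \gamma_{01}\,x_{i} + u_{0i} \\
                     & \beta_{1i} = \gamma_{10} + \gamma_{11}\,x_{i} + u_{1i} \\
\text{Combined:}\quad & Y_{ti} = \gamma_{00} + \gamma_{01}\,x_{i} + \gamma_{10}\,\mathrm{time}_{ti}
  + \gamma_{11}\,x_{i}\,\mathrm{time}_{ti} + u_{0i} + u_{1i}\,\mathrm{time}_{ti} + \varepsilon_{ti}
\end{aligned}
```

In this formulation, $\gamma_{10}$ is the average rate of change in multimorbidity, $u_{0i}$ and $u_{1i}$ are the individual deviations in starting level and pace, and $\gamma_{11}$ is the interaction term: a non-zero $\gamma_{11}$ indicates that the covariate $x_i$ is associated with a faster or slower pace of accumulation.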
The next approach, employed by nine studies, 24-26 30 38 39 41 44 55 focused on modelling transitions between specific disease states. Six of the studies focused on a limited number of diseases to make the analysis feasible. 25 30 39 41 44 55 Some studies applied state transition principles, using Markov models, 44 acyclic multistate models 41 or state transition modelling. 26 30 Another two studies employed Bayesian techniques including a multilevel temporal Bayesian network 25 and a longest path algorithm to identify the most probable sequence from/to a specific disease following an unsupervised multilevel temporal Bayesian network analysis. 39 One paper derived a disease progression network from real data and used this for further microsimulation. 55 Finally, two studies used a data-driven approach to create 'temporal disease trajectories' by combining significant temporal directed pairs from all disease pairs possible. 24 38 Transition analysis also enabled the identification of longitudinal clusters. 44

Three studies 23 42 51 used visual methods to describe disease sequences or multimorbidity acquisition sequences. Ashworth et al 23 used alluvial plots to illustrate multimorbidity acquisition sequences based on date of disease onset; although useful for understanding the order of disease (co-)occurrence, these visualisations are unable to account for the pace of multimorbidity progression. Aalen-Johansen curves (a multistate generalisation of cumulative incidence curves) were used to represent the accumulation of multimorbidity graphically, 51 and a Sankey diagram was used to show the longitudinal progression and transitions to each disease and disease combination. 42

Table 4 Methods of studies taking four analytical approaches (multimorbidity change variables, regression, transition modelling and visual approaches)
Constructed measures of multimorbidity change (from zero or one condition at baseline to two or more conditions) and of worsening of multimorbidity (from multimorbidity at baseline to additional conditions at follow-up)
Fraccaro et al, 2016 35 | All-cause mortality | Multimorbidity change was measured by differences between baseline CCI and 1-year, 5-year and 10-year follow-up CCI scores as a proportion of mortality rate. Survival analysis (Cox regression) estimated mortality rates as a function of age, gender and CCI scores (fixed and time varying).
Gellert et al, 2018 36 | Level of multimorbidity (count of comorbidities based on Elixhauser) | Linear mixed models (random intercepts and slopes) estimated differential increase in the number of comorbidities over 25 calendar quarters prior to death in centenarian, nonagenarian and octogenarian cohorts.
Perez et al, 2020 27 | Level of multimorbidity (count of chronic conditions) | Linear mixed models (random intercepts and slopes) were employed to analyse the association between baseline total serum glutathione levels and level of multimorbidity.
Quiñones et al, 2011 50 | Level of multimorbidity (count of chronic conditions) | Linear mixed models (random intercepts and slopes) analysed ethnic variations in level of multimorbidity.
Quiñones et al, 2019 49 | Level of multimorbidity (count of chronic conditions) | Negative binomial generalised estimating equation (GEE) models with a first-order autoregressive covariance structure were used to assess the relationship between chronic disease accumulation and race/ethnicity.
Repeated measures logistic regression using GEEs was used to identify risk factors for developing three conditions and their combinations.
Generalised linear mixed models were used to estimate the associations between predictors and the progression to multimorbidity.

The final approach, employed by seven studies, 28 37 46-48 54 56 was to construct meaningful categories of longitudinal multimorbidity patterns and associate these with other covariates (summarised in table 5). Methodologies included latent class analysis, latent class growth analysis, growth mixture modelling or group-based trajectory modelling, and these typically identified between four and six groups of distinct longitudinal multimorbidity patterns. Two studies took an associative approach to explore how specific diseases cluster longitudinally. 48 54 For example, Hsu 54 found four trajectory groups: 'low risk', 'cardiovascular risk only', 'gastrointestinal and chronic non-specific lung disease' and 'multiple risks'. The other five studies focused on stages of accumulation. Hiyoshi et al 37 found four trajectory groups ranging from 'a constant low trajectory' to 'a high start and a slow increase trajectory'. Generally, these clusters incorporated data on the initial level of multimorbidity and the accumulation pattern over time, and nearly all showed accumulation (the exception being Kim et al, 56 which identified some groups with decreasing morbidities).
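Several of the group-based studies chose the number of trajectory groups using the Bayesian Information Criterion (BIC). The sketch below illustrates that selection step only, assuming the models have already been fitted elsewhere. It uses the common definition BIC = k ln(n) - 2 ln(L), where lower values are preferred; some software reports BIC under a different sign convention. The log-likelihoods and parameter counts are hypothetical and do not come from any reviewed study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC under the 'lower is better' convention: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_n_groups(fits, n_obs):
    """Pick the number of groups with the lowest BIC.

    `fits` maps number of groups -> (maximised log-likelihood, free parameters).
    Returns the selected number of groups and the BIC of every candidate.
    """
    scores = {groups: bic(ll, k, n_obs) for groups, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for models with two to six trajectory groups.
HYPOTHETICAL_FITS = {
    2: (-5210.0, 5),
    3: (-5105.0, 8),
    4: (-5060.0, 11),
    5: (-5052.0, 14),
    6: (-5049.0, 17),
}

if __name__ == "__main__":
    best, scores = select_n_groups(HYPOTHETICAL_FITS, n_obs=1000)
    for groups, score in sorted(scores.items()):
        print(f"{groups} groups: BIC = {score:.1f}")
    print("selected:", best)
```

In this toy example the likelihood keeps improving as groups are added, but beyond four groups the gain no longer offsets the penalty for extra parameters. In practice, as noted for one of the reviewed studies, BIC is often combined with a parsimony principle and with substantive interpretability of the groups.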

Table 5
Finite mixture modelling (Proc TRAJ in SAS) was used, with a zero-inflated distribution. Optimal number of groups determined by Bayesian Information Criterion (BIC). | Six groups: 'robust' (no conditions), 'initiates' (none at baseline, increase over time), 'slow initiates' (some at baseline and gradual increase over time), 'accelerated initiates' (none at baseline and quick increase followed by deceleration), 'chronic low' (steady comorbidity over time), 'ailing' (moderate levels of comorbidity at baseline and steady increase over time), 'frail' (high comorbidity at baseline, remaining high over time).
Hiyoshi et al, 2017 37 | Group-based trajectory modelling, using a zero-inflated distribution. Optimal number of groups determined using BIC. | Four groups identified: 'a constant low trajectory', 'a low start and an acute increase trajectory', 'medium start and a slow increase trajectory' and 'a high start and a slow increase trajectory'.
Hsu, 2015 54 | Multiple group-based trajectory model (Proc TRAJ in SAS). Morbidity was set to follow a logistic model. The optimal group number was determined using the BIC and parsimony principle. | Four chronic disease trajectories were identified: 'low risk', 'cardiovascular risk only', 'gastrointestinal and chronic non-specific lung disease' and 'multiple risks'.
Five groups identified and validated: 'no recorded chronic problems', 'developed a first chronic morbidity over 3 years', 'a developing multimorbidity group', 'increasing number of chronic morbidities' and 'a multichronic group with many chronic morbidities'.

Results of the studies: outcomes and risk factors
Prediction of other health outcomes
Seven of the studies used multimorbidity trajectories to predict subsequent health outcomes, 31 35 43 45 53 55 56 including self-reported health, cognitive ability, disability, medical utilisation and mortality (table 6). Among older adults, results showed that an increase in multimorbidity over 10 years was associated with worse reported health 43 and that those who developed multimorbidity faster had greater risk of disability. 31 In one study, changes in multimorbidity were found to be more predictive of mortality than baseline multimorbidity. 35 By contrast, another study confirmed that a change in CCI predicts mortality but not necessarily better than a cross-sectional estimate of multimorbidity. 43 Zhu et al 55 (table 7). Increasing age, although often accounted for in analyses, emerged as a dominant risk factor for acquisition, worsening or progression of multimorbidity. 29 42 As expected, younger age groups were more likely to belong to a non-chronic healthier cluster. 28 However, trajectories starting with depression were more prevalent in younger individuals. 23 Younger cohorts were also found to be more likely to develop multimorbidity and to do so at a younger age. 40 A few studies reported gender differences, with conflicting results. While one study found that those in the 'multiple risks' group were more likely to be female, 54 another two studies found that men were more likely to transition between disease states than women. 30 41 Four studies investigated ethnic variations. 23 41 49 50 In two US studies, compared with non-Hispanic whites, black Americans had a higher rate of multimorbidity at baseline along with a slower rate of disease accumulation over time, while Hispanic participants tended to start with fewer diseases and increase more rapidly. 49 50 Different ethnicities also had different disease transition patterns. In the USA, white individuals were more likely to transition from ischaemic heart disease to death, while Asian and Native Hawaiian and Pacific Islander individuals were more likely to transition from diabetes to diabetes plus chronic kidney disease. 41 In the UK, disease-specific sequences also differed by ethnicity: for example, the white ethnic group was dominated by depression as a starting point, while diabetes was the most common starting point in the black ethnic group. 23 The studies also explored a range of sociodemographic determinants including area-based deprivation, education, occupation, income and marital status. 
Results largely confirm those found with cross-sectional analyses, with lower SES associated with worse multimorbidity trajectories. For example, lower levels of education were associated with a higher rate of multimorbidity accumulation 33 42 or worse multimorbidity trajectories. 47 People living in more deprived areas were more likely to be in an evolving or multichronic multimorbidity cluster 28 and to have trajectories with diabetes and depression as the most common starting point. 23 Health and health behaviours also showed associations. A Chinese study showed that a greater consumption of fruits, vegetables and grain slowed the development of multimorbidity. 57 Alcohol consumption, smoking and physical inactivity were associated with worse multimorbidity trajectory patterns. 30 42 47 Physical function (measured by gait speed and grip strength at baseline) was associated with development and worsening of multimorbidity over 2 years in a sample of adults aged 50 years and over. 29 Being overweight or obese was also associated with a developing or worsening multimorbidity trajectory. 29 42 47 Two studies investigated the role of specific biomarkers, finding that chronic inflammation, system dysregulation and multisystem failure are associated with a faster rate of multimorbidity accumulation. 27 34 There were associations with family factors: being married was found to be protective against greater multimorbidity accumulation, 37 54 and young parenthood (younger than 25 years) and extremely high parity (nine or more births) were significant risk factors. 46 Finally, a negative attitude towards life and health, such as low life satisfaction and negative health outlook, was associated with poorer multimorbidity trajectories. 32

Table 6
Accumulation of multimorbidity was associated with faster decline in verbal fluency but seemed to have no effect on memory decline, in older adults without mild cognitive impairment or dementia.
Kim et al, 2018 56 | Mortality | The 'consistently high' multimorbidity trajectory group had the highest risk of mortality at 1-year, 3-year and 5-year follow-ups.
Zeng et al, 2014 43 | Self-reported health, number of primary care visits, inpatient and emergency admissions and mortality | Growth curve models gave marginally better fitting models for the outcomes of self-reported general health status, but mortality and inpatient status were best predicted by multimorbidity snapshot prevalence the year before the survey.
Zhu et al, 2018 55 | Life expectancy | Diabetes plus hypertension plus complications reduced life expectancy the most. The earlier the onset of multimorbidity, the greater the reduction in life expectancy.

Table 7
Study, year | Risk factors | Findings of association analysis
Birth cohort | In each succeeding cohort, multimorbidity rates were higher and multimorbidity emerged earlier. Differences persisted independently of the risk factors for multimorbidity and period effect.
Dekhtyar et al, 2019 33 | Elementary education (early adulthood), lifelong active occupation (midadulthood), social network (later life) | Adults over 60 years old with higher than elementary education, lifelong active occupations and richer social networks had slower multimorbidity accumulation. The association between childhood circumstances and multimorbidity accumulation was attenuated by subsequent (mid and late) life experiences. Rich social networks reduced the speed of disease accumulation irrespective of lifelong job stress and level of education.
Fabbri et al, 2015 34 | Biomarkers: IL-6, IL-1ra, TNF-α receptor II and DHEAS (as markers of chronic inflammation and system dysregulation) | Multimorbidity development with age was not linear, and significantly accelerated at older ages. Higher IL-6, IL-1ra and TNF-α receptor II and low DHEAS were associated with higher multimorbidity at baseline, independent of age, sex, BMI and education. Higher IL-6 and steeper increase in IL-6 predicted an accelerated rise in multimorbidity over 9 years of follow-up.
Healthy lifestyle habits were strongly associated with lower incident multimorbidity of cancer and cardiometabolic diseases. The risk of transitioning to multimorbidity after having developed a first of the three chronic diseases was higher in men than in women.
Hanson et al, 2015 46 | Parity, timing of childbearing, birth outcomes of offspring | High parity, early childbearing and adverse offspring birth outcomes are associated with particular later-life comorbidity patterns and trajectories, when controlling for early-life conditions (age at parental death, childhood socioeconomic status, familial excess longevity and religious participation).
Hiyoshi et al, 2017 37 | Income and marital status | Income and physical, cognitive and psychological function were associated with trajectory group membership in unadjusted analysis but not in fully adjusted analysis.
Hsu, 2015 54 | Gender, education, physical function, depressive symptoms, life satisfaction, number of health examinations, smoking and drinking | Those in the 'multiple risks' group were more likely to be female, less educated, with more physical function difficulties, more depressive symptoms, lower life satisfaction, more health examinations and not to smoke or drink. Members in the 'CVD risk only' and 'multiple risks' groups were more likely to have physical function difficulties and depressive symptoms.
Jackson et al, 2015 47 | Overweight or obesity, education, difficulty managing on income, smoking, alcohol consumption and physical activity | Being overweight or obese, having a lower education level and difficulty managing on income were associated with belonging to an accumulation trajectory. Smoking, alcohol intake and physical activity level also appeared to be important risk factors for the development of some trajectories.

Study, year | Risk factors | Findings of association analysis
Lappenschaar et al, 2013 25 | Urbanisation, multimorbidity at baseline | Urbanisation level of a general practice is associated with a higher cumulative incidence of chronic cardiovascular conditions, in particular obesity, hypertension, dyslipidaemia, diabetes mellitus and ischaemic heart disease. Disease accumulation rate is higher when multimorbidity is already present at baseline.
Perez et al, 2020 27 | Total serum glutathione (biomarker of multisystem failure) | Lower baseline levels of total serum glutathione were associated with a higher rate of multimorbidity development, independent of covariates.
Quiñones et al, 2011 50 | Race/ethnicity | White Americans differ from black and Mexican Americans in terms of level and rate of change of multimorbidity. Mexican Americans demonstrate lower initial levels and slower accumulation of comorbidities relative to white Americans. In contrast, black Americans showed an elevated level of multimorbidity throughout the 11-year period of observation, although their rate of change slowed relative to white Americans.
Quiñones et al, 2019 49 | Race/ethnicity | Non-Hispanic black respondents had higher initial chronic disease counts, but slower accumulation rates, than non-Hispanic white respondents. Hispanic respondents had lower initial chronic disease counts but faster accumulation than non-Hispanic white respondents.
 | Age, sex and race/ethnicity | Men were more likely to transition between states than women. Whites had the highest risk of transitioning from ischaemic heart disease to death. Asians and Native Hawaiian and Pacific Islanders were more likely to transition from diabetes to diabetes and chronic kidney disease.
Strauss et al, 2014 28 | Age and deprivation | Younger age groups were more likely to be in the non-chronic cluster than older groups. Females were more likely to develop or start with multimorbidity than males. More deprived individuals were more likely to be in the evolving (rather than static) multimorbidity cluster.
Xu et al, 2018 42 | Sex, age, marital status, income, education, obesity, physical activity, smoking and immigrant status | Odds of multimorbidity progression increased over time and with age. Women with stroke were more likely to progress to another disease and become multimorbid than those with other baseline characteristics. In adjusted models, accumulation of multimorbidity was associated with non-married status, low income, lower education, obesity, sedentary behaviour, smoking and immigrant status. Obesity was differently associated with different sequences.
BMI, body mass index; CVD, cardiovascular disease; IL-6, interleukin 6; TNF-α, tumour necrosis factor alpha.

DISCUSSION
Understanding longitudinal multimorbidity trajectories is an important public health priority for clinicians, academics and funders alike. 4 This review aimed to take a systematic approach to scope existing research in the field with a focus on summarising commonly used methodological approaches and substantive findings. In doing so, we provide, to our knowledge, the first review to address longitudinal studies of multimorbidity, in a field saturated by cross-sectional research. 2 12 15 58 A strength of this review is the systematic and robust approach taken to searching and screening articles for inclusion and reviewing the selected studies, which should limit selection and extraction bias. We used predefined search terms, inclusion criteria and data extraction tools, and we engaged in double screening and extraction. 22 The scoping review process meant that we summarised a wide variety of evidence, and therefore, it was not possible to perform a meta-analysis or use a standardised critical appraisal tool. Nevertheless, we provide a narrative-style critical summary of the selected articles.
The results demonstrate that, despite widespread expressed interest, relatively few studies take a longitudinal approach to multimorbidity. All the studies included were published within the last decade, and the vast majority used data from high-income countries. The studies showed great variability in sampling strategy, ways of measuring multimorbidity and statistical approaches to characterising multimorbidity longitudinally. Methods for identifying longitudinal patterns ranged from counts of diseases to cluster or group-based analyses, to modelling transitions between diseases or disease sequences, and these were differentially useful for modelling accumulation, sequencing, clusters or transitions. From a substantive perspective, the studies showed associations with adverse outcomes such as worse reported health and greater risk of disability and mortality, as we might expect based on the existing cross-sectional research. A range of multimorbidity trajectory risk factors were also identified, including sociodemographic factors, health behaviours, physical function, biomarkers, marriage and fertility factors, and attitudinal factors.
A limitation of narrative reviews is that they might select evidence to support a particular stance and do not necessarily take enough steps to eliminate selection bias. However, we selected a comprehensive set of items to extract before starting the review, and we engaged in double screening and extraction. Therefore, our methodological approach should limit selection and extraction bias. Our review did not engage in a critical appraisal of the quality of the selected studies. However, when the aim of a scoping review is to provide an overview of evidence (as ours was), an assessment of methodological limitations and risk of bias of the evidence is not necessarily relevant and is generally not performed. 18
The review has highlighted some geographical bias in the distribution of multimorbidity research. In particular, there was an under-representation of longitudinal multimorbidity research in low-income and middle-income countries (LMICs), which likely reflects the geographical focus of multimorbidity research more generally. 59 This may be due to underinvestment in multimorbidity research in LMICs, coupled with challenges of collecting or accessing relevant data. For example, most of the selected studies used electronic medical records or large-scale longitudinal surveys, which are rare in developing countries. Nevertheless, due to the population ageing trends in LMICs, multimorbidity is already a major public health issue, with potentially more complex comorbidity patterns (eg, undiagnosed conditions or interactions with infectious disease), which deserve research using a longitudinal approach. Recently published work in LMICs has tended to employ a cross-sectional design to analyse multimorbidity 60-63 and therefore was not eligible for inclusion in this review.
In addition, none of the studies in the review made cross-country comparisons, which may help to generate stronger evidence about disease trajectories and mechanisms involved in multimorbidity development and progression. For example, comparable cross-country patterns may suggest common biological mechanisms, whereas divergent findings could suggest moderation or prevention of disease processes by policy approaches to treatment, healthcare settings and institutional structures.
The selected studies used a great variety of data sources including administrative data (primary and secondary care data, health insurance claim data, patient and disease registries) and survey data, leading to variations in sample size and issues of generalisability. Issues of small sample size were only discussed in a limited number of studies, mostly in relation to subgroups such as ethnic minorities. 23 41 49 50 Despite the use of large surveys or administrative data, the majority of studies expressed doubts about the generalisability of their findings. For example, well-educated and wealthy individuals were reported as overrepresented in longitudinal survey samples. 27 29 31-33 42 47 Studies using administrative data sources typically investigated multimorbidity based on complete follow-up and excluded those who died, generating immortal time bias and investigating potentially healthier populations. 39 53 In other studies, the choice of data sources themselves induced bias, for example, where samples were based on health service users. 35 48 Others explained that their sample might be representative but only of a particular group in a specific region (eg, Utah 46). Another issue of generalisability, mentioned in previous reviews, 15 was related to the heterogeneous multimorbidity measures used. 64 A wide variety of different diseases were included, and only a few studies used 'standard' measurement of multimorbidity like the Charlson 20 or Elixhauser 21 indices. Due to the diversity of data sources, diseases were ascertained in multiple ways, using clinical diagnosis, laboratory results, medication use and self-report. The only common measurement feature was that studies in this review tended not to define multimorbidity as the presence of two or more diseases.
The choice of statistical methods served to highlight or obscure different aspects of multimorbidity. For example, the most common approach, multilevel or single-level regression modelling, emphasises accumulation, providing the opportunity to simultaneously evaluate the baseline level of multimorbidity and the change (slope) in multimorbidity, and how these differ between groups with different characteristics. However, it tends to obscure the role of specific diseases by collapsing all morbidity into a single count or index, and we cannot tell, for example, whether faster accumulation is predominantly occurring among certain types of disorders. Complementary to regression approaches, group-based methodologies aimed to classify individuals into types of multimorbidity accumulation. A minority of studies employed the cluster-based approach to understand how specific diseases co-occur over time, 48 54 which extends cross-sectional approaches often referred to as associative multimorbidity. 11 This has the advantage of providing a more detailed understanding of the constellation of diseases that contribute to distinct trajectories but, due to the rarity of some diseases, will tend to find only highly prevalent clusters and is not suitable for rarer disease trajectories.
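The baseline-and-slope idea behind the regression approaches described above can be illustrated with a minimal Python sketch. It is a simplified two-stage stand-in for a multilevel growth curve model, not the method of any reviewed study: a straight line is fitted to each individual's disease counts, and intercepts and slopes are then averaged within groups. The group labels, observation years and counts are hypothetical.

```python
from statistics import mean

def ols_line(times, counts):
    """Return (intercept, slope) of the least-squares line through (time, count) points."""
    t_bar, y_bar = mean(times), mean(counts)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, counts))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def group_trajectories(records):
    """records: list of (group, times, counts).

    Returns {group: (mean intercept, mean slope)}, i.e. an average baseline
    level and an average rate of disease accumulation per group.
    """
    per_group = {}
    for group, times, counts in records:
        per_group.setdefault(group, []).append(ols_line(times, counts))
    return {
        g: (mean(i for i, _ in fits), mean(s for _, s in fits))
        for g, fits in per_group.items()
    }

# Hypothetical disease counts observed at years 0, 2, 4 and 6.
records = [
    ("less deprived", [0, 2, 4, 6], [0, 0, 1, 1]),
    ("less deprived", [0, 2, 4, 6], [1, 1, 1, 2]),
    ("more deprived", [0, 2, 4, 6], [1, 2, 3, 4]),
    ("more deprived", [0, 2, 4, 6], [0, 1, 2, 3]),
]
summary = group_trajectories(records)
for group, (intercept, slope) in summary.items():
    print(f"{group}: baseline {intercept:.2f}, slope {slope:.3f} diseases/year")
```

Because each person is reduced to a single count per wave, the sketch also shows the limitation noted above: the slope says how fast diseases accumulate but not which diseases are involved. A true multilevel model would estimate the individual and group parameters jointly rather than in two steps.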
Some studies conceptualised longitudinal multimorbidity as transitioning between different disease states, using either structured Markov frameworks, 44 multistate modelling 26 41 or more data-driven, unsupervised approaches. 24 38 The former, more structured approach to disease transition tended to provide a very detailed understanding of interactions between a small set of diseases, which can provide useful evidence for targeting prevention at those with the first disease, a risk stratification approach. The latter, data-driven approaches provide very comprehensive evidence for population-based strategies but rely on large datasets collected over a number of years and on appropriate clinical expertise to interpret the patterns identified through artificial intelligence (AI). Given the growing interdisciplinary collaborations between epidemiology and computer science, data-driven research will continue to expand in the coming years and extend to prediction modelling and projections. One of the strengths of computer science, and of recent developments in AI with machine learning, is the ability to work towards solutions that can combine prediction models and compare different treatment options for cohorts of patients (eg, what is the likelihood that a medication commonly used for one chronic condition may speed up the progression of another condition or lead to the development of a new condition).
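As a rough illustration of the disease-transition perspective, the following Python sketch estimates first-order transition probabilities between disease states from observed sequences. It assumes equally spaced observation waves and ignores censoring, covariates and time spent in each state, all of which a full Markov or multistate model would handle. The state labels and sequences are hypothetical and are not taken from any reviewed study.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from observed state sequences.

    sequences: iterable of lists, one per individual, giving the disease state
    recorded at each consecutive observation wave.
    Returns {state: {next_state: probability}}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each state with the state observed at the next wave.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }

# Hypothetical sequences over four waves; "CKD" stands for chronic kidney disease.
sequences = [
    ["none", "diabetes", "diabetes", "diabetes+CKD"],
    ["none", "none", "diabetes", "diabetes+CKD"],
    ["none", "diabetes", "diabetes", "diabetes"],
    ["none", "none", "none", "none"],
]
probs = transition_probabilities(sequences)
for state, nexts in probs.items():
    for nxt, p in nexts.items():
        print(f"P({state} -> {nxt}) = {p:.2f}")
```

With only a handful of states, the estimated probabilities are easy to read, which mirrors the detailed but narrow picture that structured transition models give. As the number of diseases grows, the number of possible states rises sharply, which is one reason the data-driven approaches need large datasets.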
Compared with cross-sectional studies, longitudinal approaches provide more detailed insight into the role of specific risk factors. For example, while age is a known risk factor, this review highlights how older individuals, once multimorbid, show acceleration of multimorbidity. 29 Multimorbidity trajectory patterns varied by ethnicity, 23 41 49 50 marital status, 37 42 educational level and area-level deprivation, 28 33 42 47 confirming some patterns observed in cross-sectional data. A useful exploitation of longitudinal data, not included in these studies, would be to explore how change in risk factors such as SES or marital status influences different multimorbidity trajectories, which may help identify at-risk groups and target prevention strategies. As highlighted by Zhu and colleagues, 55 the earlier the multimorbidity onset in the life course, the greater the life years lost for that individual. Therefore, future research should seek to take a life course approach in order to disentangle early preventable factors of multimorbidity onset but also to determine later life factors influencing additional disease accumulation. Risk factors should be considered at the level of the individual (life course and contemporaneous factors), medication use and the wider social environment, including poor environmental conditions, and interaction with institutional structures (eg, healthcare system organisation). The increasing availability of 'big data', which links longitudinal administrative data on individuals with health and geospatial data, will make these holistic approaches technically possible. Future research should focus on generating the knowledge required to develop interventions aimed at preventing both the onset and the worsening of multimorbidity.

CONCLUSION
This review identifies a small but developing body of literature attempting to describe multimorbidity longitudinally. There was a notable lack of studies in LMICs, as well as of studies exploring minority ethnic groups. A wide variety of complementary methods are employed, emphasising factors associated with greater disease accumulation, speed of accumulation and specific disease transition processes. Methodologies based on disease ordering or sequence were seldom explored by the studies, and while it is challenging to identify the exact timing of disease, future research could seek to investigate the disease sequencing that underlies the accumulation process. Risk factors for trajectory types could inform future intervention and prevention strategies at critical life course periods and disease progression turning points. Initiatives to give researchers greater access to relevant data sources, such as the HDR UK initiative to harmonise datasets for multimorbidity research, are crucial and should become more widespread in order to provide the insight into multimorbidity processes needed to inform prevention and policymakers' strategies at a global scale.
Data availability statement Data sharing not applicable as no datasets generated and/or analysed for this study.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
Open access This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.