Diagnosed prevalence of Ehlers-Danlos syndrome and hypermobility spectrum disorder in Wales, UK: a national electronic cohort study and case–control comparison

Objectives: To describe the epidemiology of diagnosed hypermobility spectrum disorder (HSD) and Ehlers-Danlos syndromes (EDS) using linked electronic medical records. To examine whether these conditions remain rare and primarily affect the musculoskeletal system.
Design: Nationwide linked electronic cohort and nested case–control study.
Setting: Routinely collected data from primary care and hospital admissions in Wales, UK.
Participants: People within the primary care or hospital data systems with a coded diagnosis of EDS or joint hypermobility syndrome (JHS) between 1 July 1990 and 30 June 2017.
Main outcome measures: Combined prevalence of JHS and EDS in Wales. Additional diagnosis and prescription data in those diagnosed with EDS or JHS compared with matched controls.
Results: We found 6021 individuals (men: 30%, women: 70%) with a diagnostic code of either EDS or JHS. This gives a diagnosed point prevalence of 194.2 per 100 000 in 2016/2017 or roughly 10 cases in a practice of 5000 patients. There was a pronounced gender difference of 8.5 years (95% CI: 7.70 to 9.22) in the mean age at diagnosis. EDS or JHS was not only associated with high odds for other musculoskeletal diagnoses and drug prescriptions but also with significantly higher odds of a diagnosis in other disease categories (eg, mental health, nervous and digestive systems) and higher odds of a prescription in most disease categories (eg, gastrointestinal and cardiovascular drugs) within the 12 months before and after the first recorded diagnosis.
Conclusions: EDS and JHS (since March 2017 classified as EDS or HSD) have historically been considered rare diseases only affecting the musculoskeletal system and soft tissues. These data demonstrate that both these assertions should be reconsidered.

For many decades, studies have quoted a prevalence rate of 1 in 5000 for EDS, although the origin of this figure is unclear; it seems to appear first in a medical textbook 1 2 as an unreferenced "reasonable estimate". Thus, these syndromes have long been categorised as rare diseases, defined in the European Union as those affecting fewer than 50 in 100,000 people 3 . Kulas Søborg et al. 4 recently reported a prevalence of 20 per 100,000 for EDS in a nationwide Danish cohort based on secondary health care data up to 2012, but importantly, these data did not include patients who had received the considerably more common JHS diagnosis, now included in the latest revised classification. A combined population prevalence for JHS and EDS of around 120 per 100,000 can be extrapolated from a Swedish study focussing on comorbid mental health issues 5 , but no investigators have thus far set out to investigate the combined diagnosed prevalence of JHS/EDS within a population.
Although common features of these conditions are arthralgia, soft tissue injury and joint instability 6 , over the last two decades it has become clear that their clinical features are not limited to musculoskeletal and cutaneous involvement, but are multisystemic [7][8][9] . In the special edition of the American Journal of Medical Genetics dedicated to EDS in March 2017, papers covered links to cardiovascular autonomic 10 and gastrointestinal dysfunction 11 as well as psychiatric and neurodevelopmental disorders 5 12 . Chronic disabling fatigue 13 and pain syndromes 14 were also recognised as common and multifactorial issues.
Gynaecological 15 16 and obstetric 17 issues are also reported in this population. There is also an emerging link with the potentially life-threatening condition of Mast Cell Activation 18 19 . There is some emerging evidence that nutritional deficiencies 20 21 may play a key role: they seem to be more prevalent in these patients and are possibly implicated in the development of some of the complications.
Early diagnosis is found to be crucial to patients 22 to enable the provision of appropriate treatment, as well as to prevent later onset complications 7 . Establishing the diagnosis of EDS/HSD is often problematic for patients, which interferes with the early detection, treatment and prevention of further escalations of recognised symptoms, disability and more elaborate complications. A mean of 14 years elapses between the first clinical manifestations and the actual diagnosis 23 . For 25% of patients this delay lasts over 28 years 23 . "A misdiagnosis was given to 56% of patients [resulting in] inappropriate treatment in 70% of the patients… For 86% of the patients, the delay in diagnosis was considered responsible for deleterious consequences." 23(p.137) It is possible that some of these difficulties arise from the widespread belief amongst clinicians that EDS is rare. It is therefore of clinical importance to establish better estimates of current prevalence. Conventional studies tend to be based in restricted clinical settings, such as rheumatology clinics, and are therefore limited by the number of recruited patients and biased by the severity/type of patients referred. It has been shown that using linked health data is an economical and effective alternative to performing de novo longitudinal studies, including for rare conditions 24 25 . We used routinely held data from primary and secondary care sources to examine the epidemiology of people with a diagnostic code for EDS/JHS in Wales. We then conducted a nested case-control study to examine the number of diagnoses across all body/disease systems and prescription usage, in order to test the widespread belief that these conditions are primarily musculoskeletal in nature, rather than multi-system disorders. 26 .

Cohort preparation
We identified Welsh residents with a Read Version 2 27 diagnostic code of EDS or JHS in primary care data or ICD-10 diagnostic codes 28 in secondary care data (hospital admissions) between 01/07/1990 (or the start of the dataset if later) and 30/06/2017. This date marks the end of maximum data coverage across all datasets. The EDS subclassification in Read Version 2 contains some, but not all, of the subtypes which were in use prior to 1997 and as a result, the reliability of any subtype data must be highly questionable (see Table 1). ICD-10 codes do not distinguish between any subtypes of EDS (see Table 1). Only ALFs with good matching status were included in the study, i.e. direct match on either NHS number or on surname, first name, postcode, date of birth and gender; or fuzzy matching with a probability of >= 90%. We created one dataset for diagnoses in the GP data and another for diagnoses in the hospital data. Both datasets were linked to the week of birth, gender and date of death information in WDS on their ALF and then combined to create a cohort of people with EDS/JHS in either GP or hospital data, identifying any duplications and keeping the earliest diagnosis date for any individual appearing in both datasets.
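The combination step described above (merging the GP and hospital datasets and keeping the earliest diagnosis date per individual) can be sketched as follows. This is a minimal illustration in Python, not the DB2 SQL used within SAIL; the function name, the identifiers and the dates are all hypothetical.

```python
from datetime import date

def combine_earliest(gp_records, hospital_records):
    """Merge (alf, diagnosis_date) pairs from both sources,
    keeping the earliest diagnosis date for each ALF."""
    earliest = {}
    for alf, diagnosed_on in list(gp_records) + list(hospital_records):
        if alf not in earliest or diagnosed_on < earliest[alf]:
            earliest[alf] = diagnosed_on
    return earliest

# Hypothetical identifiers and dates, for illustration only.
gp = [("A001", date(2005, 3, 14)), ("A002", date(2011, 7, 2))]
hospital = [("A002", date(2009, 1, 20)), ("A003", date(2015, 11, 30))]

cohort = combine_earliest(gp, hospital)
for alf in sorted(cohort):
    print(alf, cohort[alf].isoformat())
```

In this toy input, the individual appearing in both sources is counted once, with the earlier (hospital) date retained.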

Analysis
Data linkage and data preparation within the SAIL databank were conducted using IBM DB2 10.5 SQL. Data were then imported into R (Version 3.4.1) 29 , which was used for all statistical analyses. The mean age at first diagnosis between male and female subjects was compared and confidence intervals of the difference calculated.
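As an illustration of the comparison of mean age at first diagnosis, the sketch below computes a difference in means with an approximate 95% confidence interval. It assumes unequal variances and a normal approximation (z = 1.96), which is reasonable only for large samples; the exact procedure used in R is not stated in this paper, and the ages shown are invented.

```python
import math
import statistics

def mean_difference_ci(group_a, group_b, z=1.96):
    """Difference in means (a - b) with an approximate CI,
    using unpooled sample variances and a normal approximation."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return diff, diff - z * se, diff + z * se

# Invented ages at first diagnosis, for illustration only.
female_ages = [16, 22, 31, 19, 27]
male_ages = [7, 9, 12, 6, 11]

diff, lower, upper = mean_difference_ci(female_ages, male_ages)
print(f"difference: {diff:.2f} years (95% CI: {lower:.2f} to {upper:.2f})")
```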
The denominator for the diagnosed prevalence and incidence of EDS and JHS in secondary care was the total number of individuals with recorded gender who were registered and living in Wales between 01/07/1990 and 30/06/2017, calculated for each full year of the study. The denominator for prevalence and incidence in primary care was further restricted to people living in Wales whose GP practice was contributing data to SAIL. The primary and secondary care estimates were then added together to create an overall estimate of the prevalence and incidence in Wales.

Case-Control Comparison
A nested case control method was used. Each case was matched to 4 controls with the same gender and similar age profiles (within 45 days of the week of birth). We implemented strict criteria for selection to the case-control cohort. Both cases and controls had to (a) have uninterrupted GP registrations for 1 year before and 1 year after the date of the relevant diagnosis (or died during follow-up); (b) be registered with a GP submitting data to SAIL either at the matching date or afterwards; (c) have been registered with a GP that consistently recorded data across their patient profile. The latter avoids diagnoses that were retrospectively entered for a time period when the GP practice did not fully implement the use of electronic records (less than 10% of the data they recorded during 2009). Although this reduced the number of cases and controls we were able to analyse, it avoids data quality bias, especially during the early years of this study, when GPs were converting to the use of computer systems and databases. Controls with any type of diagnosed hereditary connective tissue disorder were excluded. Preliminary analysis of the combined cohort indicated that adjustment for deprivation was not necessary (i.e. equal distribution of people across deprivation quintiles). We then calculated odds ratios between cases and controls using Read chapters. All results that affected at least 5 cases or 20 controls were visualised using forest plots.
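To make the odds ratio calculation concrete, the sketch below computes an odds ratio with a 95% confidence interval from a 2x2 table for a single Read chapter. It is a simplified, unmatched calculation using the standard log (Woolf) method; the paper does not state which estimator was used, and a matched design might instead call for conditional methods. The function name and all counts are hypothetical.

```python
import math

def odds_ratio_ci(cases_with, cases_without, controls_with, controls_without, z=1.96):
    """Unmatched odds ratio with a Woolf (log-based) confidence interval."""
    estimate = (cases_with * controls_without) / (cases_without * controls_with)
    se_log = math.sqrt(
        1 / cases_with + 1 / cases_without + 1 / controls_with + 1 / controls_without
    )
    low = math.exp(math.log(estimate) - z * se_log)
    high = math.exp(math.log(estimate) + z * se_log)
    return estimate, low, high

# Hypothetical counts: 120 of 1,000 cases and 200 of 4,000 controls
# have at least one diagnosis in a given Read chapter.
or_estimate, ci_low, ci_high = odds_ratio_ci(120, 880, 200, 3800)
print(f"OR {or_estimate:.2f} (95% CI: {ci_low:.2f} to {ci_high:.2f})")
```

An interval that excludes 1 would correspond to the "significantly higher odds" reported for a category.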

Ethical approval
The study design uses anonymised data and therefore the need for ethical approval and participant consent was waived by the approving Institutional Review Board, the UK National Health Service Research Ethics Committee. The SAIL independent Information Governance Review Panel (IGRP) approved the study.

Patient and Public Involvement
Two of the authors of this paper have been diagnosed with symptomatic joint hypermobility disorders. Because this study used routinely collected data, we were not able to involve members of the public, but we will be disseminating our findings widely, including directly to patients via social media and through our links with patient organisations.
(70%) of those diagnosed with EDS/JHS were female (see Figure 1).

EDS/JHS in Hospital Data
A total of 1,298 individuals were found in the hospital data of whom 970 (75%) were female: 745 (57%) had a diagnosis of JHS and 553 (43%) EDS (see Figure 1). 5,355 (89%) of the cases could be found in the primary care data with the remainder in the hospital cohort. Combining the results from primary and secondary care led to a cohort of 6,021 distinct individuals. 5,064 (84%) were coded with JHS and 957 (16%) with EDS. 4,244 (70%) of patients were female. The age at first diagnosis peaked in the age group 5-9 years for males and 15-19 years for females (see Figure 2). There was a significant difference of 8.5 years in the mean age of diagnosis between males and females (95% CI: 7.70 to 9.22): 9.6 years in EDS (95% CI: 6.85 to 12.31) and 8.3 years in JHS (95% CI: 7.58 to 9.11). 72% of males were diagnosed during childhood (age < 18 years) in contrast to only 41% of females.

Demographics of combined EDS/JHS cohort
2016/17 is the latest year for which we have complete data and for which we could therefore derive prevalence. During this year, 2,668,902 people were registered with a GP in Wales submitting data to SAIL, of whom 4,598 had a diagnostic code of EDS/JHS which first appeared in the primary care data (172 in 100,000). A further 711 people out of the 3,239,153 registered with any GP in Wales during 2016/17 had an EDS/JHS diagnosis which first appeared in secondary care data (22 in 100,000). The rate of coded diagnoses increased throughout the period. Assuming that the GP data are representative of the whole of Wales, this leads to a combined point prevalence of 194 in 100,000 at the end of the study period. This corresponds to about 10 cases in a practice of 5000 patients (see Figure 3). The incidence of EDS/JHS over this time period is shown in Supplement Figure 1.
2,597 cases had good GP data coverage at the age of diagnosis and could be matched by age and gender with controls (see Figure 1). 1,340 cases (male: 561; female: 779) were first diagnosed before the age of 18 years and 1,254 cases (male: 229; female: 1,025) above this age. The people in the nested case-control cohort were slightly older than the overall cohort (data not shown here).
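The combined point prevalence for 2016/17 reported in this section can be reproduced from the stated counts. The short sketch below simply repeats that arithmetic (primary care rate plus secondary care rate, each per 100,000); the helper name is hypothetical and the code is illustrative rather than part of the original analysis.

```python
def per_100k(cases, population):
    """Rate per 100,000 people."""
    return cases / population * 100_000

# Counts as reported for 2016/17.
primary = per_100k(4598, 2_668_902)    # first coded in primary care
secondary = per_100k(711, 3_239_153)   # first coded in secondary care
combined = primary + secondary

# Expected number of cases in a practice of 5000 patients.
cases_per_practice = combined * 5000 / 100_000

print(f"primary: {primary:.1f}, secondary: {secondary:.1f}, combined: {combined:.1f} per 100,000")
print(f"cases in a practice of 5000: {cases_per_practice:.1f}")
```

The sum rounds to 194.2 per 100,000, which corresponds to a little under 10 cases in a practice of 5000 patients.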

Factors associated with JHS/EDS
Looking at the time frame of 1 year either side of the first coded diagnosis of EDS/JHS amongst young people (age < 18 years) there were significantly more additional diagnoses in 16 out of 20 Read code disease categories compared with their controls (see Figure 4a).

Discussion
This work examined the epidemiology of EDS and JHS and found a combined diagnosed prevalence of 194.2 per 100,000 (0.19%), or roughly 1 in 500 people in Wales (these diagnoses correspond to hEDS or HSD within the 2017 classification). We found a steadily increasing rate of diagnosis over the past 27 years (see Supplement Figure 1), as well as higher rates of diagnoses for other conditions and of prescriptions in most categories within 12 months (pre and post) of the recorded first diagnosis. This suggests that hEDS/HSD, when considered together, do not meet the definition of rare conditions 23 , and have widespread effects across multiple body systems.
It is well-known that EDS is poorly recognised in children 30 31 . Furthermore, children with hEDS often present with symptoms that can lead to a misdiagnosis of mental illness or consideration of child abuse 12 32 . Suspicion of abuse has been shown to be extremely damaging to the mental health of the parent(s) and can lead to an avoidance of accessing health care or other public services, such as schools 33 . The prolonged and sometimes traumatic diagnosis and/or misdiagnosis process in EDS can lead to further disengagement with services 34 . The lack of a timely diagnosis has great implications for disease management and progression and impedes the appropriate consideration of surgical interventions 7 35-38 as well as pregnancy and birth planning 17 .

Strengths and Limitations
The strength of this study is that we were able to combine diagnostic codes from several primary and secondary health care providers to create a large cohort of individuals with EDS/JHS. We have 27 years of data with at least 11 years of very good data coverage in the key datasets, and coverage improves further with each data update of the SAIL databank; however, data coverage for the first couple of years is less comprehensive.
The majority of subjects were identified via their primary care data, which is both a strength and a weakness. As 89% of cases were identified through primary care data, studies not using primary care data may underestimate the prevalence of hEDS/HSD. We are unable to quantify how many people are suffering from hEDS or HSD but remain undiagnosed.
However, we cannot comment on the reliability of the diagnoses in the primary care dataset.
It is also likely that the majority of cases were not actually diagnosed in primary care, but their entries were created through secondary care contacts, such as outpatient appointments or musculoskeletal assessment clinics, but coded data are lacking from these sources.
Although a snapshot of Read chapters codes that are more prevalent in our JHS/EDS cohort does not allow us to look at specific diagnoses and prescriptions, they can all be matched to conditions associated with EDS/JHS in the literature, for instance pain, fatigue, cardiovascular, gastrointestinal and gynaecological disorders, dysautonomia, mast cell activation as well as urinary tract infections 7 .
We conclude that EDS/HSD are not rare conditions and are associated with significantly increased odds of additional diagnoses and use of medications across many body systems.
There is a large gender difference in the age of diagnosis, with many women not diagnosed until adulthood. Early diagnosis, however, is crucial to patients, the administration of preventive therapies, the investigation of comorbid conditions and the overall management process. Further research is needed to understand patient pathways, comorbidities and progression of associated symptoms and diseases. Health services should be aware of these findings for the provision of training, diagnostic and treatment services for the many tens of thousands of patients living with these life-changing conditions throughout the United Kingdom and beyond.
Dissemination statement: We are planning to disseminate our results to patient groups using social media.
Data sharing statement: The data used in this study are available in the SAIL databank at Swansea University, Swansea, UK. All proposals to use SAIL data are subject to review by an independent Information Governance Review Panel (IGRP). Before any data can be accessed, approval must be given by the IGRP.
The IGRP gives careful consideration to each project to ensure proper and appropriate use of SAIL data. When access has been granted, it is gained through a privacy-protecting safe haven and remote access system referred to as the SAIL Gateway. SAIL has established an application process to be followed by anyone who would like to access data via SAIL https://www.saildatabank.com/application-process

Cohort preparation
Due to the lack of available correct sub-codes for EDS subtypes, the fact that the overwhelming majority of patients simply had the header code (86% of those coded as EDS, with a further 12% coded as hEDS), and that other EDS types are genuinely rare, all codes for EDS were combined.

Analysis
Odds ratios between cases and controls were calculated using Read chapters (excluding the Read codes for EDS and JHS).



Factors associated with JHS/EDS
People who were diagnosed as adults (age >= 18 years) also had significantly more diagnoses in 16 out of 20 Read code categories than their controls (see Figure 4b).

Discussion
It is well-known that EDS is poorly recognised in children 30 31 , and initial symptoms and EDS-associated diagnoses can appear to be simply a 'normal' pattern of childhood illness when taken as an isolated event. It is perhaps only in stepping back to look at the pattern of effects across

Strengths and Limitations
We hope in future work to examine in greater detail these findings of significant differences between people with hEDS/HSD and others in order that we can better understand the nature of this condition, as well as potentially improving diagnostic recognition. Having created this case-control cohort, further examination is made simpler as this first step has already been made.

Funding
This work was supported by The Farr Institute. The Farr Institute was supported by a 10- Foundation (BHF) and the Wellcome Trust.
The above funders played no role in the study design, in the collection, analysis, or interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The researchers are independent from the funders and all authors, external and internal, had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Castori et al showed that patients may move from the HSD category into hEDS over time: they also emphasised that the approach to management and the prognosis in terms of disability are the same 44 . One may therefore conclude that health needs across these groups are similar.
Presented are all results that affect at least 5 cases or 20 controls (perinatal conditions,
Presented are all results that affect at least 5 cases or 20 controls (incontinence and stoma appliances, chapters q and s, are not shown as neither young people nor adults had the required minimum number of cases/results).

Supplement Figures
Supplement Figure 1: Incidence of diagnosis of JHS/EDS in GP, hospital inpatient and combined data over time.
Gynaecological 15 16 and obstetric 17 issues are also reported in this population. There is also an emerging link with the potentially life-threatening condition of Mast Cell Activation 18 19 . Emerging evidence suggests that nutritional deficiencies 20 21 may play a key role: they appear to be more prevalent in these patients and may be implicated in the development of some of the complications.
Early diagnosis is found to be crucial to patients 22 to enable the provision of appropriate treatment, as well as to prevent later onset complications 7 . Establishing the diagnosis of EDS/HSD is often problematic for patients, which interferes with the early detection, treatment and prevention of further escalations of recognised symptoms, disability and more elaborate complications. A mean of 14 years elapses between the first clinical manifestations and the actual diagnosis 23 . For 25% of patients this delay lasts over 28 years 23 . "A misdiagnosis was given to 56% of patients [resulting in] inappropriate treatment in 70% of the patients… For 86% of the patients, the delay in diagnosis was considered responsible for deleterious consequences." 23(p.137) It is possible that some of these difficulties arise from the widespread belief amongst clinicians that EDS is rare.

It is therefore of clinical importance to establish better estimates of current prevalence. Conventional studies tend to be based in restricted clinical settings, such as rheumatology clinics, and are therefore limited by the number of recruited patients and biased by the severity/type of patients referred. It has been shown that using linked health data is an economic and effective alternative to performing de novo longitudinal studies, including for rare conditions 24 25 .

We used routinely held data from primary and secondary care sources to examine the epidemiology of people with a diagnostic code for EDS/JHS in Wales. We then conducted a nested case-control study of the number of diagnoses across all body/disease systems and of prescription usage, in order to test the widespread belief that these conditions are primarily musculoskeletal in nature, rather than multi-system disorders.

Cohort preparation
We identified Welsh residents with a Read Version 2 27 diagnostic code of EDS or JHS in primary care data or ICD-10 diagnostic codes 28 in secondary care data (hospital admissions) between 01/07/1990 (or the start of the dataset if later) and 30/06/2017. The latter date marks the end of maximum data coverage across all datasets.

The EDS subclassification in Read Version 2 contains some, but not all, of the subtypes which were in use prior to 1997 and, as a result, the reliability of any subtype data must be highly questionable (see Table 1). All codes for EDS were combined for three reasons: correct sub-codes for EDS subtypes were not available; the overwhelming majority of patients simply had the header code (86% of those coded as EDS, with a further 12% coded as hEDS); and other EDS types are genuinely rare. ICD-10 codes do not distinguish between any subtypes of EDS (see Table 1).

Only ALFs with good matching status were included in the [...] We created one dataset for diagnoses in the GP data and another for diagnoses in the hospital data. Both datasets were linked on their ALF to the week of birth, gender and date of death information in WDS and then combined to create a cohort of people with EDS/JHS in either GP or hospital data, identifying any duplications and keeping the earliest diagnosis date for any individual appearing in both datasets.
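The combination step described above (one record per individual, keeping the earliest diagnosis date across GP and hospital data) can be illustrated with a minimal Python sketch. This is not the study's DB2 SQL; the function name, the record layout of (ALF, diagnosis date) pairs and the example values are all hypothetical and serve only to show the deduplication logic.

```python
from datetime import date

def combine_cohorts(gp_records, hospital_records):
    """Merge (alf, diagnosis_date) pairs from two sources.

    Returns a dict mapping each ALF to (earliest_date, source), so an
    individual present in both sources appears once, with the earliest date.
    Hypothetical helper: the study itself used SQL within SAIL.
    """
    earliest = {}
    for source, records in (("gp", gp_records), ("hospital", hospital_records)):
        for alf, diagnosis_date in records:
            current = earliest.get(alf)
            # Keep this record only if the ALF is new or the date is earlier.
            if current is None or diagnosis_date < current[0]:
                earliest[alf] = (diagnosis_date, source)
    return earliest

# Illustrative data only (not taken from the study).
gp = [("A1", date(2005, 3, 1)), ("A2", date(2010, 6, 15))]
hospital = [("A2", date(2008, 1, 20)), ("A3", date(2012, 9, 5))]
cohort = combine_cohorts(gp, hospital)
```

In this toy example, individual "A2" appears in both sources and is retained once, with the earlier hospital date.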

Analysis
Data linkage and data preparation within the SAIL databank were conducted using IBM DB2 10.5 SQL. Data were then imported into R (Version 3.4.1) 29 , which was used for all statistical analyses. The mean age at first diagnosis between male and female subjects was compared and confidence intervals of the difference calculated.
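As an illustration of the comparison of mean age at first diagnosis, the following Python sketch computes a difference in means with an approximate 95% confidence interval. It assumes unequal variances and a normal approximation (z = 1.96); the paper does not state which method was used in R, and the function name and ages below are hypothetical.

```python
import math
from statistics import mean, variance

def mean_difference_ci(group_a, group_b, z=1.96):
    """Difference in means (a - b) with an approximate confidence interval.

    Uses the unpooled standard error and a normal approximation, which is
    an assumption of this sketch and reasonable only for large samples.
    """
    diff = mean(group_a) - mean(group_b)
    # Sample variances (n - 1 denominator) divided by group sizes.
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return diff, diff - z * se, diff + z * se

# Illustrative ages at first diagnosis (not study data).
female_ages = [25, 30, 35, 40]
male_ages = [10, 15, 20, 25]
diff, lower, upper = mean_difference_ci(female_ages, male_ages)
```

With cohorts of several thousand people, as in this study, the normal approximation and a t-based interval would be expected to give very similar results.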
The denominator for the diagnosed prevalence and incidence of EDS and JHS in secondary care was the total number of individuals with a recorded gender who were registered and living in Wales between 01/07/1990 and 30/06/2017, calculated for each full year of the study. The denominator for prevalence and incidence in primary care was further restricted to people living in Wales whose GP practice was contributing data to SAIL. The prevalence and incidence in primary and secondary care were then added together to create an overall estimate of the prevalence and incidence in Wales.
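The arithmetic behind a rate per 100 000 and its translation into expected cases per practice can be sketched as follows. The function names are hypothetical; the only figure taken from the paper is the reported point prevalence of 194.2 per 100 000, which corresponds to roughly 10 cases in a practice of 5000 patients.

```python
def rate_per_100k(cases, population):
    """Cases per 100 000 people for a given denominator population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000

def expected_cases(rate, list_size):
    """Expected number of cases in a practice, given a rate per 100 000."""
    return rate / 100_000 * list_size

# Reported point prevalence from the paper, applied to a list of 5000 patients.
cases_in_practice = expected_cases(194.2, 5000)
```

Here `cases_in_practice` is approximately 9.71, which the paper rounds to about 10 cases per practice of 5000.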

Case-Control Comparison
A nested case control method was used. Each case was matched to 4 controls with the same gender and similar age profiles (within 45 days of the week of birth). We implemented strict criteria for selection to the case-control cohort. Both cases and controls had to (a) have uninterrupted GP registrations for 1 year before and 1 year after the date of the relevant diagnosis (or died during follow-up); (b) be registered with a GP submitting data to SAIL either at the matching date or afterwards; (c) have been registered with a GP that consistently recorded data across their patient profile. The latter avoids diagnoses that were retrospectively entered for a time period when the GP practice did not fully implement the use of electronic records (less than 10% of the data they recorded during 2009). Although this reduced the number of cases and controls we were able to analyse, it avoids data quality bias, especially during the early years of this study, when GPs were converting to the use of computer systems and databases.

Controls with any type of diagnosed hereditary connective tissue disorder were excluded. Preliminary analysis of the combined cohort indicated that adjustment for deprivation was not necessary (i.e. equal distribution of people across deprivation quintiles). We then calculated odds ratios between cases and controls using Read chapters (excluding the Read codes for EDS and JHS). This method counts the number of people with a code in each category; multiple codes for the same person in the same category are therefore not included. All results that affected at least 5 cases or 20 controls were visualised using forest plots.
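The person-level counting and odds ratio calculation described above can be sketched in Python. This is an assumption-laden illustration: it computes an unmatched odds ratio with a Woolf (log-scale) 95% confidence interval, whereas the paper does not specify the estimator used and a matched analysis might differ. Function names, chapter labels and counts are hypothetical.

```python
import math

def people_with_code(records, chapter):
    """Count distinct people with at least one code in a chapter.

    `records` is a list of (person_id, chapter) pairs; repeated codes for
    the same person in the same chapter are counted once.
    """
    return len({person for person, ch in records if ch == chapter})

def odds_ratio(case_yes, case_total, control_yes, control_total, z=1.96):
    """Unmatched odds ratio with a Woolf confidence interval."""
    a, b = case_yes, case_total - case_yes
    c, d = control_yes, control_total - control_yes
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

def meets_display_threshold(case_yes, control_yes):
    """Reporting rule from the text: at least 5 cases or 20 controls."""
    return case_yes >= 5 or control_yes >= 20

# Illustrative records and counts (not study data).
case_records = [("c1", "N"), ("c1", "N"), ("c2", "N"), ("c3", "E")]
n_cases_in_chapter = people_with_code(case_records, "N")
estimate, lower, upper = odds_ratio(40, 100, 80, 400)
```

In the toy records, person "c1" has two codes in chapter "N" but contributes only once to the count.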

Ethical approval
The study design uses anonymised data and therefore the need for ethical approval and participant consent was waived by the approving Institutional Review Board, the UK National Health Service Research Ethics Committee. The SAIL independent Information Governance Review Panel (IGRP) approved the study.

Patient and Public Involvement
Two of the authors of this paper have been diagnosed with symptomatic joint hypermobility disorders. As this study used routinely collected data, we were not able to involve members of the public, but we will be disseminating our findings widely, including directly to patients via social media and through our links with patient organisations.

[...] (70%) of those diagnosed with EDS/JHS were female (see Figure 1).

EDS/JHS in Hospital Data
A total of 1,298 individuals were found in the hospital data, of whom 970 (75%) were female: 745 (57%) had a diagnosis of JHS and 553 (43%) EDS (see Figure 1).

[...] (70%) of patients were female. The age at first diagnosis peaked in the age group 5-9 years for males and 15-19 years for females (see Figure 2). There was a significant difference of 8 [...] study period. This corresponds to about 10 cases in a practice of 5000 patients (see Figure 3). The incidence of EDS/JHS over this time period is shown in Supplement Figure 1.

Factors associated with JHS/EDS
2,597 cases had good GP data coverage at the age of diagnosis and could be matched by age and gender with controls (see Figure 1). 1,340 cases (male: 561; female: 779) were first diagnosed before the age of 18 years and 1,254 cases (male: 229; female: 1,025) above this age. The people in the nested case-control cohort were slightly older than the overall cohort (data not shown here).
Looking at the time frame of 1 year either side of the first coded diagnosis of EDS/JHS, young people (age < 18 years) had significantly more additional diagnoses in 16 out of 20 Read code disease categories than their controls (see Figure 4a). People who were diagnosed as adults (age >= 18 years) also had significantly more diagnoses in 16 out of 20 Read code categories than their controls (see Figure 4b).

Discussion
It is well-known that EDS is poorly recognised in children 30 31 and initial symptoms and EDS-associated diagnoses can appear to be simply a 'normal' pattern of childhood illness when taken as an isolated event. Furthermore, children with hEDS often present with symptoms that can lead to a misdiagnosis of mental illness or consideration of child abuse 12 32 . Suspicion of abuse has been shown to be extremely damaging to the mental health of the parent(s) and can lead to an avoidance of accessing health care or other public services, such as schools 33 . The prolonged and sometimes traumatic diagnosis and/or misdiagnosis process in EDS can lead to further disengagement with services 34 . The lack of a timely diagnosis has great implications for disease management and progression and impedes the [...] 17 . It is perhaps only in stepping back to look at the pattern of effects across multiple body systems that practitioners might begin to consider a connective tissue disorder.

Strengths and Limitations
The strength of this study is that we were able to combine diagnostic codes from several primary and secondary health care providers to create a large cohort of individuals with EDS/JHS. We have 27 years of data with at least 11 years of very good data coverage in the key datasets, which further improves with each data update of the SAIL databank. However, data coverage for the first couple of years is less comprehensive.
The majority of subjects were identified via their primary care data, which is both a strength and a weakness. As 89% of cases were identified through primary care data, studies not using primary care data may underestimate the prevalence of hEDS/HSD. We are unable to quantify how many people are suffering from hEDS or HSD but remain undiagnosed. However, we cannot comment on the reliability of the diagnoses in the primary care dataset. It is also likely that the majority of cases were not actually diagnosed in primary care and that their entries were instead created through secondary care contacts, such as outpatient appointments or musculoskeletal assessment clinics, although coded data are lacking from these sources.
Although a snapshot of the Read chapter codes that are more prevalent in our JHS/EDS cohort does not allow us to look at specific diagnoses and prescriptions, they can all be matched to conditions associated with EDS/JHS in the literature, for instance pain, fatigue, cardiovascular, gastrointestinal and gynaecological disorders, dysautonomia, mast cell activation as well as urinary tract infections 7 . It needs to be stressed that these results exclude codes for EDS/JHS and that these are not part of the results for congenital anomalies or musculoskeletal conditions. We hope in future work to examine in greater detail these findings of significant differences between people with hEDS/HSD and others in [...] 39 , and is less likely to be due to a higher rate of use of chemotherapeutic agents.
We conclude that EDS/HSD are not rare conditions and are associated with significantly increased odds of additional diagnoses and use of medications across many body systems.

Results
Participants (item 13)
STROBE: (a) Report the numbers of individuals at each stage of the study (e.g., numbers potentially eligible, examined for eligibility, confirmed eligible, included in the study, completing follow-up, and analysed) (b) Give reasons for nonparticipation at each stage. (c) Consider use of a flow diagram
RECORD 13.1: Describe in detail the selection of the persons included in the study (i.e., study population selection) including filtering based on data quality, data availability and linkage. The selection of included persons can be described in the text and/or by means of the study flow diagram.
Location in manuscript: Results p. 9-12, Figure 1

Descriptive data (item 14)
STROBE: (a) Give characteristics of study participants (e.g., demographic, clinical, social) and information on exposures and potential confounders (b) Indicate the number of participants with missing data for each variable of interest (c) Cohort study - summarise follow-up time (e.g., average and total amount)
Location in manuscript: a) p. 9-12 b) only exact matches, cannot identify missing data c) NA