Original research
Using natural language processing to extract self-harm and suicidality data from a clinical sample of patients with eating disorders: a retrospective cohort study
  1. Charlotte Cliffe1,2,
  2. Aida Seyedsalehi2,
  3. Katerina Vardavoulia2,
  4. André Bittar2,
  5. Sumithra Velupillai2,
  6. Hitesh Shetty2,
  7. Ulrike Schmidt1,2,
  8. Rina Dutta1,2
  1. 1South London & Maudsley, NHS Foundation Trust, London, UK
  2. 2Institute of Psychiatry, Psychology and Neuroscience, Kings College London, London, UK
  1. Correspondence to Dr Charlotte Cliffe; charlotte.cliffe{at}


Objectives The objective of this study was to determine risk factors for those diagnosed with eating disorders who report self-harm and suicidality.

Design and setting This study was a retrospective cohort study within a secondary mental health service, South London and Maudsley National Health Service Trust.

Participants All patients with an F50 diagnosis of eating disorder from January 2007 to September 2019 were included.

Intervention and measures Electronic health records (EHRs) for these patients were extracted and two natural language processing tools were used to determine documentation of self-harm and suicidality in their clinical notes. These tools were validated manually for attribute agreement scores within this study.

Results The attribute agreements for precision of positive mentions of self-harm were 0.96 and for suicidality were 0.80; this demonstrates a ‘near perfect’ and ‘strong’ agreement and highlights the reliability of the tools in identifying the EHRs reporting self-harm or suicidality. There were 7434 patients with EHRs available and diagnosed with eating disorders included in the study from the dates January 2007 to September 2019. Of these, 4591 (61.8%) had a mention of self-harm within their records and 4764 (64.0%) had a mention of suicidality; 3899 (52.4%) had mentions of both. Patients reporting either self-harm or suicidality were more likely to have a diagnosis of anorexia nervosa (AN) (self-harm, AN OR=3.44, 95% CI 1.05 to 11.3, p=0.04; suicidality, AN OR=8.20, 95% CI 2.17 to 30.1; p=0.002). They were also more likely to have a diagnosis of borderline personality disorder (p≤0.001), bipolar disorder (p<0.001) or substance misuse disorder (p<0.001).

Conclusion A high percentage of patients (>60%) diagnosed with eating disorders report either self-harm or suicidal thoughts. Relative to other eating disorders, those diagnosed with AN were more likely to report either self-harm or suicidal thoughts. Psychiatric comorbidity, in particular borderline personality disorder and substance misuse, was also associated with an increased risk of self-harm and suicidality. Therefore, risk assessment among patients diagnosed with eating disorders is crucial.

  • eating disorders
  • suicide & self-harm
  • biotechnology & bioinformatics
  • epidemiology

Data availability statement

Data are available on reasonable request. Data are available on request due to privacy/ethical considerations. The data are made available under specific governance requirements: researchers need to have a contract with the South London and Maudsley NHS Trust, which can be applied for relevant research studies. Each research project is reviewed by a service-user led oversight committee of the National Institute of Health Research Biomedical Research Centre. On request, and after appropriate arrangements, the data and modelling employed in this study can be viewed within the secure system firewall.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • The size of the cohort is over 7400 patients.

  • Long period of follow-up (12.5 years).

  • Limited number of study designs (most cross-sectional) reporting on suicidal behaviour among those with eating disorders (EDs).

  • The tools used to detect self-harm and suicidality are not able to consider the temporality in relation to the ED diagnosis; therefore, the suicidal behaviour could have been detected prior to diagnosis.

  • The clinical records are routine clinical data not primarily collected for research and therefore rely on clinician documentation.


Patients diagnosed with eating disorders (EDs), including anorexia nervosa (AN), bulimia nervosa (BN) and ED not otherwise specified (EDNOS) (the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) now refers to ‘otherwise specified feeding or ED’; but the studies and data included in this paper used the DSM-IV equivalent term of EDNOS), are at a greater risk of mortality compared with the general population.1 2 A major contribution to this increased mortality rate is the higher risk of completed suicide in patients with EDs.3 Individuals with a lifetime diagnosis of AN and BN are 18 and 7 times more likely to die from suicide compared with age-matched general population controls, respectively.4 5 Those with a diagnosis of EDNOS are four times more likely to complete suicide.6 Therefore, given the elevated risk of suicide among patients diagnosed with EDs, it is of utmost importance that factors associated with this risk are determined.7

Self-harm (SH) and suicidal ideation (SUI) are both strong predictors of subsequent suicide.8 SH can be defined as ‘self-injurious behaviour characterised by deliberate harm to the body in the absence of an intent to die’9 and SUI can be defined as ‘thoughts about killing oneself, which may or may not include a plan’.10 A common antecedent of completed suicide in the general population is previous SH, with up to 60% of people who complete suicide having previously self-harmed, the majority within 1 year prior to the attempt.11 12 Lifetime SUI is also associated with attempted suicide (up to 30%); those with a plan have an increased risk of completed suicide (up to 55%) and the majority of attempts occur within the first year of the onset of SUI.13 Therefore, identifying patients who report either lifetime SUI or SH is clinically important for recognising those at risk of later suicide.

Previous studies have demonstrated the association between suicidality, SH and EDs.14–17 Our previous study focusing on suicide attempts, demonstrated the cumulative 10-year incidence of suicide attempts in a population of patients with EDs as 6.8%.17 Rates of SH have been reported as high as 42% for AN, up to 55% for BN18 and 26% for EDNOS.19 A recent meta-analysis summarised that 22% of patients with AN and 33% of patients with BN reported lifetime SH.20

Studies have reported mixed findings in terms of suicide attempts across ED diagnostic categories,21–24 with many showing no difference in suicide attempts between ED subtypes, some demonstrated higher rates of suicide attempts and SH in AN compared with BN17 23 25 26 and others reported more frequent suicide attempts and ideation in BN compared with AN.24 27 Furthermore, binge ED (BED), a relatively new diagnostic category, has also been associated with increased suicidality.22 In other studies, it appears that binge eating and purging are particularly associated with increased risks of attempted suicide, due to their association with impulsivity.26 28 Some of these heterogeneous findings have been attributed to differences in patient settings (outpatient or inpatient),21 diagnostic subtyping (eg, restricting vs binge-purging AN)28 or the methods used for determining suicide attempts.26

Some studies have focused on risk factors for developing suicidal behaviour among those with EDs. A number of risk factors have been identified, such as younger age of ED onset, specific personality traits, comorbid disorders, negative life events and substance misuse.17 26 29 However, a number of past studies are limited by low numbers of patients with suicidal behaviour within the study population, resulting in low power.5 One way to address this problem is to use longitudinal psychiatric case records, such as electronic health records (EHRs), which can capture a large enough population manifesting suicidal behaviour to ensure sufficient power.30

The increasing use of EHRs in hospital care systems, alongside the growth of health informatics allows us to develop computational tools that can analyse these large clinical datasets.31 Natural language processing (NLP) tools allow us to determine information about symptomatology from information written in free-text EHRs.32 Previous research has shown that using NLP applications increases the positive predictive value (PPV) of detecting patient-level suicidality.33 This is of particular use for suicidal behaviour, as both positive and negated mentions of suicidality and SH are routinely reported within free text during psychiatric assessments and follow-up.31 34 35

The aim of this study was to evaluate two NLP tools, one that identifies mentions of SH,36 and the other that identifies suicidality35 for a cohort of ED patients. To achieve this, we compared the performance of the NLP tools against a gold-standard set of manually annotated documents, using previously defined coding rules. We then used the tools to identify positive mentions of either SH or suicidality on a patient level, to evaluate the incidence of SH and suicidality in patients diagnosed with EDs over a 12-year period.


Study design and setting

This study is a retrospective cohort study using data obtained from South London and Maudsley National Health Service Foundation Trust (SLaM). This is a mental health service serving an estimated population of 2 million residents of southeast London. Patients come from the London boroughs of Croydon, Southwark, Lambeth, Lewisham, Bromley, Bexley and Greenwich. SLaM has had fully electronic records since 2006 and the National Institute for Health Research funded Biomedical Research Centre supports the infrastructure for rendering its anonymised records available for research. We analysed the data as ‘event notes’ in the EHRs, irrespective of whether they were created during an inpatient stay, during follow-up or a telephone appointment.

Patient and public involvement

No patient involved.

Inclusion criteria and exposures

The analysed cohort was extracted via the Clinical Record Interactive Search (CRIS) system37 and comprised individuals who received an International Classification of Diseases, 10th Revision (ICD-10)38 diagnosis of an ED (F50.0–F50.9) within the 12-year observation period of 1 January 2007 to 30 September 2019. These patients were identified using two data sources available within the EHRs. The first was structured information on diagnosis from drop-down fields in the source record. The second was structured variables that are routinely extracted from open-text fields using a bespoke algorithm generated by the Generalised Architecture for Text Engineering software.39 The comorbidity exposures of interest were diagnoses of substance misuse (F10–F19), bipolar disorder (F31), anxiety disorders, depression (F32 and F33) and personality disorder (PD) (F60), determined by structured information on the EHRs in the drop-down fields in the source record.

Primary outcomes

The outcomes of interest were a patient reporting at least one positive mention of SH or one positive mention of suicidality. Information on these outcomes was extracted using NLP applications that have been previously developed and used within similar datasets.31 34 35 The first application used rule-based linguistic processing to identify positive mentions of SH in clinical texts; this included historic and current episodes, but did not include SH ideation. The second application, also rule-based and using lexical resources, included SUI of both a passive and active nature; both of these were recorded as a binary outcome. A detailed description of the development and evaluation of both NLP tools used to identify mentions of SH and suicidality is given in previous studies.35 36 40
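The patient-level outcome described above (at least one positive mention across all of a patient's documents) can be sketched as follows. This is a minimal illustration only: the label strings and the `patient_level_outcome` helper are hypothetical and are not the output format of the NLP applications used in the study.

```python
def patient_level_outcome(mentions):
    """Collapse document-level labels into one binary flag per patient.

    `mentions` is an iterable of (patient_id, label) pairs. The label values
    ("positive", "negated", "irrelevant") are assumed names for illustration.
    A patient is flagged 1 if any document carries a positive mention.
    """
    flags = {}
    for patient_id, label in mentions:
        # Make sure every patient appears, even with no positive mention.
        flags.setdefault(patient_id, 0)
        if label == "positive":
            flags[patient_id] = 1
    return flags


# Invented example records, not study data.
example = [
    ("p1", "negated"),
    ("p1", "positive"),
    ("p2", "negated"),
    ("p3", "irrelevant"),
]
outcome = patient_level_outcome(example)
```

Because a single positive mention is enough to set the flag, negated or irrelevant mentions elsewhere in the record never reset it.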

Workflow for validating the NLP tools

Figure 1 shows the workflow for validating the NLP tools to determine the primary outcomes. All F50 diagnoses between 1 January 2007 and 31 March 2019 were included in the validation; this period of time was 6 months shorter than the final analysis due to the lag time between the validation and final statistical analysis. In total, 7188 patients met the inclusion criteria, of which 6972 had at least one EHR document available. Overall, 1 054 640 documents were available for these patients. For all 6972 patients, the NLP tools were used to search for mentions of both suicidality and SH. In total, 5456 patients had positive mentions of either SH or SUI, 4741 had any mention of SH, 4528 had any mention of SUI, and 3813 patients had both SH and SUI mentioned. Manual annotations were compared with the NLP tool annotations and attribute agreements were calculated.41
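The patient counts in the paragraph above are related by inclusion-exclusion: patients with either outcome equal those with SH plus those with SUI minus those with both. A short check of that arithmetic, using only the figures quoted in the text (the helper name `union_count` is ours):

```python
def union_count(n_a, n_b, n_both):
    """Size of the union of two groups, given each group and their overlap."""
    return n_a + n_b - n_both


# Figures quoted in the validation workflow paragraph.
sh_patients = 4741
sui_patients = 4528
both_patients = 3813

either_patients = union_count(sh_patients, sui_patients, both_patients)
```

The result agrees with the 5456 patients reported as having a positive mention of either SH or SUI.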

Figure 1

Workflow for validation of both NLP tools. NLP, natural language processing; DSH/SH, self-harm; SUI, suicidal ideation.

From these patients, a sample of documents was randomly extracted. This was achieved by first restricting the patients to those who had a number of EHR documents within the first and third quartiles, to eliminate outliers with very few documents or with excessive documentation. This resulted in 2923 patients in total with positive mentions of either SH or SUI (135 317 documents), 2431 patients with a positive mention of SH (114 962 documents), 2294 patients with a positive mention of SUI (110 399 documents) and 1802 patients with a positive mention of both SH and SUI (90 044 documents). Each patient had a minimum of 17 documents and maximum of 99 documents.
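One way to express the restriction to patients whose document count falls between the first and third quartiles is sketched below. The text does not state how the quartiles were computed or whether the bounds were inclusive, so the inclusive quantile method and the inclusive comparison here are assumptions, and `restrict_to_interquartile` is a hypothetical helper.

```python
import statistics


def restrict_to_interquartile(doc_counts):
    """Keep patients whose document count lies between Q1 and Q3.

    `doc_counts` maps patient_id -> number of EHR documents.
    Quartiles use statistics.quantiles with the "inclusive" method
    (an assumption; the source does not specify the method).
    """
    q1, _median, q3 = statistics.quantiles(
        doc_counts.values(), n=4, method="inclusive"
    )
    return {pid: n for pid, n in doc_counts.items() if q1 <= n <= q3}


# Invented counts for illustration, not study data.
doc_counts = {"a": 1, "b": 17, "c": 40, "d": 60, "e": 99, "f": 5000}
kept = restrict_to_interquartile(doc_counts)
```

Patients with very sparse or very extensive records ("a" and "f" in this toy input) fall outside the quartile bounds and are dropped before sampling.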

A randomised sample of 500 documents was taken for manual review: 100 with a positive mention of suicidality only, 100 with a positive mention of SH only, 100 with a mention of both SH and suicidality and 200 with no mention of either. Three manual coders, including one clinically trained psychiatrist (CC, AS and SV), were assigned either suicidality (AS, 400 documents), SH (SV, 400 documents) or both (CC, 500 documents) for review. The sets were independently classified with 300 of them crossing over and classified by all three authors.
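The composition of the 500-document review set (100 suicidality only, 100 SH only, 100 both, 200 neither) amounts to a stratified random draw. A minimal sketch, assuming documents have already been grouped by NLP output; the stratum names, the seed and `draw_review_sample` are illustrative and do not describe the authors' actual sampling code.

```python
import random

# Stratum sizes as stated in the text; the keys are hypothetical names.
STRATA_SIZES = {"sui_only": 100, "sh_only": 100, "both": 100, "neither": 200}


def draw_review_sample(docs_by_stratum, sizes=STRATA_SIZES, seed=0):
    """Draw a fixed number of documents from each stratum without replacement."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    sample = []
    for stratum, k in sizes.items():
        sample.extend(rng.sample(docs_by_stratum[stratum], k))
    return sample


# Invented document identifiers for illustration.
pool = {name: [f"{name}_{i}" for i in range(300)] for name in STRATA_SIZES}
review_set = draw_review_sample(pool)
```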

For the suicidality documents, two coders (CC and AS) independently labelled each document as suicidal, non-suicidal or uncertain. Inter-rater agreement was measured using Cohen’s kappa and the F1 statistic on a document level to determine inter-rater reliability.41 Any discrepancies were discussed and clarified to develop a ‘gold standard’ set of documents. The same principle was applied to mentions of SH within the documents, determined by two coders (CC and SV). Each mention of SH within the document was coded as positive or negative and as relevant or non-relevant. For example, a positive code indicates that the note refers to an act of SH by the individual, whereas a negative code indicates a denial or negated act of SH. If the mention concerned a friend or family member rather than the patient, it was coded as non-relevant (see figure 1).
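For readers unfamiliar with Cohen's kappa, the statistic compares observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below uses the standard formula on invented labels; it is not the authors' code, and the label abbreviations are ours.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same documents."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of documents")
    n = len(rater_a)
    # Proportion of documents on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels: s = suicidal, n = non-suicidal, u = uncertain.
coder_1 = ["s", "s", "n", "n", "u", "s", "n", "n", "s", "n"]
coder_2 = ["s", "s", "n", "n", "n", "s", "n", "s", "s", "n"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy example the coders agree on 8 of 10 documents, chance agreement is 0.45, and kappa is therefore (0.80 - 0.45) / (1 - 0.45), roughly 0.64.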

Testing the algorithms

The performance of each NLP tool was tested by comparing the output of the application against the ‘gold-standard’ set of manual annotations and calculating precision (PPV) and recall (sensitivity) statistics. Good inter-rater agreement between the NLP output and gold standard was indicated by a Cohen’s kappa of 0.80 for identifying both suicidality and SH. Scores >0.80 demonstrate a ‘strong’ level of agreement and reliable data, scores >0.90 are ‘almost perfect’ agreement and scores >0.60 were considered ‘moderate’ in agreement.41
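Precision (PPV), recall (sensitivity) and F1 can be computed from a document-level comparison of tool output against the gold standard, as sketched below with invented labels. The `agreement_band` helper encodes the thresholds quoted above; treating the boundaries as inclusive (so that 0.80 counts as 'strong', as in the reported results) and the name of the lowest band are assumptions on our part.

```python
def precision_recall_f1(gold, predicted, positive="positive"):
    """Document-level precision, recall and F1 for one target label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def agreement_band(score):
    """Map an agreement score to the bands quoted in the text."""
    if score >= 0.90:
        return "almost perfect"
    if score >= 0.80:
        return "strong"
    if score >= 0.60:
        return "moderate"
    return "below moderate"  # label for the lowest band is our own


# Invented labels for illustration, not study data.
gold = ["positive", "positive", "positive", "negative",
        "negative", "positive", "negative", "positive"]
tool = ["positive", "positive", "negative", "negative",
        "positive", "positive", "negative", "positive"]
precision, recall, f1 = precision_recall_f1(gold, tool)
```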


The year and month of birth, gender, ethnicity, deprivation score and marital status were retrieved from the CRIS database. Age in years was calculated from the individual’s first ED diagnosis in the observation window or from January 2007 if the diagnosis preceded the observation period. We used the ‘multiple deprivation score’ which is a small-area-level measure of socioeconomic status, based on the individual’s address closest to the diagnosis of the ED in the observation window, covering seven components: employment, income, education, health, barriers to housing and services, crime and the living environment with specific weightings. The index of multiple deprivation is a well-established measure that has been widely used as a regional indicator for socioeconomic status in previous studies; the scores are transformed into percentiles (1–100) with higher scores indicating greater deprivation. The deprivation score was grouped into tertiles (33rd percentiles) and converted into a categorical variable. Previous studies have used this method of categorical definition using the same data source.2
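The conversion of the deprivation percentile into a three-level categorical variable might look like the sketch below. The cut points at the 33rd and 66th percentiles and the category names are assumptions for illustration; the text does not specify whether tertiles were defined by fixed percentile cut points or by the cohort's own distribution.

```python
def deprivation_tertile(percentile):
    """Assign a deprivation percentile (1-100) to one of three groups.

    Higher percentiles indicate greater deprivation. Cut points of 33 and 66
    are assumed, not taken from the source.
    """
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    if percentile <= 33:
        return "least deprived"
    if percentile <= 66:
        return "middle"
    return "most deprived"


# Invented percentiles for illustration.
groups = [deprivation_tertile(p) for p in (5, 33, 34, 66, 67, 100)]
```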

Statistical analysis

Analysis was completed using Stata (version 13) software. All patients were eligible for analysis. Descriptive statistics were used to characterise the patients. Logistic regression was used to calculate odds ratios (ORs) with 95% CIs, with SH or suicidality as the outcome and the comorbid psychiatric diagnoses as exposures. ED diagnoses were categorised into AN (both restricting and purging types), BN and all other F50 diagnoses. For those with multiple diagnoses, a diagnostic hierarchy of AN>BN>other was used. The observation period started from the first date of diagnosis, or 1 January 2007 if the diagnosis was made prior to this date, and ended on 30 September 2019 (this was 6 months longer than the validation period of data extraction). Univariable logistic regression was used to estimate the effect of the primary ED diagnosis, demographic characteristics and psychiatric comorbidities on each of the outcomes of interest (SH and SUI). Next, multivariable analyses were performed to calculate the adjusted OR and 95% CI for each comorbid psychiatric diagnosis, while controlling for demographics and the ED diagnosis.
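Two steps of the analysis above lend themselves to a short illustration: assigning a primary diagnosis with the AN, then BN, then other hierarchy, and computing an unadjusted OR with a 95% CI. For a single binary exposure, the OR from univariable logistic regression equals the cross-product ratio of the 2×2 table, and the Wald interval uses the standard error of the log OR. The sketch uses Python rather than the statistical package used in the study, the function names are hypothetical, and the counts are invented rather than taken from the cohort.

```python
import math

# Diagnostic hierarchy described in the text, highest priority first.
HIERARCHY = ("AN", "BN", "other")


def primary_diagnosis(diagnoses):
    """Return the highest-ranked ED category among a patient's diagnoses."""
    for label in HIERARCHY:
        if label in diagnoses:
            return label
    raise ValueError("no recognised ED category")


def odds_ratio_wald(exp_cases, exp_noncases, unexp_cases, unexp_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table."""
    a, b, c, d = exp_cases, exp_noncases, unexp_cases, unexp_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented inputs for illustration only.
assigned = primary_diagnosis({"BN", "other", "AN"})
or_, ci_low, ci_high = odds_ratio_wald(40, 10, 20, 30)
```

Small cells inflate the standard error of the log OR, which is why sparse comorbidity groups produce wide intervals. The adjusted estimates in the multivariable models require a fitted regression and cannot be reproduced from a 2×2 table.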


Descriptive statistics

Table 1 summarises the different types of ED diagnosis by age. The mean age was 26.0 (SD 11; range 10–90).

Table 1

Summary of all diagnoses by age group (11 patients had no detailed information about the diagnosis other than ‘F50’)

SH and suicidality among patients

The attribute agreements for the final corpus of documents on SH and suicidality are displayed below in table 2. The three attributes are ‘positive’, that is, there is a mention of either SH or suicidality; ‘negative or non’, that is, there is a denial of SH or suicidality; and ‘relevant’, that is, the mention is relevant to the patient and not a family member or friend. A summary of those reporting SH or suicidality by age is displayed in table 3.

Table 2

Attribute agreements

Table 3

Self-harm and suicidality reported among patients by age

SH reported among patients with EDs

Patients who reported SH (past or present) were more likely to be younger in age (OR=0.98, 95% CI 0.97 to 0.98; p<0.001), less likely to be female (OR=0.67, 95% CI 0.58 to 0.79; p<0.001), more likely to be of white ethnicity (OR=1.40, 95% CI 1.10 to 1.78; p=0.006) and more likely to have a diagnosis of AN (OR=3.44, 95% CI 1.05 to 11.3; p=0.04). They were also more likely to have a comorbid diagnosis, in particular a diagnosis of borderline PD (BPD; OR=54.2, 95% CI 24.2 to 121.4; p<0.001), bipolar disorder (OR=9.57, 95% CI 5.57 to 15.4; p<0.001) and substance misuse (OR=7.22, 95% CI 2.94 to 18.3; p<0.001), as displayed in table 4.

Table 4

Univariable logistic regression to determine the effect of demographics, primary ED diagnosis and psychiatric comorbidities on risk of self-harm

Suicidality reported among patients with EDs

Patients who reported suicidality were more likely to be younger (OR=0.98, 95% CI 0.97 to 0.99; p<0.001), of white ethnicity (OR=1.59, 95% CI 1.23 to 2.10; p<0.001), less likely to be married or with a partner (OR=0.76, 95% CI 0.65 to 0.90; p=0.001) and more likely to have a diagnosis of AN (OR=8.20, 95% CI 2.17 to 30.1; p=0.002). They were also more likely to have a comorbid diagnosis, in particular BPD (OR=26.2, 95% CI 14.4 to 47.7; p<0.001), bipolar disorder (OR=9.31, 95% CI 5.31 to 16.3; p<0.001) and alcohol misuse (OR=6.59, 95% CI 3.56 to 12.2; p<0.001), as seen in table 5.

Table 5

Univariable logistic regression to determine the effect of demographics, primary ED diagnosis, and psychiatric comorbidities on risk of suicidality

Multivariable analysis of the effect of comorbid psychiatric diagnoses on SH and suicidality

When adjusting for demographics and the primary ED diagnosis, depression, bipolar disorder, other PD, substance misuse and alcohol use disorder remained significantly associated with suicidal behaviour. However, after adjusting for the demographics, BPD remained associated only with SH (OR=2.84, 95% CI 0.84 to 9.68, p=0.09) and not with suicidality (OR=1.52, 95% CI 0.51 to 4.50, p=0.45). Anxiety disorders remained associated with suicidality (OR=1.93, 95% CI 1.01 to 3.69, p=0.05) but not SH (OR=1.47, 95% CI 0.81 to 2.65, p=0.20), as shown in table 6A,B.

Table 6

(A) Multivariable logistic regression examining the association between psychiatric comorbidities and self-harm; adjusted for demographics and ED diagnosis


Accuracy of the NLP output

The attribute agreements for precision of positive mentions of SH were >0.90 and for suicidality were >0.80; when compared with manual annotations,41 this demonstrates ‘near perfect’ and ‘strong’ agreement, respectively, supporting the validity of the tools. However, negative polarity appeared less accurate for both tools, which demonstrates that the NLP tools were better at picking up positive and relevant mentions of both SH and suicidality within the clinical notes than negative mentions. This is likely due to errors in the linguistic pre-processing needed to identify negation. As we are relying on at least one positive mention to ascertain those with any past or current history of suicidal behaviour, this is unlikely to significantly impact the validity of the results.

Discussion of clinical findings

This study highlights the high lifetime prevalence (>60%) of both SH and suicidality reported among those diagnosed with EDs in both inpatient and outpatient settings. One explanation for the high rates of suicidal behaviour is that patients with EDs are at an increased risk of psychiatric comorbidities,1 2 particularly mood disorders, substance misuse and PDs.29 42 It is well documented that patients with comorbidities are more likely to SH and attempt suicide.43 44 However, studies have demonstrated that even when adjusted for comorbid disorders, the risk of suicidal behaviour remains higher in patients with EDs than in the general population and comorbid disorders just elevate that risk further.17 42 45

In our study, psychiatric comorbidity was associated with increased suicidal behaviour. In particular, BPD was associated with highly elevated odds of SH and suicidality, prior to adjustment. When adjusted, BPD increased the odds of SH, but interestingly not suicidality; although this adjusted association could reflect a lack of statistical power, as the cell size was small and CIs wide. This is consistent with previous studies as BPD presents with emotional dysregulation and impulsivity; associated with SH and ED symptoms such as bingeing or purging.18 46 Furthermore, psychotherapies aimed at supporting those diagnosed with BPD and SH have been shown to be effective at also supporting patients with a diagnosis of ED.47 48

Similarly, those with a diagnosis of alcohol or substance misuse had elevated odds of reporting SH and suicidality. Substance and alcohol misuse are associated with impulsivity; impulsivity is associated with behaviours such as bingeing and purging and suicidal behaviour,49–51 which has been shown to increase risk of completed suicide.52 53 Bipolar disorder was also significantly associated with a fivefold increase in odds of suicidal behaviour when adjusted for demographics and the primary ED diagnosis. This is consistent with previous studies demonstrating an increased risk of hospitalised suicide attempts in ED patients with bipolar disorder compared with those without.17

Relative to BN and other EDs, AN presented with the highest risk of suicidal behaviour, particularly suicidality. This is consistent with previous studies reporting a higher prevalence of suicide attempts and completed suicide in individuals with AN compared with those with BN or other EDs.5 17 23 However, it is important to consider the number of studies reporting that suicidal behaviour is most prevalent in BN.24 51 54 One explanation for the difference between our results and the above findings is that the current study used a diagnostic hierarchy of AN>BN>EDNOS to assign a primary ED diagnosis to patients; we know there is a well-established diagnostic crossover between EDs, with 50% of patients initially diagnosed with AN being rediagnosed with BN or AN-binge purge subtype.55 Evidence also indicates that individuals experiencing diagnostic crossover may be at particularly elevated risk of suicidality.56 Therefore, there could be a subtype of particular interest; future investigations should focus on diagnostic flux and whether the suicidal behaviour risk correlates with fluctuating ED symptoms.26

This study highlights the importance of further understanding the shared mechanisms for suicidal behaviour and ED diagnosis. There are various explanations that have been hypothesised for the high risk of SH and suicidality; some studies have suggested there are shared genetic factors predisposing to both conditions.57 58 Others suggest that emotional dysregulation is associated with EDs and others demonstrate that adjusting for comorbid psychiatric disorders weakens any association.22 57 58 Increased pain tolerance and fearlessness for death are other hypotheses for the increased risk among patients diagnosed with EDs.59 The interpersonal theory of suicide describes that a higher lethality attempt requires both a desire for death and capability for suicide; capability of suicide has been theorised as developing after gradual chronic exposure to painful ED behaviours and habituation to fear and pain.60 61 Therefore, extreme restrictive eating may differentiate AN from other EDs, increasing the capability of both SH and suicidality.61

Strengths and limitations

The main strengths of this study are the size of the cohort (>7400), the longitudinal study design and long period of time for follow-up (12.5 years), facilitated by the use of the CRIS database. There is currently a limited body of research on correlates and risk factors for suicidal behaviour among ED patients and previous studies have small numbers and a high usage of cross-sectional studies as well as studies at risk of reporting bias.26 The NLP approach used to extract clinician documentation of SH and suicidality from narrative text in EHRs reduces the risk of reporting bias and allows access to detailed clinical information that would not be available from EHR structured fields.30 35

The main limitation of this study is that the tools were not able to consider the timing of reported suicidality or SH relative to the ED diagnosis. Therefore, it is possible the reported suicidal behaviour was prior to ED diagnosis; an improvement of the NLP tool would be to include temporality to understand specific time periods of risk for SH or reported suicidality. Another consideration is that due to changing diagnostic codes between the follow-up period of 2007–2020 and the introduction of the ICD-11 codes of BED, we had to include all EDs aside from AN and BN into one heterogeneous group of diagnoses ‘Other EDs’. This was needed to ensure consistency over the time period and to avoid the problem of small group sizes in the regression analysis. Furthermore, given that EHRs include routine clinical data not primarily collected for research purposes, the study relies on clinician documentation which could include non-grammatical errors, jargon and idiosyncratic abbreviations; all of these could increase the chance of NLP misclassification.35 However, this was mitigated by using all documents available for each patient. Therefore, there were multiple opportunities to capture suicidality information to compensate for lack of sensitivity of the tool. Finally, the data rely on recording of suicidality and SH following a clinical encounter. This is likely to result in some heterogeneity at a document level, as some healthcare professionals may be more likely to discuss or record SH or suicidal thoughts depending on their level of experience, clinical background or their prior knowledge of the patient. However, as there only needed to be one positive mention of SH or one positive mention of suicidality, at a patient level, the threshold was low for detection of either outcome.

Clinical and research implications

This study highlights the importance of risk assessment screening in all patients diagnosed with EDs, with a particular emphasis on those diagnosed with AN and ED patients with comorbid psychiatric diagnoses. This study also highlights the potential use of EHR databases to further suicidality and SH research by using NLP techniques. With further development, these tools could potentially be used in risk prediction within ED services; their use alongside clinician-reported decisions could help predict future suicidal behaviour in ED patients.13 30

Ethics statements

Patient consent for publication

Ethics approval

The CRIS database has received ethical approval for secondary analysis: Oxford REC C, reference 18/SC/0372.



  • Twitter @charlot1903, @rina_dutta

  • Contributors CC led the project, conducted the data analysis and wrote the final manuscript; RD supervised and supported with the title, analysis and final manuscript. US contributed to the final manuscript and provided topic expertise. AB, SV and HS conducted data extraction and analysis. AS and KV conducted data analysis and supported the final manuscript. CC is the guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors. RD was funded by a Clinician Scientist Fellowship (research project e-HOST-IT) from the Health Foundation in partnership with the Academy of Medical Sciences. This work was supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.
