Objective Our primary objective was to identify cognitive behavioural therapy (CBT) delivery for people with psychosis (CBTp) using an automated method in a large electronic health record database. We also examined what proportion of service users with a diagnosis of psychosis were recorded as having received CBTp within their episode of care during defined time periods provided by early intervention or promoting recovery community services for people with psychosis, compared with published audits and whether demographic characteristics differentially predicted the receipt of CBTp.
Methods Both free text using natural language processing (NLP) techniques and structured methods of identifying CBTp were combined and evaluated for positive predictive value (PPV) and sensitivity. Using inclusion criteria from two published audits, we identified anonymised cross-sectional samples of 2579 and 2308 service users respectively with a case note diagnosis of schizophrenia or psychosis for further analysis.
Results The method achieved PPV of 95% and sensitivity of 96%. Using the National Audit of Schizophrenia 2 criteria, 34.6% of service users were identified as ever having received at least one session and 26.4% at least two sessions of CBTp; these are higher percentages than previously reported by manual audit of a sample from the same trust that returned 20.0%. In the fully adjusted analysis, CBTp receipt was significantly (p<0.05) more likely in younger patients, in white and other when compared with black ethnic groups and in patients with a diagnosis of other schizophrenia spectrum and schizoaffective disorder when compared with schizophrenia.
Conclusions The methods presented here provide a potential approach for evaluating delivery of CBTp on a large scale, providing more scope for routine monitoring, cross-site comparisons and the promotion of equitable access.
- cognitive behavioural therapy
- CBT for psychosis
- electronic health records
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
Key strengths of this study were the large sample and the innovative approaches adopted to identify cognitive behavioural therapy for psychosis (CBTp) delivery within the clinical record.
The ability to replicate the inclusion criteria of two previous audits also allowed us to contextualise the findings, and the large data set allowed access to data by year and also to examine clinical and demographic factors influencing delivery, identifying inequalities in access that are not detectable in smaller samples.
The use of routine data and automated ascertainment provides the scope for more in-depth evaluation of real-world treatment delivery and success, and the wider use of other EHR-derived data to investigate predictors of treatment receipt and outcome.
A limitation of this study was that it took place in a single (although large) service provider; however, our results have identified themes that are consistent with other findings in relation to CBTp provision.
This approach does not provide an assessment of quality of treatment, its specific therapeutic focus or its duration.
This approach does not identify offers of CBTp that are not taken up.
Pharmacotherapy as monotherapy for people with a diagnosis of psychosis or schizophrenia is no longer regarded as optimal treatment. The implementation of cognitive behavioural therapy for psychosis (CBTp) is of international concern and relevance,1 and CBTp, given its evidence base, is recommended in many countries including Australia and New Zealand,2 Canada,3 Spain4 and the USA.5 This paper is focused on the provision of CBTp at a single UK site, but the challenges associated with monitoring and improving the implementation of CBT for service users with psychosis have international relevance. For England and Wales, the NICE national guideline recommends that psychological therapies, in particular CBTp and family intervention, should be offered to all people with a diagnosis of schizophrenia and their carers.6 However, within the UK, service users, charities such as Rethink,7 policy makers and audits8,9 have repeatedly reported that only a small proportion of people are accessing these treatments. For example, the Schizophrenia Commission reported that only about 10% of service users access CBTp.10 To address these concerns, the Department of Health and National Health Service (NHS) England are undertaking various initiatives, including the Improving Access to Psychological Therapies for Severe Mental Illness11 programme and the new Early Intervention Access and Waiting Time initiative,12 both of which aim to drive up access. However, one area of uncertainty that will limit evaluation of progress is whether we have accurate baseline estimates of current levels of provision.
A recent national audit (National Audit of Schizophrenia 2 (NAS2))13 taking a random sample of 100 service users with a diagnosis of schizophrenia or schizoaffective disorder in the community in each of 64 participating mental health trusts or health boards in England and Wales concluded that there are significant gaps in the availability of CBTp and family interventions. For example, this manual case note audit found that trusts reported that on average 39% of service users had been offered CBTp and 19% of service users had taken up CBTp. However, there are grounds for thinking that the NAS2 audit might be inaccurate. The audit provided no definition or criteria for psychological therapy provision, asked whether a service user had ever been offered or received therapy and was based on reports by consultant psychiatrists. The audit report noted that responses probably encompassed a broader set of interventions than covered by the NICE recommendations. In contrast, a detailed manual survey of a random sample of 187 records reported a very much lower rate of offers (6.9%) and delivery (6.4%) of CBTp,14 employing expert reviews of reported therapy record content within a 1-year period in one large mental health trust.
Manually conducted audits of case notes and electronic records, such as NAS2, requiring individual responses of health professionals, are a labour intensive way of establishing these data, limit the number of cases that can reasonably be investigated and are too cumbersome to use routinely as practical tools to monitor service-level implementation. The UK’s national minimum data set15 does not currently require interventions to be recorded, although this may change. Although in the South London and Maudsley NHS Foundation Trust (SLaM), the site for this analysis, a structured drop-down record for psychological interventions in electronic records is available, there is concern that, as non-mandatory, it is incomplete and unreliable as a means to monitor activity.
In the current study we therefore sought to develop a method of using automated text-based searches of clinical records using natural language processing (NLP) techniques, supplemented by information from structured fields, to investigate how much this might enhance our ability to provide accurate routine automatic data reports and analysis, and thus provide an efficient method of monitoring the implementation of psychological therapy provision, overcoming the limitations of manual case note audits. The decision to focus initially on CBTp delivery instead of CBTp offer was a pragmatic one based on the perceived complexity and the resultant time required for each project.
The primary research question of the study was whether we could identify, with sufficiently high positive predictive value (PPV) and sensitivity, CBTp delivery using free text and structured methods in a large electronic service user record database. We also examined how many, and what proportion of, service users with a case note diagnosis of schizophrenia or psychosis, selected according to the inclusion and exclusion criteria employed in published audits, were recorded in the CRIS database as having received CBTp within their episode of care during defined time periods, combining NLP and structured records. We then compared these data with the results of two published audits. Finally, we examined whether demographic characteristics differentially predicted the receipt of CBTp.
SLaM is a large provider of mental healthcare, serving a catchment of around 1.3 million residents in four boroughs of south London (Croydon, Lambeth, Lewisham and Southwark). The majority of people with a diagnosis of a schizophrenia spectrum disorder are served by Early Intervention (EI) teams for the first 3 years from initial presentation and by Promoting Recovery (PR) teams subsequently.
Source of clinical data
The data for this study were obtained from the SLaM Biomedical Research Centre (BRC) Case Register and its Clinical Record Interactive Search (CRIS) application,16 which accesses anonymised data from the electronic health records (EHR) of individuals who have previously received or are currently receiving mental healthcare from SLaM within a robust, service user-led governance framework.17 At the time of writing, this is over 265 000 service user records. We used CRIS to replicate the inclusion criteria of NAS2 and Haddock et al 14 as means of comparison with these two published audits. The SLaM BRC Case Register contains structured fields, such as those coding demographic information, as well as unstructured (but de-identified) free text fields from case notes and correspondence where history, mental state examination, diagnostic formulation and management plan are primarily recorded. The CRIS data resource has been approved for secondary analysis by the Oxfordshire Research Ethics Committee,18 and a service user-led oversight committee considers all proposed research before access to the anonymised data is permitted. The EHR system was implemented in SLaM services in April 2006.
Overview of methodology
The initial step was to identify the delivery of CBT across all patient groups not distinguished by diagnostic groups or other characteristics and then subsequently, and as the specific focus of this study, to test the performance of the application for the delivery of CBT with a sample of service users with a diagnosis of psychosis (ie, ‘CBTp’).
Identification of CBT delivery using CRIS
NLP techniques19 were used to identify CBT delivery from free text fields within the BRC Case Register. The annotation strategy to identify whether a clinical record was an actual session of CBT was developed by three human annotators (CC, LE and MB), who also completed the initial feasibility work, which was signed off by an expert clinical lead (PAG). All annotations were double-annotated by two human annotators, and disagreements were resolved by consensus and by liaising with the clinical lead, if required. Inter-annotator agreement was evaluated after each batch of annotations, and the annotation strategy was updated according to issues raised and clarifications identified. Two annotators reviewed a training set of 300 instances in the development phase before annotating a gold standard data set of 200 instances in which the term ‘CBT’ (or a variant) occurred. Each instance was annotated as to whether the sentence that contained the term ‘CBT’ was an actual session of CBT rather than a historic reference to therapy, a referral for CBT, a decision not to offer CBT or another reference to CBT that was not a therapy session. When a positive instance of CBT delivery was identified, the following features were recorded: session number, stage of treatment, the recipient of treatment and whether the CBT was delivered individually or via a group. Once the human annotations were complete, the training set was reviewed by the NLP developer (DC) to establish the rules to determine whether the CBT text is an actual session or not. These rules were coded using General Architecture for Text Engineering (GATE) software.20 Within the development process, the impact of the rules applied to the training set was measured by the PPV and sensitivity. There is an inherent trade-off between the PPV and sensitivity (as one increases, the other tends to reduce), so a balance has to be struck according to which is more important for the problem domain.
We concluded that for this study an evenly weighted solution was preferred, with a slight preference for PPV. Prioritising PPV minimises false positives, which increases confidence that an instance flagged as positive is a true positive, at the expense of missing some true positive instances (false negatives, FNs).
When all the rules had been developed on the training set, the model was tested against an independent gold standard data set to evaluate how well it performed on unseen data, using PPV and sensitivity as the evaluation metrics. Once the mean of the PPV and sensitivity on the gold standard was greater than 85%, the resulting application was applied to the CRIS database. We then tested whether combining the NLP output with other relevant variables could improve the performance of the application: the professional group of the clinician who entered the clinical note, whether the clinical note was classified as a psychological therapy in the structured drop-down menu, and the position of the CBT reference in the clinical document.
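As an illustration of the two evaluation metrics and the acceptance threshold described above, the following is a minimal Python sketch. It is not the authors' GATE application; the class and function names and the example counts are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Counts from comparing application output with the human gold standard."""
    true_pos: int   # flagged as a CBT session and confirmed by annotators
    false_pos: int  # flagged as a CBT session but not confirmed
    false_neg: int  # confirmed CBT session that the application missed


def ppv(c: Counts) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return c.true_pos / (c.true_pos + c.false_pos)


def sensitivity(c: Counts) -> float:
    """Sensitivity: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def meets_threshold(c: Counts, threshold: float = 0.85) -> bool:
    """True when the mean of PPV and sensitivity exceeds the threshold."""
    return (ppv(c) + sensitivity(c)) / 2 > threshold


# Hypothetical counts for illustration only (not the study's data).
example = Counts(true_pos=90, false_pos=10, false_neg=10)
print(f"PPV={ppv(example):.2f}, sensitivity={sensitivity(example):.2f}, "
      f"accepted={meets_threshold(example)}")
```

Tightening a rule typically moves instances from the false positive count to the false negative count, which is the trade-off between PPV and sensitivity noted in the text.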
Identification of CBTp delivery using CRIS
The output of the CBT application was generated in a sample of service users with a current diagnosis of psychosis to evaluate whether the PPV and sensitivity were of an acceptable standard or whether a specific CBTp application would need to be developed.
Within SLaM, psychological interventions can be recorded through a drop-down box within the clinical record, but as a non-mandatory field, the recording was considered as potentially poor. To assess the quality and use of this field, a senior clinician completed an assessment of 100 documents where CBT was indicated within the drop-down box, identifying whether the text associated with the document could be confirmed as a session of CBT.
Both free text and structured methods of identifying CBT were combined to create a single set of results, which was used for analysis purposes. As the focus of this paper is to identify the delivery of CBT for patients with a diagnosis of psychosis, the term ‘CBTp’ is used from this point forward.
We used the CRIS database to generate two large participant samples in this study: one replicating the inclusion criteria and the sampling time frame employed by the NAS2 audit and a second that replicated the Haddock et al 14 audit inclusion criteria, allowing a comparison with each publicly available study.
1. NAS2 audit inclusion criteria
All individuals ‘active’ (ie, receiving services rather than discharged from care) for at least 12 months on 1 July 2013 aged over 18 years receiving either an EI or a PR service, with a recorded diagnosis of schizophrenia (F20.0–F20.9) or schizoaffective disorder (F25.0–F25.9). The NAS2 audit requested whether CBTp was ‘taken up’, and we examined this in two ways: service users with at least one session of CBTp and service users with at least two sessions of CBTp prior to the census date.
2. Haddock et al audit inclusion criteria
All individuals active between 1 July 2012 and 1 July 2013 aged over 18 years receiving either an EI or a PR service, with a recorded diagnosis of schizophrenia spectrum diagnosis (schizophrenia, schizoaffective, schizotypal and delusional disorders (F20.0–F29.9)). CBTp delivery was defined as at least one session of CBTp within the 12-month audit period.
In addition to the original timeframe, we resampled the data using the Haddock et al 14 inclusion criteria for a separate 12-month timeframe in 2015 to check the robustness of the findings related to health inequalities.
If patients met the inclusion criteria across multiple teams within the same service type, to avoid double counting, the episodes were merged by selecting the earliest episode start date and latest end date for those episodes and presented as a single episode of care.
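The merging rule described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the query used against CRIS: the `Episode` record and its field names are hypothetical, and treating an episode with no end date as still open (so that the merged episode is also open) is our assumption.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple, Optional


class Episode(NamedTuple):
    patient_id: str
    service_type: str     # e.g. "EI" or "PR"
    start: date
    end: Optional[date]   # None means the episode is still open (assumption)


def merge_episodes(episodes: list[Episode]) -> list[Episode]:
    """Collapse episodes per patient and service type into a single episode
    spanning the earliest start date and the latest end date."""
    groups: dict[tuple[str, str], list[Episode]] = defaultdict(list)
    for ep in episodes:
        groups[(ep.patient_id, ep.service_type)].append(ep)

    merged = []
    for (patient_id, service_type), eps in groups.items():
        start = min(ep.start for ep in eps)
        # If any contributing episode is open, the merged episode stays open.
        if any(ep.end is None for ep in eps):
            end = None
        else:
            end = max(ep.end for ep in eps)
        merged.append(Episode(patient_id, service_type, start, end))
    return merged
```

Grouping on both patient and service type keeps an EI episode and a PR episode for the same person separate, in line with the "within the same service type" condition.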
Demographic and service variables
The following variables were extracted for analyses: age, diagnosis, ethnicity, gender, marital status and service type. All data obtained were the most recent prior to the census date. Ethnicity was recorded according to categories defined by the UK Office for National Statistics and categorised for analysis purposes into three groups: black (comprising black African, black Caribbean and any other black background), other (comprising white and black African, white and Asian, white and black Caribbean, any other mixed background, Indian, Pakistani, Bangladeshi, any other Asian background, Chinese and any other ethnic group) and white (comprising white British, white Irish and any other white background). Marital status was aggregated into two groups: single/divorced (including dissolved civil partnerships and widowed) and married/cohabiting/civil partnerships. Diagnosis is routinely recorded in clinical services using the International Classification of Diseases, 10th revision (ICD-10) classification system in drop-down fields and was limited to schizophrenia spectrum (F20–F29), with an additional subgrouping applied in line with the NAS2 diagnostic categories of schizophrenia (F20.0–F20.9), schizoaffective disorder (F25.0–F25.9) and ‘other schizophrenia spectrum’ (F21, F22.0–F22.9, F23.0–F23.9, F24, F28 and F29). We used the largest sample (using the Haddock et al 14 inclusion criteria) to investigate the delivery of CBTp across the following categories: age group, diagnosis, gender, ethnic group, marital status and whether the patient was in contact with either the EI or PR service.
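The diagnostic subgrouping above amounts to a lookup on the ICD-10 code. A minimal Python sketch follows; the function name is hypothetical, and matching on the three-character stem (so that a bare "F20" is treated like F20.0–F20.9) is an assumption rather than something stated in the source.

```python
from typing import Optional

# Three-character ICD-10 stems that fall in 'other schizophrenia spectrum'.
OTHER_SPECTRUM_STEMS = {"F21", "F22", "F23", "F24", "F28", "F29"}


def diagnostic_group(icd10_code: str) -> Optional[str]:
    """Map an ICD-10 code to the diagnostic groups used in the analysis.

    Returns None for codes outside the schizophrenia spectrum (F20-F29).
    """
    stem = icd10_code.strip().upper().split(".")[0]
    if stem == "F20":
        return "schizophrenia"
    if stem == "F25":
        return "schizoaffective disorder"
    if stem in OTHER_SPECTRUM_STEMS:
        return "other schizophrenia spectrum"
    return None
```

The same pattern (a dictionary or set lookup returning one of a small number of labels) would apply to collapsing the recorded ethnicity categories into the black, other and white groups.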
Descriptive statistics for demographic variables are reported as means and SD for continuous variables (age at referral) and as frequencies and percentages for all other variables. A binary logistic regression model was used to examine the differences for proportions of cases who received CBTp and those who did not. We initially undertook an unadjusted analysis by age group, diagnosis, ethnicity, gender, marital status and service type to establish whether the receipt of CBTp differed by these demographic factors. We subsequently undertook a multivariable analysis, adjusting for potential confounders by including covariates (age, diagnosis, ethnicity, gender, marital status and service type) in the model except the variable of interest. Due to the relationship between age and service type (EI services are by definition for a younger patient group), we included the partially adjusted model that excludes service as a predictor to check whether the increased likelihood of younger people receiving CBTp is still present.
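For a single binary predictor, the unadjusted odds ratio from a binary logistic regression equals the cross-product ratio of the corresponding 2×2 table. The Python sketch below shows that calculation with a Wald-type 95% CI on the log scale. It is illustrative only: the function name and the counts are hypothetical, the study's estimates came from fitted regression models, and the adjusted analyses would need a statistics package rather than this shortcut.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> tuple[float, float, float]:
    """Odds ratio and 95% CI from a 2x2 table.

    a: group 1, received CBTp        b: group 1, did not receive CBTp
    c: reference group, received     d: reference group, did not receive
    All four cells must be non-zero.
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not the study's data).
or_, lo, hi = odds_ratio(20, 80, 10, 90)
print(f"OR={or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```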
PPV and sensitivity of identification of CBT in case records
The developed NLP CBT delivery application was evaluated against the independent gold standard resulting in PPV and sensitivity for CBT annotations of 85% and 86%, respectively. Following the development of the CBT NLP application, we concluded the PPV would be improved with a tolerable reduction in sensitivity if we applied the following postprocessing rule: to exclude CBT sentences that commenced after the first 200 characters of the clinical document. This postprocessing rule resulted in an improved overall performance of the application, with an increase in PPV of 12% to 97% and a reduction in sensitivity of 4% to 82%. The evaluation of the structured CBT entry alone resulted in a PPV of 89%. We then combined both methods, and a measure was adopted to establish the sensitivity of the combined method by reviewing the FNs from the NLP app and examining whether they were identified by the structured method: of the 12 FNs identified by the NLP app, 75% (9/12) were correctly identified by the structured data with the effect of increasing the sensitivity from 82% (56/68) for the NLP app alone to 96% (65/68) for the combined method. By combining methods, we therefore achieved a PPV of 97% and a sensitivity of 96%. The NLP app resulted in identifying 26% additional service users who received CBT not recorded by the drop-down box.
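The postprocessing rule and the combination of the two sources can be sketched as below. This is a simplified Python illustration, not the deployed application: the `Note` record and its field names are hypothetical, and treating character offsets as zero-based (so that "within the first 200 characters" means an offset below 200) is an assumption about the boundary.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SENTENCE_OFFSET = 200  # characters from the start of the clinical document


@dataclass
class Note:
    # Offset at which the NLP-flagged CBT sentence starts; None if nothing flagged.
    nlp_sentence_offset: Optional[int]
    # True when CBT was selected in the structured drop-down field.
    structured_cbt: bool


def nlp_positive(note: Note) -> bool:
    """Keep an NLP hit only if its sentence starts within the first 200 characters."""
    return (note.nlp_sentence_offset is not None
            and note.nlp_sentence_offset < MAX_SENTENCE_OFFSET)


def combined_positive(note: Note) -> bool:
    """A note counts as CBT delivery if either source identifies it."""
    return nlp_positive(note) or note.structured_cbt
```

Taking the union of the two sources is what allows the structured field to recover sessions that the stricter NLP rule discards.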
PPV and sensitivity of identification of CBTp in case records
We further evaluated the developed NLP CBT delivery application against a sample of service users with a diagnosis of psychosis. The performance against the independent gold standard resulted in PPV and sensitivity for CBTp annotations of 81% and 85%, respectively. Applying the above-mentioned postprocessing rule (to exclude CBTp sentences that commenced after the first 200 characters of the clinical document) resulted in an increase in PPV of 14% to 95% and a reduction in sensitivity of 7% to 78%. The evaluation of the structured CBT entry alone resulted in a PPV of 89%. Having combined both methods, of the 10 FNs identified by the NLP app, 80% (8/10) were correctly identified by the structured data, with the effect of increasing the sensitivity from 78% (36/46) for the NLP app alone to 96% (44/46) for the combined method. By combining methods, we therefore achieved a PPV of 95% and sensitivity of 96%. The NLP app resulted in identifying 21% additional service users who received CBTp not recorded by the drop-down box.
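The combined sensitivity figures reported for both samples follow directly from the counts given in the text. The short Python check below reproduces them; the function name is ours, and the counts are those reported above.

```python
from fractions import Fraction


def combined_sensitivity(nlp_true_pos: int, nlp_false_neg: int,
                         recovered_by_structured: int) -> Fraction:
    """Sensitivity after adding NLP false negatives that the structured
    field identified to the NLP true positives."""
    if recovered_by_structured > nlp_false_neg:
        raise ValueError("cannot recover more cases than were missed")
    total_true_sessions = nlp_true_pos + nlp_false_neg
    return Fraction(nlp_true_pos + recovered_by_structured, total_true_sessions)


# All-diagnosis CBT sample: 56/68 by NLP alone, 9 of 12 FNs recovered.
cbt = combined_sensitivity(56, 12, 9)
# Psychosis (CBTp) sample: 36/46 by NLP alone, 8 of 10 FNs recovered.
cbtp = combined_sensitivity(36, 10, 8)

print(f"CBT:  {float(cbt):.1%}")
print(f"CBTp: {float(cbtp):.1%}")
```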
Delivery of CBTp using sample based on NAS2 inclusion criteria
Two thousand three hundred and eight service users were identified in the data set as fulfilling the NAS2 inclusion criteria. Service users had a mean age of 40.7 at referral (SD 12.1; range 18–83), 60.3% (1392/2308) were male, 51.9% (1197/2308) were of a black ethnic origin and 34.6% (799/2308) were from a white ethnic origin, 90.7% (2094/2308) were single/divorced, 78.2% (1806/2308) had a diagnosis of schizophrenia and 21.8% (502/2308) had a diagnosis of schizoaffective disorder.
The SLaM return for the actual NAS2 audit was that 20% of the random sample of n=100 were identified as having ever received CBTp. In contrast, using the current method, 34.6% (799/2308) were identified as having at least one session and 26.4% (610/2308) were identified as having at least two sessions of CBTp. A breakdown of CBTp delivery by diagnostic group can be viewed in table 1.
We also explored the level of CBTp provision by year, which can be viewed in figure 1.
Delivery of CBTp using sample based on Haddock et al inclusion criteria
Two thousand five hundred and seventy-nine service users fulfilled the inclusion criteria within the same 12-month audit period. Service users had a mean age of 40 at referral (SD 12.4; range 18–83), 60.3% (1555/2579) were male, 50.9% (1314/2579) were of a black ethnic origin and 35.2% (908/2579) were from a white ethnic origin, 90.5% (2339/2579) were single/divorced, 70.0% (1806/2579) had a diagnosis of schizophrenia and 19.5% (502/2579) had a diagnosis of schizoaffective disorder. We found that 12.8% (330/2579) received CBTp interventions within the same 12-month audit period, whereas Haddock et al 14 reported 6.4% (12/187) in their sample.
We also examined a more recent time period: 2597 service users fulfilled the inclusion criteria within a 12-month audit period within 2015. Service users had a mean age of 39.6 at referral (SD 12.7; range 18–85), 60.4% (1568/2597) were male, 52.3% (1357/2597) were of a black ethnic origin and 32.1% (883/2579) were from a white ethnic origin, 90.5% (2351/2597) were single/divorced, 63.4% (1646/2597) had a diagnosis of schizophrenia and 20.0% (519/2597) had a diagnosis of schizoaffective disorder. We found that 14.8% (385/2597) received CBTp interventions within the 12-month audit period.
We additionally investigated the proportion of participants that received CBTp ‘year on year’, by checking if the participants who took part in the audit in 2015 also received CBTp in the 2013 audit. This check found that 13.8% (53/385) of the participants who received CBTp in 2015 had also received CBTp in 2013.
Demographic predictors of at least one session of CBTp
The demographic characteristics of service users who received CBTp were compared with those who did not using our largest sample of n=2579, which employed the Haddock et al 14 inclusion criteria. The receipt of CBTp was more common in younger service users, in the white compared with the black group, in those in the schizoaffective disorder group compared with those in the schizophrenia group, and in those receiving care from the EI for psychosis teams rather than the PR teams. Table 2 provides a summary of the unadjusted and adjusted multivariate logistic regression for receipt of CBTp by age group, diagnostic group, ethnic group, gender, marital status and service type.
We additionally explored the number and percentage of participants who received CBTp by the standard NHS 16 ethnic groups to further detail the ethnic composition and CBTp provision, which can be viewed in table 3.
To our knowledge, this is the first attempt at using NLP techniques on free text entries, supplementing structured fields, in order to identify the delivery of one type of psychological therapy in a large health record data set. This was broadly successful, in that we achieved a high level of PPV (95%) and sensitivity (96%) that is consistent with published CRIS NLP applications, which have measured other clinical activities or characteristics such as prescribed medication,21 Mini-Mental State Examination score,22 diagnosis23 and service user characteristics, such as smoking status24 and whether the service user lived alone.16 The methods presented here are therefore potentially effective and efficient for examining the delivery of CBTp on a large scale where manual audits are inevitably limited in sample size for logistical reasons.
NLP applications are increasingly being used to extract information from medical records for a wide range of health-related areas including but not limited to the detection of adverse drug events, falls, nosocomial infections,25–27 obesity status and obesity-related diseases28 29 and detecting patterns in patient care and patient treatment habits30 31 that highlight the potential for NLP to supplement other data collection methods. NLP applications for mental health services are less prominent, but there have been recent studies in the USA that used NLP to determine depression outcome, adverse drug reactions and characterisation of diagnostic profiles.32–34
When using this method, we identified higher levels of CBTp delivery than previously reported in the SLaM contribution to the NAS2 audit using the same sampling criteria but a very much larger sample. Note the published audits using NAS2 and Haddock et al 14 inclusion criteria differ on timeframe, diagnosis and interpretation of CBTp delivery. We also found higher levels of CBTp delivery (about double) than that reported by Haddock et al 14 in the same time period although in a different service setting. This suggests that manual audits may result in under-reporting, presumably because of the limitations of clinician knowledge or readily accessible recording in health records, and our development is encouraging because it may result in both better quality output and much less time-intensive data collection. It is notable that the NAS2 audit enquired whether CBTp had ever been provided: the methods described here can search by year, which is clinically more useful; the data also might suggest that clinicians in responding to such an audit are typically considering perhaps the previous 2 years. Furthermore, when we conducted the sampling twice for 2013 and 2015, we found some evidence of a modest increase in provision—from 12.8% to 14.8%. However, our results also continue to show that CBTp delivery falls very far short of the NICE recommendations of universal access. It is a matter of additional importance and concern that there do appear to be demographic predictors, suggesting access is inequitable in terms of age, diagnosis and ethnicity. Improving access to psychological therapies can be enhanced by examining data such as these and targeting provision towards underserved groups. The value of informatics to monitor the delivery of psychological therapy provision and the advantages described here are important for health systems internationally.
Key strengths of this study were the large sample and the innovative approaches adopted to identify CBTp delivery within the clinical record. The ability to replicate the inclusion criteria of two previous audits also allowed us to contextualise the findings, and the large data set allowed access to data by year and also to examine clinical or demographic factors influencing delivery. Clearly, there are also a large number of other variables in the EHR that are potentially available for examination without the need to repeat data extraction, as would be the case in a manual audit. These might include service user characteristics, service delivery settings, therapist characteristics and aspects of therapy provision such as assessments, number of sessions, discontinuation and drop out and clinical outcomes. The large sample size generated by this approach has enabled us to identify previously unknown inequalities in the provision of CBTp within our own trust that we have taken steps to address, such as raising with the senior team and the provision of regular monitoring reports split by demographic variables shared with clinical teams.
A limitation of this study was that it took place in a single (although large) service provider; however, our results have identified themes that are consistent with other findings in relation to CBTp provision and could indicate generalisability but would warrant further investigation. The sample presented here is reflective of the local service provision, although SLaM services may benefit from some research-funded clinical activity, the extent of which may differ from other services within the UK and internationally. However, other countries such as Australia and New Zealand,2 Canada,3 Spain,4 UK and USA5 have policies that recommend CBTp provision and therefore monitoring implementation of these policies is of international importance. If other services were interested in adopting the method described here to identify CBTp, we would recommend that a full de novo evaluation of the application performance be undertaken as it cannot be assumed that performance on one cohort would be directly generalisable to others.16
A further limitation concerns the use of routine clinical data rather than de novo data collection. Clearly, the information available is limited by what is recorded in the source records. For fully electronic health records, such as those that are now used routinely in UK mental health services, there are no other information repositories that provide administrative or medico-legal back-up, and therefore there are incentives for clinicians to record details of interventions, in order to provide evidence that these did actually take place. We believe that we were able to identify relevant CBT treatment receipt through the search approach used, which queried both structured and text fields; indeed, we demonstrated that additional querying of text fields identified substantially larger numbers of episodes. However, we are not at this stage able to automate the identification of more subtle and nuanced descriptions of the treatment and its context; that is, we could not identify the ‘offer’ rather than receipt of CBT, because of the wide range of wording used to record this, and we did not attempt to quantify the quality or nature of treatment received. It is possible that future advances in NLP may allow the automated ascertainment of these constructs, but it is also possible that de novo data collection and/or manual case note evaluation will remain the only solutions, although limited in the samples that can be generated.
Clearly an alternative approach would be to impose data collection on clinicians by requiring them to complete structured assessments to delineate the process of offering, commencing and monitoring treatment. This would obviate the need for NLP approaches; this, however, depends on clinicians’ willingness to complete these instruments and for the approach to sustain itself over time, potentially problematic if clinicians also have to complete text fields for what may be seen as a more salient need to communicate information on sessions for their own and colleagues’ future reference as well as for medico-legal purposes. It therefore seems likely that medical records data will remain a mixed economy of structured and text-derived information and that audits will incorporate a mixture of large-scale, multi-site ‘big data’ analyses and targeted in-depth case-note review.
The opportunity provided by employing methods shown here allows the proactive analysis of large EHR-derived data sets. In the future, a refinement could be to identify CBTp delivery data by using data from NLP and structured fields to identify a course of CBTp treatment. Initial definitions regarding the development of a course of treatment would require at least two CBTp sessions with less than a 3-month break between sessions and in addition using other NLP features such as the CBTp session number and stage of therapy to enhance the creation of such a construct. We are also now working on developing an application that identifies the delivery of other therapy types and applications that more precisely characterise the pathway from CBTp being considered through its offer and to actual receipt.
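The proposed definition of a course of treatment (at least two sessions with less than a 3-month break between consecutive sessions) could be prototyped along the following lines. This Python sketch is only one possible reading of that definition: the function name is hypothetical, and approximating 3 months as 91 days is our assumption.

```python
from datetime import date, timedelta

# Assumption: '3 months' approximated as 91 days.
MAX_GAP = timedelta(days=91)


def courses_of_treatment(session_dates: list[date]) -> list[list[date]]:
    """Group session dates into courses: runs of two or more sessions in
    which each consecutive gap is shorter than MAX_GAP."""
    courses: list[list[date]] = []
    current: list[date] = []
    for d in sorted(session_dates):
        if current and d - current[-1] >= MAX_GAP:
            # Gap too long: close the current run and start a new one.
            if len(current) >= 2:
                courses.append(current)
            current = []
        current.append(d)
    if len(current) >= 2:
        courses.append(current)
    return courses
```

For example, sessions on 7 January, 21 January and 3 June 2013 would yield one course containing the two January sessions, with the isolated June session not counted as a course. Session number and stage of therapy extracted by the NLP application could be used to refine these boundaries.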
Contributors PAG and RS conceived the study and manuscript. CC, LE, MB, DC, AK, PAG and RS provided substantial contributions to the design of the work. Analyses were carried out by CC, DC, LE and MB. CC and PAG initially contributed significant text to the study manuscript. CC, PAG, TJC and RS provided the final approval of the version to be published.
Funding This work was supported by the Clinical Records Interactive Search (CRIS) system and funded and developed by the National Institute for Health Research (NIHR) Mental Health Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London and a joint infrastructure grant from Guy’s and St Thomas’ Charity and the Maudsley Charity (grant number BRC-2011-10035). PAG and TJC acknowledge BRC support. CC and LE are funded by SLaM. All other authors receive salary support from the National Institute for Health Research (NIHR) Mental Health Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The above funding bodies had no role in the study design; in the collection, analysis and interpretation of the data; in the writing of the report; and in the decision to submit the paper for publication.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional unpublished data are available.