Article Text

Download PDFPDF

Recorded poor insight as a predictor of service use outcomes: cohort study of patients with first-episode psychosis in a large mental healthcare database
  1. Neha Ramu1,
  2. Anna Kolliakou1,
  3. Jyoti Sanyal2,
  4. Rashmi Patel1,2,
  5. Robert Stewart1,2
  1. 1 Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
  2. 2 South London and Maudsley NHS Foundation Trust, King’s College London, London, UK
  1. Correspondence to Dr Anna Kolliakou; anna.kolliakou{at}


Objectives To investigate recorded poor insight in relation to mental health and service use outcomes in a cohort with first-episode psychosis.

Design We developed a natural language processing algorithm to ascertain statements of poor or diminished insight and tested this in a cohort of patients with first-episode psychosis.

Setting The clinical record text at the South London and Maudsley National Health Service Trust in the UK was used.

Participants We applied the algorithm to characterise a cohort of 2026 patients with first-episode psychosis attending an early intervention service.

Primary and secondary outcome measures Recorded poor insight within 1 month of registration was investigated in relation to (1) incidence of psychiatric hospitalisation, (2) odds of legally enforced hospitalisation, (3) number of days spent as a mental health inpatient and (4) number of different antipsychotic agents prescribed; outcomes were measured over varying follow-up periods from 12 months to 60 months, adjusting for a range of sociodemographic and clinical covariates.

Results Recorded poor insight, present in 46% of the sample, was positively associated with ages 16-35, bipolar disorder and history of cannabis use and negatively associated with White ethnicity and depression. It was significantly associated with higher levels of all four outcomes over all five follow-up periods.

Conclusions Recorded poor insight in people with recent onset psychosis predicted subsequent legally enforced hospitalisations and higher number of hospital admissions, number of unique antipsychotics prescribed and days spent hospitalised. Improving insight might benefit patients’ course of illness as well as reduce mental health service use.

  • insight
  • psychosis
  • mental health outcomes
  • service use outcomes
  • natural language processing
  • cris

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:

View Full Text

Statistics from

Strengths and limitations of this study

  • Our study included a large sample size, followed a naturalistic method of cohort identification and follow-up and applied natural language processing, a novel text extraction method, to ascertain insight.

  • Measurement of insight (as a binary fixed variable) depended on this clearly having been stated in the clinical record and cannot be assumed to be identical to assessment through interview.

  • Follow-up assessments were only feasible for those cases remaining in the geographic catchment area served by the Trust.

  • Causal pathways between insight and clinical outcomes cannot be determined by our analysis.


Schizophrenia and other psychotic disorders have potentially severe impacts both on individuals and society, although their course and prognosis are variable. The concept of insight has historically been challenging to define and measure. Currently, researchers and clinicians use long and short cognitive and clinical assessment schedules which measure unidimensional and multidimensional aspects of insight such as awareness of illness and its consequences, attribution of symptoms, acceptance of treatment and understanding of its effects1 2 as well as cognitive notions such as self-reflection and self-certainty.3 Other views additionally propose that insight depends on cognitive functioning and on a patient’s cultural and life experiences that cannot accurately be measured through traditional objective assessments.4

The awareness and appreciation by an individual of their psychopathology has long been considered a determinant of outcome. Patients with poor insight are less likely to understand their illness; hence, have been found to be less likely to adhere to treatment5–7 and/or require more extensive treatment.8 Many studies have concluded that poor insight is associated with stigma and worse social performance; however, some have claimed that insight is not in fact directly linked to the outcome of the illness but how it is progressing.9 Self-reported quality of life has been found to be higher in patients with poor insight; this has been suggested as secondary to delusional beliefs,10 11 although good insight has been found to be associated with higher risk of depression in people with schizophrenia12 and with suicidality.13 14 On the other hand, poor insight in schizophrenia has been associated with higher anxiety,15 with obsessive/compulsive symptoms,16 and with violent behaviour in some17 but not in all18 studies. In mania, poor insight has been associated with elation rather than irritability or psychosis.19 However, others have concluded that there are no associations of insight either with symptoms or progression of schizophrenia.20

Despite the range of studies exploring insight in psychotic disorders, we could find no direct investigations of associations with service use outcomes. In a large mental healthcare data resource, we therefore sought to develop a means of extracting descriptions of insight from the text fields of clinical records and investigated whether recorded poor insight early after a first clinical presentation with psychosis predicted increased subsequent service use.


Setting and data sources

The data used in this study were obtained from the South London and Maudsley National Health Service Foundation Trust (SLaM), one of Europe’s largest mental healthcare organisations which provides comprehensive services across all ages and specialties to a defined geographic catchment of around 1.2 million residents within four south London boroughs (Lambeth, Southwark, Lewisham and Croydon). SLaM has used fully electronic health records for over 10 years and its Clinical Record Interactive Search (CRIS) tool, set up in 2008,21 22 allows researcher access to deidentified data from the full record within a robust governance framework.23

Exposure of interest and data extraction

CRIS has been substantially enhanced through natural language processing algorithms applied to extract constructs of interest from text fields in the source record using information extraction/named entity recognition techniques.22 24 For this study, Text Hunter annotation software25 was used to create training and test corpora classifying mentions of insight in the clinical record to train a supervised machine learning algorithm to recognise this automatically across the wider sample. An initial keyword search was carried out to extract sentences containing the word ‘insight’, and a human annotator manually categorised these as either ‘good insight’ (for example, when insight was described as ‘clear’, ‘improving’, ‘partial’, ‘good’, ‘insightful’, ‘present’, ‘intact’ and ‘aware’), ‘poor insight’ (eg, described as ‘lacking’, ‘poor’, ‘limited’, ‘insightless’, ‘absent’, ‘impaired’, ‘lost’ or words to that effect) or as not relevant (ie, unclear/lengthy descriptions, unassessed insight, insight mentioned as a future goal rather that at the present or where the level of insight was not immediately obvious). For generating training and independent test sets the algorithm, a randomly selected 1814 relevant sentences were manually annotated from all patients on CRIS with a previous diagnosis of schizophreniform or affective disorder (International Classification of Diseases 10th Revision (ICD-10) F2x or F3x), of which 788 were classified as having good insight, 826 as having poor insight and 200 as non-relevant statements. Precision (positive predictive value) and recall (sensitivity) were used as performance metrics based on conventional practice in text extraction evaluation.26 The algorithm generated classified ‘poor insight’ instances with 0.73 precision (positive predictive value) and 0.83 recall (sensitivity) against the manual gold standard.


For the analysis, a database was used which had been previously prepared via CRIS for an analysis of psychosis outcomes associated with cannabis use.27 In summary, this comprised all 2026 individuals with first-episode psychosis who were accepted by a SLaM early intervention (EI) service between 1 April 2006 and 31 March 2013. Criteria for accepting patients in SLaM EI services follow those outlined in the ‘Standards for Early Intervention in Psychosis Services—First Edition’.28 Outcome data were collected up to 31 March 2014. All participants were assessed for outcomes within 12 months of the date of being accepted to an early intervention service (2026 person-years). Participants with sufficient follow-up data were also assessed for outcomes within 24 months (n=1738; 3476 person-years), 36 months (n=1461; 4383 person-years), 48 months (n=1185; 4740 person-years) and 60 months (n=926; 4630 person-years). Predictor, covariate and outcome variable data were obtained via CRIS. Besides insight, the following covariates were ascertained using values recorded closest to the date of being accepted by an early intervention service: age, gender, ethnicity, marital status, employment status and type of accommodation, primary diagnosis and cannabis use. Ethnicity was recorded according to categories defined by the UK Office for National Statistics and was condensed for this analysis into four groups (white, black, Asian, other). Diagnosis was recorded using the ICD-10 classification system. The derivation of cannabis use through natural language processing and its application as a covariate have been previously described.27 Using the natural language processing algorithm described above, recorded poor insight was ascertained from case records within 1 month either side of the date each patient was accepted to the early intervention service, and this was defined as the primary exposure.


We investigated the association between poor insight and the following mental healthcare outcomes: (1) number of psychiatric hospital admission, (2) any legally enforced (compulsory) admission under the UK Mental Health Act (MHA), (3) the number of unique antipsychotics prescribed (as a proxy measure of treatment failure) and (4) the number of days spent in psychiatric hospital over a given follow-up period. The MHA is a UK statute law which allows compulsory admission for up to 28 days (‘Section 2’) or up to 6 months (‘Section 3’). Antipsychotics used were ascertained both from structured fields and a natural language processing algorithm.22

Statistical analysis

All participants were assessed for outcomes within 12 months of the date of being accepted to an early intervention service. Those with sufficient follow-up data were then also assessed for outcomes within 24, 36, 48 and 60 months of this first acceptance date (ie, different but overlapping follow-up periods). This was an identical approach to that previous adopted for analyses in these data,27 investigating discrete periods of follow-up time rather than using survival analysis because of the non-proportionality of hazards. The sample was first described and factors associated with poor insight investigated. Regression models were then used to evaluate unadjusted and successively adjusted associations with the four outcomes over the five different follow-up periods. Owing to overdispersion, previously described for these data,27 we aimed to assess associations with number of hospital admissions and number of unique antipsychotic medications using multivariable negative binomial regression (zero inflation having been investigated but giving rise to no meaningful difference). However, one of the models failed to converge, and so Poisson regressions were used instead. Associations with legally enforced hospitalisation were assessed using multivariable binary logistic regression. Associations with number of inpatient days within given observation periods were investigated using multiple linear regression models. Reference groups for covariates were defined as those with the highest prevalence for each variable, and missing categories were included as predictor variables so that no patients were excluded because of missing covariate data. Stata software V.13 (StataCorp Stata Statistical Software: Release 13; StataCorp, 2011) was used.

Patient and public involvement

We did not directly incorporate patient and public involvement (PPI) into this particular analysis, but the SLaM Biomedical Research Centre Case Register used in the study was developed with extensive PPI and is overseen by committees that include service user and general public representatives.



From the cohort of 2,026 individuals, 927 (46%) had at least one recording of poor insight within one month either side of their registration with the early intervention service. The sample characteristics and their associations with recorded poor insight are summarised in table 1. This was more common in patients aged 26-35, in those without recorded employment and those using cannabis. It was least common in patients of White ethnicity, those without recorded relationship status and those with accommodation status recorded as ‘other’ or ‘council tenant’. Poor insight was most commonly recorded in bipolar disorder, schizophrenia and drug-induced psychosis, and least common in depression.

Table 1

Sample characteristics and associations with poor insight (n=2026)

Unadjusted and adjusted main outcomes

Associations with service use outcomes in unadjusted and multivariable analyses are described in tables 2–5. Higher numbers of hospitalisation episodes (table 2), higher odds of legally enforced hospitalisations (table 3), higher numbers of unique antipsychotics (table 4) and higher numbers of inpatient days (table 5) were all significantly associated with poor insight as measured at 12, 24, 36, 48 and 60 months. For proportions of patients (with present or absent poor insight) and each clinical outcome at 12, 24, 36, 48 and 60 months, please see online supplementary table 1.

Supplementary Table 1

Table 2

Association between poor insight and number of hospital admissions (negative binomial regression)

Table 3

Association between insight and legally enforced hospitalisation* (logistic regression)

Table 4

Association between insight and number of unique antipsychotics prescribed (Poisson regression)

Table 5

Association between insight and days spent hospitalised during the observation period (linear regression)


In a large cohort of cases with first-episode psychosis drawn from a mental healthcare database, we developed an algorithm to detect recorded poor insight and investigated this as a predictor of four subsequent service use outcomes. Rate of poor insight in our cohort (46%) was in the range of that reported by studies assessing it through routine data collection methods (~50%).29–32 Poor insight was, in summary, significantly and independently associated with higher number of hospitalisation episodes, higher odds of legally enforced hospitalisation, higher numbers of days spent as an inpatient and higher numbers of unique antipsychotic agents prescribed. Associations with these outcomes remained significant over the five follow-up time-points at 12, 24, 36, 48 and 60 months .

Loss of insight has long been considered a potentially important feature of psychotic disorders, and clearly establishing a therapeutic alliance is more challenging when insight is poor, accounting for associations found with reduced treatment adherence.7 8 On the other hand, reduced awareness of a mental disorder has been suggested to a reduced personal impact of that disorder, accounting for associations found with better self-rated quality of life10 11 and lower risk of depression and suicidality.12–14 It is therefore understandable that there has been some controversy over whether poor insight has prognostic relevance. Our study focused on a range of outcomes derived from mental healthcare records and, as described above, found these to be worse in people recorded as having poor insight during the first 5 years of their clinical care.

Strengths of the study include the large sample size and naturalistic nature of the cohort and follow-up. It has also demonstrated the great potential for natural language processing (NLP)applied to routine healthcare records in deriving novel information of clinical relevance. However, key limitations need to be borne in mind when interpreting the findings. Considering the measurement of insight, the performance of the NLP algorithm was judged to be satisfactory and clearly represents an important step forward in routine data collection (structured fields in case records invariably fail to record this construct thus rendering it invisible in conventional healthcare databases); furthermore, suboptimal measurement of insight would have obscured rather than exaggerated the prospective associations with the outcomes of interest. However, clearly, statements about insight have to be recorded in the first place, and there may be clinical circumstances and reasons which render these more or less likely. For example, clinicians may be biased to record insight when it is poor or noticeably absent but not when it is present. In addition, as we measured insight as a fixed binary variable, the construct cannot be assumed to be identical to an assessment of insight in a research interview, and we solely focused on recorded poor insight and did not seek to subcharacterise the sample into those with mixed good or poor statements. Additionally, the precision and recall rates still allow for a risk of false positive and false negative instances of poor insight, and further work could be employed to improve the performance metrics. In terms of follow-up, hospitalisations and other outcomes would only be ascertained for those cases which remained in the geographic catchment served by SLaM, so outmigration might have affected longer interval findings. In this analysis, as with a previous analysis of cannabis use as a risk factor in this sample,22 we investigated associations over different time periods. Longer follow-up evaluations clearly provide a more informed picture of prognosis; however, insight cannot be assumed to be constant over time and we did not attempt to quantify these trajectories—for example, more effective treatment may result in a virtuous cycle involving improved insight and better therapeutic engagement.

Residual confounding cannot be absolutely excluded, and causal pathways also remain to be elucidated; however, these might include failure to establish initial engagement with services resulting in symptomatic deterioration and requirement for inpatient care—particularly supported by the high use of legally enforced hospitalisation. It is possible that poor insight at first presentation is associated with antipsychotic treatment failure, as suggested by the higher number of antipsychotics used, although it is difficult to draw this conclusion with certainty because of potentially complex interactions between insight and treatment effects. Poor insight might place strains on social support networks and compromise the role of protective factors, accounting for the observed associations between poor insight and indicators of social/financial disadvantage in our cohort. It might result in risk behaviours which result in worse outcomes, although we adjusted for cannabis use as one of these potential pathways, and this did not account substantially for the associations observed. Finally, it is possible that poor insight is not a risk factor itself but is a marker of a disorder which is already more severe in other respects (such as symptomatically or in terms of functional deterioration). Importantly, this study focused on the relationship between insight recorded shortly after presentation and the outcomes of interest, and we did not seek to capture changes in insight over the follow-up periods; this would be a potentially useful further line of enquiry, although dependent on the extent to which fluctuations in insight are recorded in routine mental healthcare.


Our findings do support an important prognostic role for poor insight in people with psychotic disorders when this is mentioned early after first clinical presentation. Although economic modelling was not attempted, clearly, outcomes such as the number and duration of hospitalisation episodes, number of antipsychotics prescribed and legally enforced hospitalisations have substantial impact, and measures taken to improve insight might similarly bring important benefits at a service level as well as on individuals’ course of illness.


  1. 1.
  2. 2.
  3. 3.
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8.
  9. 9.
  10. 10.
  11. 11.
  12. 12.
  13. 13.
  14. 14.
  15. 15.
  16. 16.
  17. 17.
  18. 18.
  19. 19.
  20. 20.
  21. 21.
  22. 22.
  23. 23.
  24. 24.
  25. 25.
  26. 26.
  27. 27.
  28. 28.
  29. 29.
  30. 30.
  31. 31.
  32. 32.
View Abstract


  • Contributors The study was conceived by NR, RS and RP. Natural language processing applications were developed by NR, AK and JJ. Analyses were carried out by NR and AK. NR and AK led the preparation of the final report, to which all authors contributed significantly, approving the final version.

  • Funding RS, RP, JJ and AK are partly funded by the National Institute for Health Research Biomedical Research Centre and Dementia Biomedical Research Unit at South London and Maudsley NHS Foundation Trust and King’s College London. RP has received support from a Medical Research Council (MRC) Health Data Research UK Fellowship (MR/S003118/1) and a Starter Grant for Clinical Lecturers (SGL015/1020) supported by the Academy of Medical Sciences, The Wellcome Trust, MRC, British Heart Foundation, Arthritis Research UK, the Royal College of Physicians and Diabetes UK.

  • Competing interests None declared.

  • Ethics approval Oxford C Research Ethics Committee, reference 08/H0606/71+5.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement All data relevant to the study are included in the article or uploaded as supplementary information.

  • Correction notice This article has been corrected since it was published online. The license type has been updated from CC BY-NC to CC BY.

  • Patient consent for publication Not required.

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

Linked Articles