Prevalence and identification of anxiety disorders in pregnancy: the diagnostic accuracy of the two-item Generalised Anxiety Disorder scale (GAD-2)

Objective To estimate the population prevalence of anxiety disorders during pregnancy and investigate the diagnostic accuracy of the two-item Generalised Anxiety Disorder scale (GAD-2) for a) GAD and b) any anxiety disorder. Design Cross-sectional survey using a stratified sampling design. Sampling weights were used in the analysis to adjust for the bias introduced by the stratified sampling. Setting Inner-city maternity service, South London. Participants 545 pregnant women were interviewed after their first antenatal appointment; 528 provided answers on the GAD-2 questions. Main outcome measures Diagnosis generated by the Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, 4th edition (SCID). Results Population prevalence of anxiety disorders was 17% (95% CI 12% to 21%): 5% (95% CI 3% to 6%) for GAD, 4% (95% CI 2% to 6%) for social phobia, 8% (95% CI 5% to 11%) for specific phobia and 2% (95% CI 1% to 4%) for obsessive-compulsive disorder. Post-traumatic stress disorder (PTSD) prevalence was unclear due to higher levels of reluctance to respond to PTSD interview questions, but sensitivity analyses suggest population prevalence may be up to 4% (95% CI 2% to 6%). Weighted sensitivity of GAD-2 for GAD (cut-off ≥3) was 69%, specificity 91%, positive predictive value 26%, negative predictive value 98% and likelihood ratio 7.35. For any anxiety disorder the weighted sensitivity was 26%, specificity 91%, positive predictive value 36%, negative predictive value 87% and likelihood ratio 2.92. Conclusions Anxiety disorders are common but GAD-2 generates many false positives and may therefore be unhelpful in maternity services.



INTRODUCTION
Anxiety disorders are more common in women than men, 1 2 and the perinatal period (ie, pregnancy and the year after birth) has been reported as a particularly vulnerable time for the onset or relapse of anxiety disorders in women. 3 4 Antenatal anxiety disorders have been associated with adverse pregnancy outcomes, including preterm birth, low birth weight, lower Apgar scores, postpartum anxiety and depression and adverse child developmental outcomes [5][6][7] including difficult temperament, increased sleep problems, bonding/attachment problems and poorer emotional, behavioural and cognitive development. [5][6][7][8][9][10] A recent systematic review and meta-analysis reported a pooled prevalence of 4% (95% CI 2% to 6%; 10 studies with pooled n=6910) for Generalised Anxiety Disorder (GAD) and 15% (95% CI 9% to 21%; 9 studies with pooled n=4648) for any anxiety disorders during pregnancy based on studies conducted outside of the UK where diagnostic clinical interviews were used, and 23% when using cut-offs on validated self-report questionnaires. 11 Anxiety disorders are treatable, so early detection and treatment during the antenatal period, when women are in regular contact with healthcare professionals, could prevent adverse outcomes. [12][13][14] However, despite regular contact with healthcare professionals during pregnancy, antenatal mental disorders are often undetected and untreated. 15 16

Strengths and limitations of this study
► This study investigates the effectiveness of the two-item Generalised Anxiety Disorder scale (GAD-2) in identifying antenatal anxiety disorders, a method suggested by the National Institute for Health and Care Excellence (CG192 2014) on the basis of expert consensus rather than research evidence.
► We recruited a representative sample of women using an efficient and robust sampling design, which facilitated the estimation of population prevalences.
► We used a gold standard diagnostic interview.
► This research was limited to recruitment from a single maternity site in London, although the single South London maternity site used was a very ethnically and socioeconomically diverse population.

The National Institute for Health and Care Excellence (NICE) (CG192; 2014) suggested that maternity professionals could consider the use of the two-item GAD tool (GAD-2) to identify anxiety disorders during pregnancy and after birth, although it also highlighted the lack of evidence on the use of the GAD-2 in early pregnancy. The recommendation was therefore driven by concern about the high prevalence of anxiety disorders (NICE 2014). 12 17 This extends the focus of early intervention from a previous emphasis on identification of perinatal depression to other perinatal mental disorders, including comorbid conditions. Outside of the perinatal period, a systematic review and meta-analysis 18 of studies in men and women reported that the GAD-2 (using a cut-off of ≥3) showed fairly high pooled sensitivity of 0.76 (95% CI 0.55 to 0.89), and pooled specificity of 0.81 (95% CI 0.60 to 0.92) for GAD (five studies with pooled n=1987). For detecting any anxiety disorder, the systematic review 18 reported a moderate sensitivity (range: 0.65-0.72) and unclear specificity (range: 0.39-0.92) (three studies, n=1225). The diagnostic accuracy of the GAD-2 questions in identifying anxiety disorders during early pregnancy remains unstudied. As women in early pregnancy are likely to have many anxieties (eg, over the viability of the pregnancy, decisional conflict over unplanned pregnancies), there may be high rates of 'false positives' when using the GAD-2 in pregnancy, which could result in inappropriate referrals to mental health services.
We therefore aimed to investigate:
1. The UK prevalence of GAD, other anxiety disorders (including panic disorder, agoraphobia without panic disorder, social phobia, specific phobia, obsessive-compulsive disorder (OCD) and post-traumatic stress disorder (PTSD)) and comorbidity with other mental disorders during early pregnancy.
2. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−) of the GAD-2 screening questions (on a Likert scale) compared with a gold standard diagnostic interview (Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV)) 19 for identifying GAD, and for identifying any anxiety disorders (including panic disorder, agoraphobia without panic disorder, social phobia, specific phobia, OCD, PTSD and GAD) during early pregnancy. As DSM-V 20 no longer categorises OCD or PTSD as anxiety disorders, we also investigated diagnostic accuracy for any anxiety disorders by excluding OCD and PTSD.
3. How the sensitivity, specificity, PPV, NPV, LR+ and LR− change when the GAD-2 is scored as a yes (score of 1 or more) versus no (score of 0) response, a format that midwives could ask at the same time as the two depression screening questions, instead of using the conventional cut-off of 3 or more.

METHODS
Study design and participants
The WEll-being in pregNancy stuDY (WENDY study) was a cross-sectional survey that recruited women from an inner-city maternity service in South-east London using a sampling design stratified according to answering positive or negative (saying yes or no, respectively) on either of the two Whooley questions which are routinely asked by midwives as a mental health screen during the first antenatal booking appointment ("During the past month have you often been bothered by feeling down, depressed or hopeless?"; "During the past month have you often been bothered by having little interest or pleasure in doing things?"). A random sample of Whooley negative and all Whooley positive women were invited to participate. Exclusion criteria were women aged under 16 years, women who declined to answer Whooley questions, those who had a termination or miscarriage prior to baseline interview or had already attended for their maternity booking appointment elsewhere in the UK. Eligible pregnant women who agreed to participate were recruited into the study as soon as possible after their first antenatal booking appointment, within a maximum of 3 weeks from the original booking appointment. Data collection for the index test (GAD-2 measure) and reference test (the gold standard diagnostic interview) was performed during the research interview, after written informed consent was obtained. Language interpreters were used where needed. For further details and power calculation of the WENDY study, see Howard et al. 21 Sample size of the current analysis was determined by the number of women with available data on the index test (GAD-2). Figure 1 shows the flow of women through the WENDY study and those included in the current analysis.
Research measures
GAD-2: This is a subscale of the seven-item Generalised Anxiety Disorder scale (GAD-7) 22 and is a two-item self-report screen completed by women during the research interview. The questions are "Over the last 2 weeks, how often have you been bothered by any of the following problems? 1) Feeling nervous, anxious or on edge; 2) not being able to stop or control worrying". Answers are given on a Likert scale (not at all=0, several days=1, more than half the days=2 and nearly every day=3). Scores range from 0 to 6, with a cut-off score of 3 or more indicative of anxiety symptoms. 23 The GAD-2 has also been used where YES to either question categorises an individual as a positive screen (used in clinical practice), 24 thus a score of 1 or more would indicate a possible YES answer. In this study, women's responses to the GAD-2 questions were categorised into the following groups:
1. GAD-2 (≥3): GAD-2 positive if they scored 3 or more (conventional scoring method), GAD-2 negative if they scored <3.
2. GAD-2 (yes/no): GAD-2 positive if they scored 1 or more (indicating a 'yes' to either question), which is used more in clinical practice, GAD-2 negative if scoring 0 (indicating 'no' to both questions).
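The two categorisation rules above can be sketched in a few lines of Python. This is an illustration only, not the study's code: the names `LIKERT`, `Gad2Result` and `score_gad2` are hypothetical, and the response labels are taken from the Likert coding described in the text.

```python
from typing import NamedTuple

# Likert coding described in the text: each item is scored 0 to 3.
LIKERT = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


class Gad2Result(NamedTuple):
    total: int              # sum of the two item scores, 0 to 6
    positive_cutoff3: bool  # conventional rule: total of 3 or more
    positive_yes_no: bool   # yes/no rule: total of 1 or more


def score_gad2(item1: str, item2: str) -> Gad2Result:
    """Score one respondent under both categorisation rules (hypothetical helper)."""
    total = LIKERT[item1] + LIKERT[item2]
    return Gad2Result(total, total >= 3, total >= 1)
```

A respondent answering "several days" to both items scores 2, so she would screen negative under the conventional cut-off but positive under the yes/no rule, which is why the two rules yield different numbers of screen positives.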

Anxiety disorders and comorbid disorders
The Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I): the SCID is a researcher-administered, semi-structured, gold standard diagnostic interview consisting of standardised questions that correspond to each of the DSM-IV Axis I criteria. 19 We used the mood and anxiety disorders modules to generate diagnoses of depression, GAD, panic disorder, agoraphobia without panic disorder, social phobia, specific phobia, OCD and PTSD. Consensus on diagnosis was achieved during the researchers' weekly meetings with LMH. Although clinical information and GAD-2 data were available at the time of these meetings, agreement on diagnosis was reached using responses given by women during the SCID interviews (ie, the GAD-2 items were not used to determine diagnosis of anxiety disorders).
As DSM-V 20 does not categorise OCD or PTSD as anxiety disorders and as the research version of the SCID for DSM-V was released after the start of the current study, we also carried out analyses for any anxiety disorders according to those included in DSM-V as anxiety disorders, that is, excluding OCD and PTSD. We also used the eating disorder module to generate diagnosis for eating disorders including anorexia nervosa (including atypical), bulimia nervosa, binge eating disorder, purging disorder and other eating disorder.

Figure 1 Flow chart of women through the WEll-being in pregNancy stuDY (WENDY) study during the study period (total recruited n=545) and women with available data on the two-item Generalised Anxiety Disorder scale (GAD-2) (n=528).

Patient involvement
The development of the WENDY study, outcome measures, grant application and study protocol were informed by our patient and carer advisory group. Meetings were held every few months to discuss the WENDY study and other related studies within a programme of work funded by the National Institute for Health Research (https://www.kcl.ac.uk/ioppn/depts/hspr/research/CEPH/wmh/projects/A-Z/esmi.aspx). The patient advisory group includes women who have a range of mental disorders and an interest in our study programme. We circulated the results and draft manuscript (see 'Acknowledgements' section) to members of the group for comments. Patients were not involved in the recruitment or conduct of the study.

Statistical analysis
Data were managed and analysed using Stata V.15. 25 Sampling weights were used to adjust for the oversampling of Whooley positive women 26 in all analyses apart from examining differences in sociodemographics between GAD-2 positives and GAD-2 negatives. The population prevalences of anxiety disorders and comorbid disorders were estimated based on weighted diagnostic interview responses (using Stata's 'svy' command). Bootstrap resampling of the weighted estimators was used for calculation of CIs.
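To illustrate the idea behind a weighted prevalence estimate with a bootstrap CI, a minimal Python sketch follows. It is not the study's analysis, which was run in Stata: the function names are hypothetical, the weights are assumed to be inverse sampling probabilities, and the simple percentile bootstrap shown here resamples participants without regard to strata, which is a simplification.

```python
import random
from typing import List, Sequence, Tuple


def weighted_prevalence(cases: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted proportion of cases; each weight is assumed to be an
    inverse sampling probability (hypothetical helper)."""
    total = sum(weights)
    return sum(w for c, w in zip(cases, weights) if c) / total


def bootstrap_ci(
    cases: Sequence[bool],
    weights: Sequence[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the weighted prevalence.
    Simplification: participants are resampled with replacement,
    ignoring the stratified design."""
    rng = random.Random(seed)
    n = len(cases)
    estimates: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            weighted_prevalence([cases[i] for i in idx], [weights[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

In a design where all screen-positive women are invited but only a random fraction of screen-negative women, the screen-negative participants carry larger weights, so each of them stands in for more women in the base population.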
As prespecified in our analysis plan, the weighted rates of 'true positives', 'false positives', 'true negatives' and 'false negatives' were tabulated for GAD, any anxiety disorder (including PTSD and OCD) and any anxiety disorder (without PTSD or OCD). Using these values, the sensitivity, specificity, PPV, NPV and LRs (positive and negative) were calculated.
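The standard definitions of these measures, applied to the four cells of a 2×2 table, can be written as the following Python sketch. The names `Accuracy` and `diagnostic_accuracy` are hypothetical, and the cell values may be weighted (non-integer) counts; this illustrates the formulas and is not the study's Stata code.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_pos: float
    lr_neg: float


def diagnostic_accuracy(tp: float, fp: float, tn: float, fn: float) -> Accuracy:
    """Standard diagnostic accuracy measures from a 2x2 table.
    tp, fp, tn and fn may be raw or weighted counts."""
    sensitivity = tp / (tp + fn)   # screen positives among true cases
    specificity = tn / (tn + fp)   # screen negatives among non-cases
    return Accuracy(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),        # true cases among screen positives
        npv=tn / (tn + fn),        # non-cases among screen negatives
        lr_pos=sensitivity / (1 - specificity),
        lr_neg=(1 - sensitivity) / specificity,
    )
```

Sensitivity, specificity and the likelihood ratios depend only on how the test behaves within cases and non-cases, whereas PPV and NPV also depend on how common the disorder is in the population being screened.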

Missing data
Of the 545 participants, 24 (4%) women had some missing SCID data: 1 participant on GAD and all eating disorders; 1 participant on agoraphobia, specific phobia and PTSD; 1 participant on mixed anxiety and depression, borderline personality disorder and PTSD; 1 participant on hypomanic, manic, current major depressive disorder and bipolar I and II; 1 participant on all eating disorders and 19 participants on the PTSD module. List-wise deletion (performed in Stata) was used to calculate frequencies of SCID disorders in the study sample.
To calculate population prevalence of the SCID disorders, missing observations in the SCID items were accounted for by using inverse probability weights that incorporated the Whooley sampling, as well as variables that were significant in predicting missingness of SCID responses (for full details of weightings and analysis strategy, see Howard et al 26 ). Due to the large numbers of women who declined to respond to questions on the PTSD module, we carried out a sensitivity analysis for the prevalence estimate of PTSD in which we first assumed that all missing data were actually cases of PTSD, and then assumed that all missing data were not cases of PTSD.
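The logic of this extreme-case sensitivity analysis can be sketched as follows. The Python function below is hypothetical and simplified: it takes a status per participant (`True` for a case, `False` for a non-case, `None` for missing) with a sampling weight, and returns the prevalence when all missing responses are counted as non-cases and when all are counted as cases. It does not reproduce the study's full weighting scheme.

```python
from typing import Optional, Sequence, Tuple


def extreme_case_bounds(
    status: Sequence[Optional[bool]], weights: Sequence[float]
) -> Tuple[float, float]:
    """Lower and upper prevalence bounds under the two extreme assumptions
    about missing responses (None): none are cases, or all are cases."""
    total = sum(weights)
    lower = sum(w for s, w in zip(status, weights) if s is True) / total
    upper = sum(w for s, w in zip(status, weights) if s is True or s is None) / total
    return lower, upper
```

The true prevalence is expected to lie between the two bounds if missingness is the only source of uncertainty; a wide gap between them indicates that the main estimate is sensitive to non-response.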
Seventeen participants (3%) had missing GAD-2 data (15 for both questions, 1 for GAD-2 question 1 and 1 for GAD-2 question 2). No imputation was performed for women with missing data on GAD-2 items because missing data for one question would mean 50% of the data were missing. These were therefore treated as missing observations and only women with complete data on both GAD-2 questions were used in the analyses to investigate sensitivity and specificity (n=528). Of the 528 women with complete data on the GAD-2 questions, there were missing data on the following SCID anxiety modules: 1 participant had missing data on PTSD, agoraphobia and specific phobia and 18 participants had missing data on the PTSD module.

Ethics approval
The participants were provided with study information sheets, which were fully explained to them, had the opportunity to ask questions and gave informed consent prior to taking part in the study. No adverse events occurred for women taking part in the study during the research interview. Where risk (eg, significant suicidality, safeguarding issues) was identified during the research interview, researchers discussed this with the study PI, and the woman's midwife and/or GP were informed, following consent to information sharing by the study participant (all women were aware this occurred and consented to this).

RESULTS
Sample characteristics
Between the dates of 10 November 2014 and 30 June 2016, 10 004 women attended their initial antenatal booking appointment with a midwife at the study site. Of these women, 41 did not have Whooley answers recorded so the base population consisted of 9963 women. The total number of eligible women recruited into the WENDY study was 545 women. This sample was similar to the base population on sociodemographic factors such as age, ethnicity and number of children. 21 Of the total number of women recruited into the WENDY study, 528 (97%) provided answers to the GAD-2 questions (figure 1). Sociodemographic characteristics (age, ethnicity and number of children) of women who provided GAD-2 answers were similar to the rest of the WENDY sample and wider base population (see online supplementary file 1). There were 119 (23%) GAD-2 (≥3) positives and 409 (77%) GAD-2 (<3) negatives within our study sample, where GAD-2 (≥3) positive was defined as reporting a total score of 3 or more on the GAD-2 questions (table 1). Compared with GAD-2 negatives, GAD-2 positives were more likely to be single, have lower levels of education, be unemployed or not working due to illness, have a lower income and have an unplanned pregnancy. When GAD-2 positive was defined as scoring 1 or more, there were 226 (43%) GAD-2 (yes/no) positives and 302 (57%) GAD-2 (yes/no) negatives within our study sample.

Table 1 notes: *Two participants had missing data on employment status (1 GAD-2 positive and 1 GAD-2 negative). †One hundred twenty participants had missing data on income (93 GAD-2 negatives, 27 GAD-2 positives).
SCID anxiety disorder prevalence and comorbidity
Using weighted estimates, the population prevalence was estimated as 17% (95% CI 12% to 21%) for any SCID anxiety disorder (or 15% (95% CI 11% to 19%) for any SCID anxiety disorder excluding PTSD and OCD, as in DSM-V). Specifically, there was an estimated population prevalence of 5% (95% CI 3% to 6%) for GAD, 4% (95% CI 2% to 6%) for social phobia, 8% (95% CI 5% to 11%) for specific phobia, 0.2% (95% CI 0.03% to 0.3%) for panic disorder, 0.4% (95% CI 0% to 2%) for agoraphobia, 2% (95% CI 1% to 4%) for OCD and 0.8% (95% CI 0% to 1%) for PTSD. As missing data were particularly common for the PTSD module, we carried out a sensitivity analysis: when all women with missing data were assumed to have PTSD, the population prevalence estimate was 4% (95% CI 2% to 6%); when all women with missing data were assumed not to have PTSD, the population prevalence estimate was 0.8% (95% CI 0.1% to 1.4%). Of note, of the women who declined to respond to the PTSD questions, eight had already disclosed severe trauma earlier during the research interview, which was related to physical abuse, sexual abuse/rape and witnessing violence. The estimated population prevalence of comorbid depression with any anxiety disorder was 5% (95% CI 2% to 7%) (or 4% (95% CI 2% to 6%) for comorbid depression and any anxiety disorder excluding PTSD and OCD). Comorbid depression and GAD was estimated as 2% (95% CI 1% to 3%). Table 2 presents unweighted count (%) and weighted estimates of population prevalence (%) of comorbidity between SCID depression and all anxiety disorders (including PTSD and OCD).

DISCUSSION
In this inner-city maternity population, the population prevalence estimated for GAD was 5% (95% CI 3% to 6%) and for all anxiety disorders was 17% (95% CI 12% to 21%), in line with other studies. 11 The population prevalence estimated for PTSD was lower (0.8%, 95% CI 0% to 1%) than the mean prevalence of 3.86% reported in a recent systematic review of PTSD identified by interviews during pregnancy. 27 However, as some women in our study who declined to answer the PTSD module also reported severe trauma elsewhere during the research interview, we conducted a sensitivity analysis where missing data were assumed first to be cases of PTSD and then in a second analysis not to be PTSD cases: the prevalence estimate of PTSD was then potentially as high as 4% (95% CI 2% to 6%). Thus, the low prevalence of PTSD in our main analysis may reflect barriers to disclosure of trauma and associated symptoms. As others have reported, 28 29 comorbidity is also common and we found estimates for comorbid depression and GAD (2%, 95% CI 1% to 3%) that were similar to previous estimates derived from representative samples and using diagnostic clinical interviews (GAD and depression 2% (95% CI 0% to 3%)). 28 However, the population prevalence estimated in our sample of comorbid depression and any anxiety (5%, 95% CI 2% to 7%) was slightly lower than previous estimates (7%, 95% CI 3% to 11%). 28

This study makes a novel contribution to the gap in the literature by formally examining the diagnostic accuracy of the GAD-2 screening questionnaire for women during early pregnancy. The GAD-2 (using a cut-off of ≥3; as recommended by NICE) had a reasonable LR when used in early pregnancy for identifying GAD (LR+ 7, ie, above 5, which has been suggested to indicate a potentially useful tool in clinical practice). 30 However, GAD is not very common and the PPV is low (26%). Furthermore, the diagnostic accuracy of the GAD-2 for identifying other anxiety disorders was poor (LR+ 2.92). This evidence suggests that the GAD-2 is not a helpful tool for maternity services as it will generate many false positives.
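The link between low prevalence and low PPV follows from Bayes' theorem, and can be illustrated with the Python sketch below. The function name is hypothetical, and the inputs in the usage note are the rounded figures reported in this paper, so the result is only an approximation of the weighted PPV calculated in the study.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """PPV and NPV implied by sensitivity, specificity and prevalence
    (Bayes' theorem applied to expected proportions of the population)."""
    tp = sensitivity * prevalence               # true positives
    fn = (1 - sensitivity) * prevalence         # false negatives
    fp = (1 - specificity) * (1 - prevalence)   # false positives
    tn = specificity * (1 - prevalence)         # true negatives
    return tp / (tp + fp), tn / (tn + fn)
```

With the rounded values for GAD (sensitivity 0.69, specificity 0.91, prevalence 0.05), `predictive_values(0.69, 0.91, 0.05)` gives a PPV of roughly 0.29 and an NPV of roughly 0.98, close to the reported 26% and 98%; the small difference in PPV is presumably due to rounding of the inputs. Because about 95% of women do not have GAD, even a false positive rate of 9% produces more false positives than true positives.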
There are several strengths to this study. First, we recruited a stratified representative sample of women using language interpreters to facilitate inclusion of non-English-speaking women. 21 Second, we used an efficient and robust sampling design which facilitated the estimation of population prevalences and we used a gold standard diagnostic interview. Limitations include recruitment from a single maternity site in London, although the single South London maternity site used in this study included a very ethnically and socioeconomically diverse population. There were some missing data (although this was rare other than for PTSD), and the timing of administering the GAD-2 in early pregnancy may have inflated the risk of 'false positives'.

Implications
This study does not support the NICE recommendation to use the GAD-2 in early pregnancy, due to its low PPV even for GAD (when applying a cut-off of either ≥3 or ≥1, the latter indicating a yes response) and low effectiveness for 'any anxiety disorder'. Its accuracy later in pregnancy warrants further study but a recent systematic review reported the prevalence of self-reported anxiety symptoms to increase through pregnancy (trimester 1: 18%, trimester 2: 19%, trimester 3: 25%) 11 ; diagnostic accuracy is therefore unlikely to improve.

Table notes: *We excluded PTSD and OCD to present the difference, as DSM-5 no longer considered PTSD and OCD as anxiety disorders. †Nineteen participants had missing data on the PTSD module. Seven of these participants met criteria for other anxiety disorders. Therefore, the total sample size for any anxiety disorders was 516. DSM-5, Diagnostic and Statistical Manual of Mental Disorders, fifth edition; GAD, Generalised Anxiety Disorder; LR, likelihood ratio; NPV, negative predictive value; OCD, obsessive-compulsive disorder; PPV, positive predictive value; PTSD, post-traumatic stress disorder.
Following the NICE CG192 guideline recommendations, some services have already implemented the GAD-2 in routine practice. The findings from the current paper challenge the use of expert consensus and extrapolation from evidence derived from the general population when making NICE recommendations specifically for pregnant women. We argue that recommendations for pregnant women should be evidence based and propose that, at present, only the NICE recommendation on use of the Whooley questions is supported by evidence. 21 Currently, the evidence suggests that implementation of routine use of the GAD-2 is unwarranted. This points to several unexplored directions for future research, which include a comprehensive diagnostic accuracy study, including data on acceptability, of the full GAD-7 questionnaire in pregnancy and the three 'anxiety' items in the Edinburgh Postnatal Depression Scale in pregnancy. For the GAD-7, there is currently no evidence. There is some conflicting evidence for the three 'anxiety' items in the Edinburgh Postnatal Depression Scale, but as others have suggested, research is needed to determine their validity, reliability and diagnostic accuracy as a measure of antenatal anxiety disorders. 31 32 Furthermore, other routinely collected data, including the Whooley questions, could be used in a model (using predictive modelling techniques) to identify anxiety disorders or, perhaps even more importantly, 'any mental disorder'. Finally, further research is needed to examine ways to overcome the barriers to disclosure of trauma and trauma-related symptoms in pregnancy. Our findings suggest that more sensitive methods of asking women about trauma need to be developed.

CONCLUSIONS
This study suggests that anxiety disorders are common in early pregnancy, but the GAD-2 screening measure is not useful during early pregnancy. In addition, even if asked directly, PTSD may be particularly difficult to detect in routine practice (as some women may decline to answer specific questions in relation to symptoms arising from traumatic experiences) and therefore further research is needed on how to overcome barriers to disclosure of trauma-related symptoms.