Objective To validate the utilities of Berlin, STOP and STOP-BANG Questionnaires, other patient characteristics, comorbidities, Epworth Sleepiness Scale (ESS), fractional exhaled nitric oxide (FENO) and blood markers for the prediction of sleep disordered breathing (SDB) on limited polygraphy.
Setting North Glasgow Sleep Service (a tertiary referral centre).
Participants 129 consecutive patients, aged ≥16 years, referred to the sleep clinic for assessment of possible obstructive sleep apnoea.
Interventions We selected cut-points of apnoea hypopnoea index (AHI) of ≥5 and ≥15/h from their home polygraphy and determined associations of these with individual symptoms, questionnaire scores and other results. Receiver operating characteristic analysis and univariate and multivariate logistic regression were used to explore these.
Primary and secondary outcome measures Primary: The utility of STOP, STOP-BANG and Berlin Questionnaires for prediction of SDB. Secondary: The utility of other measures for prediction of SDB.
Results AHI was ≥5 in 97 patients and ≥15 in 56 patients. STOP and STOP-BANG scores were associated with both AHI cut-points but results with ESS and Berlin Questionnaire scores were negative. STOP-BANG had a negative predictive value of 1.00 (0.77–1.00) for an AHI ≥15. A score ≥3 predicted AHI ≥5 with sensitivity 0.93 (95% CI 0.84 to 0.98) and accuracy 79%, while a score ≥6 predicted AHI ≥15 with specificity 0.78 (0.65 to 0.88) and accuracy 72%. Neck circumference ≥17 inch and presence of witnessed apnoeas were independent predictors of SDB.
Conclusions STOP and STOP-BANG Questionnaires have utility for the prediction of SDB in the sleep clinic population. Modification of the STOP-BANG Questionnaire merits further study in this and other patient groups.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
Strengths and limitations of this study
This is the first study to prospectively evaluate the utility of the Berlin, STOP and STOP-BANG Questionnaires in the prediction of sleep disordered breathing in the population referred to a sleep service for assessment of possible obstructive sleep apnoea (OSA).
The results of this study show that the STOP and STOP-BANG, but not the Berlin Questionnaire, have utility for prediction of sleep disordered breathing in the sleep clinic population.
This study uses home unattended limited sleep studies rather than in-hospital attended full polysomnography; however, this is considered standard clinical practice in the UK and is considered an acceptable method for diagnosis of OSA by the American Academy of Sleep Medicine.
The sample size limits the conclusions that can be drawn from the multivariate analysis; however, this was a secondary objective of the study.
Obstructive sleep apnoea syndrome (OSAS) is common with prevalence of approximately 4% in middle-aged men and 2% in middle-aged women.1 Frequent partial (hypopnoea) or complete (apnoea) upper airway collapse during sleep leads to oxygen desaturation, increased respiratory effort, arousal and sleep fragmentation.2 Patients typically present with witnessed apnoeas, loud snoring and excessive daytime somnolence.3 The syndrome is associated with impaired quality of life,4 cognitive functioning and work performance,5 and with increased risk of road traffic accidents.6 OSAS is considered an independent risk factor for hypertension,7 and has associations with coronary disease, stroke, heart failure, arrhythmias,8 metabolic syndrome9 and type 2 diabetes.10
Despite the substantial burden of this disease, it is under-recognised. One study estimated that 93% of women and 82% of men with moderate-to-severe OSAS were not clinically diagnosed,11 and more recent data support this finding.12 Sleep studies are required for OSAS diagnosis but are expensive and not widely available.3 Given the recent increases in childhood13 and adulthood obesity,14 the workload for sleep clinics and sleep laboratories will increase. Predictors of sleep disordered breathing (SDB) are required to allow recognition of OSAS and prioritisation of investigations.
Several questionnaires have been designed to screen for SDB in different populations. The Berlin Questionnaire was first validated in primary care against portable unattended sleep studies and a ‘high risk’ score predicted a respiratory disturbance index >5 with sensitivity 0.86, specificity 0.77, positive predictive value (PPV) 0.89 and likelihood ratio 3.79.15 Its utilisation in other populations has been assessed with variable success.16–22 The STOP and STOP-BANG Questionnaires were originally validated in surgical patients using in-hospital attended polysomnography.23 For prediction of apnoea hypopnoea index (AHI) greater than 5, 15 and 30, sensitivities for the STOP and STOP-BANG Questionnaires were 65.6%, 74.3% and 79.5%, and 83.9%, 92.9% and 100%, respectively. The Berlin and STOP Questionnaires have been compared in a cohort of surgical patients24 and the STOP and STOP-BANG Questionnaires have been compared in a large study involving several distinct cardiovascular and respiratory disease cohorts.25 No study has, however, compared these screening tools in a sleep service-referred population. Finally, because of rising obesity rates, there is the potential for increasing recognition of SDB in primary care, and in the face of this evolution in sleep clinic practice it is therefore necessary to update and re-evaluate established assessment tools.
The objective of this study was, first, to compare the utility of Berlin, STOP and STOP-BANG Questionnaires for prediction of SDB in a population referred to the sleep clinic for assessment of possible OSA. Second, we sought to identify the most important variables from these questionnaires and routine sleep clinic assessment that might be utilised in the development of a composite predictive score for future use in this population.
This was a prospective observational study conducted during May–December 2012. Study participants received an information sheet and provided informed consent.
Consecutive patients aged ≥16 years referred to the North Glasgow Sleep Service (a tertiary centre) for assessment of possible OSA were invited to participate.
Height, weight, body mass index (BMI), neck circumference and blood pressure were recorded, and the Epworth Sleepiness Scale (ESS)26 was completed, at the Sleep Clinic. Participants attended the Sleep Laboratory on a separate day so that a Sleep Physiologist could provide, and instruct on fitting, a sleep study device. On that occasion, relevant symptoms and comorbidities were recorded, Mallampati score was assessed and the Berlin and STOP-BANG Questionnaires were completed. Blood samples including a non-fasting lipid profile, glycated haemoglobin (HbA1c) and C reactive protein were taken. Two fractional exhaled nitric oxide (FENO) measurements were taken using the NIOX MINO (Aerocrine, Solna, Sweden), and the mean calculated.
Unattended home limited polygraphy sleep studies were performed using the SOMNOmedics SOMNOscreen kit (Randersacker, Germany) with channels that recorded body position, thoracoabdominal movements, oronasal airflow, heart rate, pulse oximetry and snoring. Sleep study scoring by experienced Sleep Physiologists was in accordance with accepted guidelines.27 An apnoea was defined as cessation of nasal flow for ≥10 s, while a hypopnoea was defined as a 50% reduction in nasal flow for ≥10 s, or a lesser reduction in flow associated with oxygen desaturation of ≥4%.
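As an illustration of how scored events translate into the AHI cut-points used in this study, a minimal Python sketch follows. It assumes the AHI is the number of apnoeas plus hypopnoeas per hour of recording; the choice of denominator for limited polygraphy is not stated in the text above and is an assumption here, as are the function and parameter names.

```python
def apnoea_hypopnoea_index(apnoeas: int, hypopnoeas: int, recording_hours: float) -> float:
    """Return events per hour (assumed denominator: hours of recording)."""
    if recording_hours <= 0:
        raise ValueError("recording_hours must be positive")
    return (apnoeas + hypopnoeas) / recording_hours


def classify_ahi(ahi: float) -> dict:
    """Flag the two cut-points examined in the study (>=5 and >=15 events/h)."""
    return {"ahi_ge_5": ahi >= 5, "ahi_ge_15": ahi >= 15}


# Hypothetical recording: 30 apnoeas and 54 hypopnoeas over 7 hours.
ahi = apnoea_hypopnoea_index(apnoeas=30, hypopnoeas=54, recording_hours=7.0)
print(ahi, classify_ahi(ahi))
```

The example values are invented for illustration and do not come from the study data.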
The ESS, Berlin, STOP and STOP-BANG Questionnaires
The ESS is a validated measure of daytime sleepiness including eight questions, each with four possible responses, that assess the likelihood of dozing in different situations; a score of ≥11/24 denotes excessive daytime somnolence.26 The Berlin Questionnaire includes questions in three categories that relate, first, to snoring and witnessed apnoeas, second, to tiredness, fatigue and sleepiness, and third, to hypertension and obesity.15 High risk of OSA is defined by scoring positively in ≥2 categories. The STOP Questionnaire includes four yes/no questions that relate to Snoring, Tiredness, Observed apnoeas and high blood Pressure.23 High risk of OSA is defined as a score of ≥2. The STOP-BANG Questionnaire includes four additional questions relating to BMI, age, neck circumference and gender, and high risk of OSA is defined as a score of ≥3.23
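The STOP and STOP-BANG scoring rules described above can be sketched in Python as follows. The item thresholds for the BANG components (BMI >35 kg/m2, age >50 years, neck circumference >40 cm, male gender) are those of the originally published questionnaire and are an assumption here, since they are not restated in this section; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StopBangAnswers:
    snoring: bool
    tiredness: bool
    observed_apnoeas: bool
    high_blood_pressure: bool
    bmi_kg_m2: float
    age_years: float
    neck_circumference_cm: float
    male: bool


def stop_score(a: StopBangAnswers) -> int:
    """Count positive answers to the four yes/no STOP items (range 0-4)."""
    return sum([a.snoring, a.tiredness, a.observed_apnoeas, a.high_blood_pressure])


def stop_bang_score(a: StopBangAnswers) -> int:
    """STOP score plus the four BANG items (range 0-8).

    Thresholds are assumed from the original questionnaire, not from this text.
    """
    bang = sum([
        a.bmi_kg_m2 > 35,
        a.age_years > 50,
        a.neck_circumference_cm > 40,
        a.male,
    ])
    return stop_score(a) + bang


def high_risk_stop(a: StopBangAnswers) -> bool:
    return stop_score(a) >= 2  # threshold stated in the text


def high_risk_stop_bang(a: StopBangAnswers, cut_point: int = 3) -> bool:
    return stop_bang_score(a) >= cut_point  # default threshold stated in the text


# Hypothetical patient, for illustration only.
patient = StopBangAnswers(True, False, True, False, 32.0, 49, 43.0, True)
print(stop_score(patient), stop_bang_score(patient))
```

Passing a different `cut_point` (for example 6) allows the same score to be used with alternative thresholds.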
Statistical analyses were carried out using GraphPad Prism 5, IBM SPSS Statistics V.19 and STATA V.12. Normality of data was checked using the D'Agostino & Pearson omnibus normality test. A priori, two AHI cut-points were chosen: ≥5 events/h (the standard cut-point for the diagnosis of OSA)28 and ≥15 events/h, to predict significant SDB (the standard cut-point for initiating continuous positive airway pressure (CPAP) therapy).28 Groups were compared using unpaired t tests, Mann-Whitney tests and Fisher's exact tests as appropriate. Sensitivities, specificities, PPV and negative predictive value (NPV), positive and negative likelihood ratios and overall accuracies were calculated for each of the questionnaires for prediction of SDB as defined by AHI cut-points ≥5 and ≥15. Associations between individual variables and each of the cut-points for AHI were explored using univariate and multivariate logistic regression. For multivariate analysis, in a few cases where BMI was known but neck circumference was not known, a value for the neck circumference was imputed using linear regression with BMI as the independent variable. This allowed a dataset of 116 cases, with all of the variables known or imputed, to be built to identify independent variables for inclusion in a composite score. Receiver operating characteristic (ROC) curve analysis was used to assess predictive value and an area under the curve (AUC) >0.7 was considered clinically significant. Data are presented as mean (SD), median (IQR) and proportion (percentage), unless stated otherwise. A p value <0.05 was considered statistically significant.
In total, 150 patients participated in this study, of whom 129 had adequate sleep study data and were included in the analysis. AHI was ≥5 in 97/129 (75%) and ≥15 in 56/129 (43%). Overall, 82 (64%) were male, mean (SD) age was 49 (11) years, and median (IQR) BMI was 32 (29–39) kg/m2.
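For readers less familiar with the diagnostic accuracy measures listed above, the following Python sketch shows how each is derived from a 2×2 table of questionnaire result against sleep study result. It uses the standard definitions; the function name and the counts in the example are hypothetical and are not study data, and the sketch does not reproduce the software used by the authors.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp/fn: questionnaire positive/negative among those with SDB.
    fp/tn: questionnaire positive/negative among those without SDB.
    Assumes no row or column of the table is empty and specificity is
    neither 0 nor 1 (otherwise a division by zero occurs).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts, for illustration only.
for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=30).items():
    print(f"{name}: {value:.3f}")
```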
Predicting SDB: patient characteristics
An AHI <5 (‘rule-out measurement’) was associated with female sex, younger age, lower weight and neck circumference, less frequently reported witnessed apnoeas, higher high-density lipoprotein (HDL) cholesterol, and lower triglycerides, cholesterol/HDL and HbA1c (see table 1). An AHI ≥15 (‘rule-in measurement’) was associated with male sex, obesity, higher weight, BMI and neck circumference, more frequently reported hypertension and witnessed apnoeas, lower HDL cholesterol, and higher triglycerides, cholesterol/HDL and HbA1c.
Predicting SDB: ESS, Berlin, STOP and STOP-BANG
The ESS and Berlin Questionnaire outcomes were not associated with either AHI cut-point. An AHI <5 was associated with lower STOP and STOP-BANG scores and fewer participants being classified as ‘high risk’ for OSA by both STOP and STOP-BANG Questionnaires (see tables 2–4). An AHI ≥15 was associated with higher STOP and STOP-BANG scores and more participants being classified as ‘high risk’ for OSA by the STOP-BANG Questionnaire but not by the STOP Questionnaire.
For the AHI cut-point of ≥5, the Berlin, STOP and STOP-BANG Questionnaires had high sensitivities, moderate PPVs and poor specificities and NPVs for prediction of SDB. The STOP-BANG Questionnaire performed best with an overall accuracy of 79%. For the AHI cut-point of ≥15, the Berlin Questionnaire had high sensitivity but otherwise performed poorly. The STOP and STOP-BANG Questionnaires had high sensitivities and NPVs. Again, the STOP-BANG Questionnaire performed best, but with a low overall accuracy of 56%. The low negative likelihood ratios for the STOP and STOP-BANG Questionnaires at both cut-points indicate that these questionnaires have value in excluding disease. As shown in table 4, the cut-points for the STOP-BANG score that were associated with best overall accuracy were ≥3 and ≥6 for prediction of AHI ≥5 and ≥15, respectively.
SDB versus no SDB: predictors and a composite score
For the cut-point of AHI of ≥5, univariate logistic regression showed significant associations for age, gender, weight, neck circumference, witnessed apnoeas, triglycerides and cholesterol/HDL (p<0.05; see tables 5 and 6 and figure 1). For the cut-point of ≥15, significant associations were found for gender, weight, BMI, neck circumference, witnessed apnoeas, obesity, hypertension, FENO and cholesterol/HDL (p<0.05). Multivariate logistic regression based on the significant variables from univariate logistic regression showed that for both cut-points neck circumference and witnessed apnoeas were independent predictors of SDB. For the cut-point of AHI of ≥5, in a model incorporating neck circumference and witnessed apnoeas, the probability of SDB was 0.94 for individuals with neck circumference ≥17 inch and witnessed apnoeas (sensitivity 84%, overall accuracy 77%, ROC AUC 0.768, p<0.001). For the cut-point of AHI ≥15, the probability of SDB was 0.69 for individuals with neck circumference ≥17 inch and witnessed apnoeas (specificity 80%, overall accuracy 69%, ROC AUC 0.722, p<0.001).
This is the first study to prospectively evaluate the utility of the Berlin, STOP and STOP-BANG Questionnaires in prediction of SDB in a population referred to a tertiary sleep service for assessment of possible OSA. We found that in this population the Berlin Questionnaire had no significant association with cut-points of ≥5 or ≥15 for AHI, but that both the STOP and STOP-BANG scores were significantly associated with both cut-points. The STOP-BANG Questionnaire had better performance for the prediction of OSA on home sleep study, and different cut-points for the STOP-BANG score could be selected depending on the preference to exclude SDB (score <3) or predict SDB (score ≥6). In addition, we found notable associations between sleep study results and several patient characteristics. In particular, neck circumference and witnessed apnoeas were found to be independent predictors of SDB in our population.
In our study, the Berlin Questionnaire was almost ubiquitously positive (116 of 125 participants had a positive result) and the positivity rate did not differ between those with and without SDB. This was expected as this questionnaire was designed for primary care assessment and our study population consisted of individuals referred from primary care with symptoms suggestive of SDB. Our results indicate that the Berlin Questionnaire is not useful in the prediction of SDB in the sleep clinic referral population and this is consistent with previous reports.19 The high sensitivities obtained for both AHI cut-points support previous findings that the Berlin Questionnaire may have a role as a ‘rule-out’ measurement in the primary care or screening setting,15,17,20,24 though there have been some conflicting results, suggesting that it does not have adequate discriminatory power.16,22
In our study, ESS data indicated that two-thirds of participants had excessive daytime somnolence (ESS ≥11); however, scores were similar in individuals with or without SDB. Therefore, at least in the sleep clinic population, the ESS is not useful for the prediction of SDB. It may be of value perhaps if it is combined with other measures, including those highlighted in this study, in prediction of compliance with and benefit from OSA treatment. Further research is required to address this question. The exhaled nitric oxide levels were not significantly different between individuals with or without SDB, whether defined by an AHI cut-point of ≥5 or ≥15. There are conflicting data in the literature regarding whether FENO is associated with SDB;29–32 however, our results suggest that it does not have utility in prediction of SDB; further work is required to clarify this.
We found that the STOP and STOP-BANG Questionnaires have utility in the prediction of SDB in the sleep clinic population, and that STOP-BANG was superior, with higher overall predictive accuracy. The STOP and STOP-BANG Questionnaires were developed and validated in a surgical population using in-laboratory polysomnography23 and have subsequently been studied in a cardiovascular disease population.25 Our results are in agreement with these two earlier studies as regards the increased predictive value of STOP-BANG over STOP. In contrast to these earlier studies, however, we found sensitivities to be higher and specificities to be lower for both cut-points of AHI. We suggest that of the two AHI cut-points, ≥15 events/h is the more important, being diagnostic of at least moderate SDB and also an indication for CPAP treatment. At this cut-point, STOP and STOP-BANG performed with high sensitivities and negative predictive values (STOP-BANG being superior to STOP), indicating that these questionnaires are more useful in excluding significant SDB. This is further corroborated by the negative likelihood ratios of <0.2 obtained for STOP and STOP-BANG that also indicate that these questionnaires are most useful in ruling out SDB. STOP-BANG may be of value in the primary care setting, perhaps if combined with type IV portable monitoring sleep studies, to determine requirement for sleep clinic review and more detailed polygraphy.
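To illustrate why a negative likelihood ratio below 0.2 supports a rule-out role, the conversion from pre-test to post-test probability can be sketched in Python. The pre-test probability is taken as the proportion of this cohort with AHI ≥15 (56/129), and 0.2 is used as the upper bound on the negative likelihood ratio quoted above; applying these figures to an individual patient is a simplifying assumption, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Cohort proportion with AHI >= 15 and the upper bound of the negative LR.
pre_test = 56 / 129
print(round(post_test_probability(pre_test, 0.2), 3))
```

Under these assumptions, a negative questionnaire result would lower the probability of an AHI ≥15 from about 43% to roughly 13% or less.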
At the AHI cut-point of ≥15, STOP-BANG had sensitivity and NPV of 100%, and since this is the standard cut-point conventionally used to determine need for CPAP,28 we suggest that, of the tools currently available, the STOP-BANG Questionnaire is preferred for prediction of SDB in the sleep clinic setting. STOP-BANG, perhaps with modifications, merits further evaluation for the prediction of SDB in the sleep clinic population and, more importantly, its utility in prediction of clinical outcomes including treatment success should be assessed.
The original STOP-BANG Questionnaire uses a cut-point of ≥3 to predict SDB.23 However, in our study, we show that different cut-points can be selected depending on the preference to rule-in or rule-out SDB. A score of ≥3 had the highest overall accuracy and a sensitivity of 0.93 for the AHI cut-point of ≥5, whereas a score of ≥6 had the highest overall accuracy and a specificity of 0.78 for the AHI cut-point of ≥15. Two other studies have examined the usefulness of different cut-points for the STOP-BANG score.33,34 In the obese, a score of ≥3 was associated with a sensitivity of 0.90 for predicting an AHI >5, while a score of ≥6 had a specificity of 0.88 for predicting an AHI >15, and similar results have been obtained in the morbidly obese33 and in a surgical population.34 Thus, in the sleep clinic setting where the ultimate goal is to identify patients requiring CPAP, a higher cut-point for STOP-BANG may be preferred, whereas in a primary care setting where the priority is not to miss disease, a lower cut-point may be chosen.
The STOP-BANG Questionnaire is, however, still an imperfect tool for prediction of results on home polygraphy. Accordingly, the secondary objective of our study was to identify variables for inclusion in a locally developed composite score for future validation in the sleep clinic and potentially wider population. Univariate analysis showed several significant, expected associations for both cut-points of AHI. Using multivariate analysis, neck circumference ≥17 inch and the presence of witnessed apnoeas were independent predictors of SDB. This is not a novel finding, but it does support the robustness of our data. Particularly when SDB was defined by an AHI cut-point of ≥5, the regression model derived indicated a high probability of SDB of 0.94 if both factors were present. The STOP-BANG Questionnaire, of course, includes both these variables, and it is possible that adjustment of the inclusion variables, or their weighting, might improve its performance. In future work, we aim to validate a simple composite score based on these two variables in a modification of STOP-BANG, to determine utility for predicting sleep study data and outcomes with treatment.
Ultimately, a predictive tool that can be utilised in primary care is the goal. Our results indicate low specificity of STOP-BANG, and therefore in its current form, if used in primary care to identify patients requiring referral for further assessment, it is likely to result in a significant percentage of patients being referred unnecessarily (false positives). It is hoped that a modified STOP-BANG with improved specificity, while not compromising sensitivity, may be developed that can be used safely in primary care for identification of patients requiring referral to sleep services. Of utmost importance too is the prediction of treatment outcome. Non-adherence to CPAP treatment is reported in between 46% and 83% of patients.35,36 Prediction of poor adherence by STOP-BANG or other similar tools would allow greater attention to interventions to improve adherence in patients more likely to default from treatment. The authors are not aware of any studies investigating this question and future research should explore this important issue.
A possible limitation of our study was that SDB was characterised using home unattended limited sleep studies rather than in-hospital attended full polysomnography. The latter is considered the gold standard for diagnosis of SDB but is more expensive, less easily accessed and potentially unrepresentative, because sleep takes place in an unfamiliar environment. Home unattended and in-hospital attended sleep studies have previously been shown to produce similar results.37 Accordingly, home testing with portable monitors is standard clinical practice in the UK, and is now considered an acceptable method for diagnosis of OSA by the American Academy of Sleep Medicine.28 The sample size limits the conclusions that can be drawn from multivariate analysis; however, this was a secondary objective of the current study. It is possible that variables predictive of SDB on univariate analysis in this cohort would have been identified as independently predictive in multivariate models in a larger population. The results of this study allow us, and potentially others, to focus future work to validate more extensively the results obtained until now. We chose AHI cut-points of ≥5 and ≥15 to define significant SDB. This was based on the consensus guideline produced by the Adult Obstructive Sleep Apnea Task Force of the American Academy of Sleep Medicine that states that diagnosis of OSA is based on a cut-point of >15 events/h or >5 events/h with relevant symptoms, and that CPAP is indicated for treatment of moderate-to-severe OSA with ≥15 events/h.28 Although the cut-point of >30 events/h is consistent with severe OSA, we suggest that this cut-point is less relevant clinically from a diagnostic perspective or from that of determining treatment.
Finally, owing to the prospective design of our study, we cannot comment on the relative value of other tools developed for prediction of OSA such as the Sleep Apnea Clinical Score38 and American Society of Anesthesiologists Checklist.39 To compare their utility with that of the Berlin, STOP and STOP-BANG Questionnaires in the population referred to the sleep service would require a further study.
In conclusion, the Berlin Questionnaire was not useful in the prediction of SDB within our sleep clinic population. The STOP-BANG Questionnaire had superior predictive performance to the STOP Questionnaire at both cut-points of AHI (≥5 and ≥15). A STOP-BANG score of ≥3 had the highest overall accuracy and a sensitivity of 0.93 for the prediction of an AHI ≥5, while a score of ≥6 had the highest overall accuracy and a specificity of 0.78 for the prediction of an AHI ≥15. Future work will validate a composite score including neck circumference ≥17 inch and the presence of witnessed apnoeas for the prediction of SDB in the sleep clinic referral population. An optimised composite score could then be evaluated in primary care and against treatment outcomes, with our overall aim being to provide required tools for use in the expanded and consolidated sleep services that are now necessary given the current obesity and OSA epidemics.
Contributors DCC took a leading role in study protocol development, study document development, application for ethics approval, data collection, statistical analysis and paper writing. GA provided statistical support and performed part of the statistical analysis. DM, DR and HA contributed to data collection, and carried out and scored sleep studies. SB contributed to study protocol development. EL contributed to study protocol development, study document development, application for ethics approval and paper writing. CC contributed to study protocol development, data collection, statistical analysis and paper writing. All authors approved the final draft before submission. CC is responsible for the overall content as guarantor.
Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None.
Ethics approval This study was approved by the West of Scotland Regional Ethics Committee.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.