Objectives Dyspnoea and chest pain are symptoms shared by multiple pathologies, ranging from benign to life-threatening diseases. A Gut Feelings Questionnaire (GFQ) has been validated to measure general practitioners’ (GPs’) sense of alarm or sense of reassurance. The aim of the study was to estimate the diagnostic test accuracy of GPs’ sense of alarm when confronted with dyspnoea and chest pain.
Design and settings Prospective observational study in general practice.
Participants Patients aged between 18 and 80 years, consulting their GP for dyspnoea and/or chest pain, were considered for enrolment. The GPs had to complete the GFQ immediately after the consultation.
Primary outcome measures Life-threatening and non-life-threatening diseases have previously been defined according to the pathologies or symptoms in the International Classification of Primary Care (ICPC)-2 classification. The index test was the sense of alarm and the reference standard was the final diagnosis at 4 weeks.
Results 25 GPs filled in 235 GFQ questionnaires. The positive likelihood ratio for the sense of alarm was 2.12 (95% CI 1.49 to 2.82), the negative likelihood ratio was 0.55 (95% CI 0.37 to 0.77).
Conclusions When the physician experienced a sense of alarm while a patient consulted him/her for dyspnoea and/or chest pain, the post-test odds that this patient had, in fact, a life-threatening disease were about twice as high as the pretest odds.
Trial registration number NCT02932982.
- gut feelings
- chest pain
- general practitioner
- decision making
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the first study which estimates the accuracy of general practitioners’ (GPs) sense of alarm when faced with dyspnoea and thoracic pain.
The prospective design and the use of a validated questionnaire to determine gut feelings in real primary care settings are two major assets of this study.
A main limitation was the low number of participating GPs despite several strategies for stimulating GPs’ participation.
Making the right decision when faced with dyspnoea and chest pain is an almost daily challenge for general practitioners (GPs). GPs have to reassure the patient when there is a musculoskeletal explanation or another non-serious disease but have to refer the patient to the emergency department when there is a serious cardiac or pulmonary disease. GPs consider all these diagnoses when confronted with the often non-specific symptoms of dyspnoea and chest pain.1 Dyspnoea is the reason for consultation in 0.9%–2.6% of cases in primary care,2–6 and is even the fourth most common reason for consultation among older patients.7 Chest pain accounts for 0.7%–2.7% of patient consultations in general practice.3 6 8 9 Non-life-threatening diseases, such as chest wall syndrome, were diagnosed in 24.5%–53% of cases of chest pain,8–10 whereas life-threatening conditions, including cardiac or respiratory diseases and cancer, accounted for 20% of cases.8
Many diagnostic tools have been designed to guide GPs through the diagnostic reasoning process for chest pain and dyspnoea.11 12 However, before using any prediction rule, a GP should have some suspicion of a specific diagnosis in order to select the appropriate rule.13 Medical decision making and problem solving involve many sources of knowledge such as medical knowledge, knowledge of patients and expertise. Gut feelings, based on these three knowledge sources, contribute to the diagnostic process.14 15 Earlier research showed how GPs’ suspicion of pulmonary embolism (PE) arises. The most important determinants were the absence of indicative clinical signs for diagnoses other than PE, a sudden change in the condition of the patient, an earlier failure to diagnose PE and a gut feeling that something was seriously wrong.16 The sense of alarm means that ‘a GP worries about a patient’s health status, even though he/she has found no specific indications yet’.17 Its counterpart is the sense of reassurance, meaning that a GP feels secure about the further management and course of a patient’s problem, even though he/she may not be certain about the diagnosis: everything fits in. Gut feelings act as a compass when faced with uncertainty, and feature alongside medical decision making and problem solving as a third track in diagnostic reasoning.18 However, the accuracy of GPs’ sense of alarm when confronted with dyspnoea and/or chest pain is not known.
A Dutch Gut Feelings Questionnaire (GFQ) is available, created by using the results of a consensus process to provide descriptions of gut feelings. The GFQ was validated by a construct validation procedure using case vignettes.19 The internal consistency of the GFQ proved to be high (Cronbach’s alpha=0.91); the kappa with quadratic weighting was moderate to good (0.62, 95% CI 0.55 to 0.69). A principal component analysis confirmed one factor, with the sense of reassurance and the sense of alarm items as two opposites, explaining 70.2% of total variance.19 Linguistic validation procedures were performed to obtain an English version of the questionnaire and, subsequently, a French version.20 Finally, after a two-step study and several minor adaptations, the definitive version of the GFQ was shown to be a feasible and practical tool for prospective observational studies in daily practice21 (the GFQ is available in online supplementary appendix 1). The GFQ enabled us to calculate the diagnostic test accuracy of the sense of alarm when applied to dyspnoea and/or chest pain.
We conducted a prospective, observational study using the French version of the GFQ. The protocol of the study has been published.22
Patients aged between 18 and 80 years, consulting their GPs for dyspnoea and/or chest pain, were considered for enrolment in the study. Consecutive patients consulting their GP because of dyspnoea and/or chest pain were enrolled over a period of 14 months. Dyspnoea was defined as difficult or laboured breathing (Medical Subject Headings (MeSH) definition). Chest pain was defined as pressure, burning or numbness in the chest (MeSH definition).
Non-inclusion criteria were: patients in palliative care, and patients known to have coronary heart disease. Patients known to have pulmonary diseases were not excluded because of the possible coexistence of a life-threatening event (eg, embolism, secondary infection) along with other pulmonary pathologies such as COPD.
GPs working as traineeship supervisors in the General Practice Department of Brest University were informed of the study by personal email. Personal phone calls and presentations during academic meetings promoted the study. The GPs taking part received a videotaped presentation explaining the objectives and the study design. The GPs had to fill in the GFQ straight after each consultation where dyspnoea and/or thoracic pain appeared to be the reason for the consultation. We emphasised the importance of including consecutive patients in order to minimise selection bias.
The sample size was estimated before the start of the inclusion. The prevalence of consultations in primary care in France is 1.77% for dyspnoea and 1.51% for thoracic pain.3 In a Dutch study, which aimed at estimating the incidence of gut feelings in general practice consultations, the sense of alarm was present in 8.2% of the consultations; the incidence was 7% for the respiratory International Classification of Primary Care (ICPC) code chapter and 15% for the circulatory ICPC code chapter.23 We defined our initial population as 40 volunteer GPs, each following up, on average, 800 patients in their practices. We included a physician level and a patient level in our calculation. The number of cases required for a power of 80% and a type 1 error rate of 5% was 211 for thoracic pain with 34 GPs and 123 for dyspnoea with 31 GPs. Taking into account Lasagna’s law, we estimated 7 cases of chest pain per GP and 4 cases of dyspnoea per GP, with a population size of 40 GPs.24 The participants received €220 as an incentive after completing the 11 questionnaires.
Personal emails were sent to the participating GPs to inform them of how many cases of dyspnoea and/or chest pain were already included and how many still remained to be included. GPs who did not include any patients received a personal phone call.
Patient and public involvement
It was not appropriate or possible to involve patients or the public in the design, or conduct, or reporting, or dissemination of our research.
The index test was the sense of alarm felt by the GPs and determined by the GFQ, consisting of 11 items. Items 2–7 in the questionnaire are derived from the consensus statements on the gut feelings concept, which describe the sense of reassurance (item 2) and the sense of alarm (items 3–7). The items are rated using a 5-point Likert scale, ranging from completely disagree to completely agree. Items 8–10 relate to the diagnostic work-up. The 1st and 11th items assess whether the patient’s case elicits a gut feeling (a sense of reassurance or a sense of alarm) or whether it is not applicable. A sense of alarm is considered present when the answer to item 1 or 11 indicates a sense of alarm, or when it indicates that it is not applicable and at least one of the scores of items 3–6 is higher than 3/5. A sense of alarm is considered not present when the answer to item 1 or 11 indicates a sense of reassurance, or when it indicates that it is not applicable and none of the scores of items 3–6 is higher than 3/5. The final diagnoses were not available to the readers of the GFQ at this stage.
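The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure actually used by the research team: the function name, the string labels for the overall gut-feeling answer and the decision to pass a single overall answer (rather than items 1 and 11 separately) are all assumptions, and the 'not applicable' branch is read here as 'alarm present if at least one of items 3–6 scores above 3'.

```python
from typing import Sequence

# Hypothetical labels for the overall gut-feeling answer (item 1 or 11).
ALARM = "alarm"
REASSURANCE = "reassurance"
NOT_APPLICABLE = "not_applicable"


def sense_of_alarm_present(overall: str, items_3_to_6: Sequence[int]) -> bool:
    """Return True when the questionnaire is scored as a sense of alarm.

    overall       -- overall answer taken from item 1 or 11 (assumed to be
                     already reduced to a single label)
    items_3_to_6  -- Likert scores (1 to 5) for items 3, 4, 5 and 6
    """
    if len(items_3_to_6) != 4:
        raise ValueError("expected four scores, for items 3 to 6")
    if any(score < 1 or score > 5 for score in items_3_to_6):
        raise ValueError("Likert scores must lie between 1 and 5")

    if overall == ALARM:
        return True
    if overall == REASSURANCE:
        return False
    if overall == NOT_APPLICABLE:
        # Assumed reading: fall back on the alarm items themselves.
        return any(score > 3 for score in items_3_to_6)
    raise ValueError(f"unknown overall answer: {overall!r}")


# Illustrative questionnaires (invented scores, not study data).
print(sense_of_alarm_present(ALARM, [2, 2, 3, 1]))
print(sense_of_alarm_present(NOT_APPLICABLE, [2, 4, 3, 1]))
print(sense_of_alarm_present(NOT_APPLICABLE, [3, 3, 2, 1]))
```

How a disagreement between items 1 and 11 would be resolved is not stated in the text, which is why the sketch takes one overall answer as input.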
The reference standard was the final diagnosis collected 4 weeks later, by phone contact with the GP or by asking the GP to phone the patient if no information had been received from the hospital or specialists at that point. During a previous step in this study, pathologies or symptoms in the ICPC-2 classification linked to dyspnoea and chest pain were classified into three categories: life threatening (18 conditions), non-life-threatening diseases (11 conditions) and pathologies where the severity depended on clinical features and context (table 1). This document was formulated, following a nominal group procedure, by a group of seven academic GPs. Members of the research team, blinded to the outcomes of the GFQ, individually judged each case of the study in turn. They classified the outcomes in the three categories and did, by consensus, classify the expected outcomes in the third category (pathologies where the severity depends on clinical features and context) as either life-threatening or non-life-threatening diseases. Hospitalisation because of serious conditions was among the clinical features.
The scores of the GFQ were assessed by two independent researchers blinded to the final diagnosis. A two-way contingency table was used to evaluate the diagnostic accuracy. The rows indicate the presence or absence of a life-threatening pathology (related to dyspnoea and chest pain) and columns the presence or absence of the sense of alarm. To estimate accuracy heterogeneity between GPs, a binomial random effects model was used.25 Sensitivity, specificity, positive and negative likelihood ratios (LR+, LR−) were then calculated.
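For readers less familiar with these measures, the following Python sketch shows how sensitivity, specificity and likelihood ratios follow from a two-way contingency table. It is a crude pooled calculation and does not reproduce the binomial random effects model used in the study; the function and class names are illustrative, and the cell counts in the example are invented, not those of table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accuracy:
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float


def diagnostic_accuracy(tp: int, fn: int, fp: int, tn: int) -> Accuracy:
    """Crude accuracy measures from a 2x2 table.

    tp -- life-threatening disease, sense of alarm present
    fn -- life-threatening disease, sense of alarm absent
    fp -- no life-threatening disease, sense of alarm present
    tn -- no life-threatening disease, sense of alarm absent
    """
    if min(tp, fn, fp, tn) < 0:
        raise ValueError("cell counts must be non-negative")
    diseased = tp + fn
    non_diseased = fp + tn
    if diseased == 0 or non_diseased == 0:
        raise ValueError("both disease groups must contain cases")

    sensitivity = tp / diseased
    specificity = tn / non_diseased
    if specificity in (0.0, 1.0):
        raise ValueError("likelihood ratios are undefined for this table")

    return Accuracy(
        sensitivity=sensitivity,
        specificity=specificity,
        lr_positive=sensitivity / (1 - specificity),
        lr_negative=(1 - sensitivity) / specificity,
    )


# Hypothetical counts chosen only to illustrate the arithmetic.
example = diagnostic_accuracy(tp=30, fn=20, fp=50, tn=150)
print(example)
```

A likelihood ratio above 1 means the finding is more common in patients with the target condition than in those without it; a ratio below 1 means the opposite.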
A written informed consent form was signed and dated by GPs and patients at the beginning of the study.
Sixty-four GPs volunteered to take part in the study and 25 (15 male and 10 female) actually recruited patients. These GPs were mostly young (13 between 30 and 40 years of age, 7 between 40 and 60, and 5 above 60) and worked in rural (n=13) or urban areas (n=12). Eighteen of them filled in 11 questionnaires (figure 1). Patients were enrolled from 1 November 2016 to 31 December 2017. In total, 241 questionnaires were collected: 4 were non-analysable due to missing data; 235 determined a sense of alarm (82 cases (35%)) or a sense of reassurance (153 cases (65%)) and were included in the analysis for diagnostic accuracy; 2 determined neither a sense of alarm nor a sense of reassurance.
Dyspnoea was the reason for consultation in 88 cases and chest pain in 147 cases. After 4 weeks, 187 cases had non-life-threatening pathologies (79.6%), and 48 cases had life-threatening pathologies (20.4%). In our study, no patient died.
The most frequent final diagnoses after 4 weeks were: 69 musculoskeletal pain (29%), 18 pneumonia (7.7%), 13 asthma (5.5%), 11 heart failure (4.7%), 11 acute myocardial infarction (4.7%), 3 ischaemic heart disease with angina (1.3%), 5 atrial fibrillation (2.1%), 3 pericarditis (1.3%), 3 oesophageal disease (1.3%), 10 acute upper respiratory infection (4.3%), 16 chronic obstructive pulmonary disease (COPD) exacerbation (6.8%), 10 gastro-oesophageal reflux (4.3%), 16 anxiety disorder (6.8%), 3 PE (1.3%), 29 cases for which the pain or the dyspnoea disappeared without final diagnosis after tests (12.3%) and 15 with other diagnoses (6.6%).
The principal component analysis (PCA) confirmed unidimensionality, with one component explaining 64.13% of the total variance. The internal consistency was high (Cronbach’s alpha=0.887). The SD of the observed accuracies for each of the 25 GPs was 0.206 (crude analysis). The binomial random effects model estimated the average accuracy at 0.69 and the accuracies for each GP between 0.673 and 0.706. The SD of the estimated accuracies was 0.008 (weighted analysis) (table 2). Thus, there was low accuracy heterogeneity between GPs in our sample.
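As a reminder of what the internal consistency figure measures, Cronbach’s alpha can be computed from item-level scores as sketched below. The sketch assumes one row per questionnaire and one column per item, with any reverse-worded item already recoded; which GFQ items entered the calculation is not stated in the text, and the ratings shown are invented for illustration.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) and of the summed score (row).
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented Likert ratings: 5 questionnaires, 4 items each.
ratings = [
    [4, 5, 4, 4],
    [2, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(ratings), 3))
```

Values close to 1 indicate that the items vary together and are therefore likely to measure the same underlying construct.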
Table 3 shows the two-way contingency table, while table 4 provides the calculated sensitivity, specificity, positive and negative LRs, positive and negative predictive values, disease prevalence and accuracy (along with their 95% CIs). The LR+ was 2.12 (95% CI 1.49 to 2.82). The LR− was 0.55 (95% CI 0.37 to 0.77). The sensitivity of the sense of alarm was 0.61 (95% CI 0.48 to 0.73) and the specificity was 0.71 (95% CI 0.68 to 0.75). These results mean that, according to Bayes’ theorem, the post-test odds of a patient having a life-threatening disease were about twice as high as the pretest odds when the sense of alarm was present. Using the Fagan nomogram, which is based on Bayes’ theorem, with an initial estimated probability of 20% (table 4), experiencing a sense of alarm increased the probability of a life-threatening disease to 35%; not experiencing a sense of alarm decreased the probability of a life-threatening pathology from 20% to 12%.
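The step from likelihood ratios to post-test probabilities can be checked with a short calculation. The Python sketch below applies the odds form of Bayes’ theorem to the figures reported above (pretest probability 20%, LR+ 2.12, LR− 0.55); the function name is illustrative, and the calculation ignores the uncertainty expressed by the confidence intervals.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a post-test probability.

    post-test odds = pretest odds x likelihood ratio
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")

    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Figures reported in the text: prevalence 20%, LR+ 2.12, LR- 0.55.
with_alarm = post_test_probability(0.20, 2.12)
without_alarm = post_test_probability(0.20, 0.55)

print(f"sense of alarm present: {with_alarm:.1%}")    # about 35%
print(f"sense of alarm absent:  {without_alarm:.1%}")  # about 12%
```

With a pretest probability of 20%, the pretest odds are 0.25; multiplying by 2.12 gives odds of 0.53, or a probability of about 35%, and multiplying by 0.55 gives odds of about 0.14, or a probability of about 12%.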
We also analysed separately the accuracy of the questionnaire used to determine the sense of alarm for the diagnosis of life-threatening pathology in patients with dyspnoea (table 5) and thoracic pain (table 6). The LR+ for dyspnoea (2.862, 95% CI 1.56 to 4.791) was higher than the LR+ for chest pain (1.820, 95% CI 1.163 to 2.625), but the difference was not significant (p=0.34).
Summary of findings
This prospective study showed that the sense of alarm might support GPs’ diagnostic process when they are faced with patients who have dyspnoea and/or chest pain. When the GP had a sense of alarm, the probability that the patient had a life-threatening disease increased from 20% to 35%. When the GP did not experience a sense of alarm, the probability of the patient having a life-threatening pathology decreased from 20% to 12%.
Strengths and limitations of the study
This is the first study which estimates the accuracy of GPs’ sense of alarm when faced with dyspnoea and thoracic pain. The prospective design and the use of a validated questionnaire to determine gut feelings in real primary care settings are two major assets of this study.
A main limitation was the low number of participating GPs. Despite several personalised emails with positive motivational invitations, fewer than half of the 64 GPs who agreed to participate included patients. We focused on this crucial step of GP recruitment by using personal contacts with physicians and targeting the friendship networks,26 and we multiplied the ways of informing and presenting the study by using different media. We sent personal emails to inform GPs how many questionnaires the participants had already sent and several personal emails, with positive and encouraging messages, to stimulate those who had not yet included any patients. A financial incentive was presented as a compensation for the time spent on the questionnaires. Unfortunately, despite all efforts, only 25 GPs participated in the study, filling in 235 analysable questionnaires. We had both male and female GPs, young and more experienced GPs, in both rural and urban areas: the principal characteristics of GPs were represented in our sample of participating GPs. Despite the high heterogeneity in GPs’ demographics, the random effects analysis found low accuracy heterogeneity between the GPs in our sample: the SD of the GPs’ accuracies was 0.008. Reasons for non-inclusion were a lack of time, and that chest pain and dyspnoea were considered too infrequent as reasons for a consultation. Young GPs (30–40 years) were over-represented in our sample of working GPs. Participating in research projects is very uncommon among French GPs. Including patients and filling in questionnaires represent additional and unusual work. Another barrier is that they do not like to be observed when making decisions in the uncertain and complex world of general practice.27 The younger generation of French junior lecturers have developed research skills and regularly publish in peer-reviewed journals.28 Further research with a larger sample of GPs should be conducted.
A selection bias might have occurred in the recruitment of cases. Checking that the cases were included consecutively remained difficult, although we stressed its importance to the participating GPs. A second problem might be a selection bias during inclusion. GPs might have focused on chest pain suggestive of an acute coronary syndrome. They might have included the most salient patients rather than consultations seen as more common. We minimised this bias by stressing the definition of dyspnoea and chest pain in each contact. We also trained the GPs to include all cases meeting the inclusion criteria. Inclusion and non-inclusion criteria were printed on the back of each questionnaire.
We have focused on the diagnostic value of the sense of alarm independently of other relevant diagnostic variables such as age or comorbidities. A question about triggers of gut feelings was not included in the questionnaire. Studying the added value of these variables to the sense of alarm is a next step in our research.
Comparison with existing literature
Several studies have examined the predictive value of gut feelings in the area of serious infection in children29; sepsis in primary care30; children with respiratory tract infection in general practice31; use of gestalt with regard to PE32 and the role of intuition in the suspicion of cancer.33 34 All these studies used a binary question ‘do you have gut feelings?’29–31 without using a proper definition of the concept. Therefore, it is unknown how the participants interpreted the term ‘gut feelings’ or ‘intuition’. The GFQ was created following the consensus-based definitions of the sense of alarm and the sense of reassurance.19 We kept all items within the GFQ to guarantee a validated, standardised tool across different languages and cultures.
The topics of chest pain and dyspnoea in primary care were chosen because of the prevalence of these symptoms and the diagnostic challenge. The clinical presentation of chest pain is not very discriminating35 and an evaluation based on symptoms and signs alone may not be accurate in diagnosing or excluding coronary heart disease.36 The most used point-of-care tests were the d-dimer test and troponin because of the difficulty in making the correct diagnosis when faced with dyspnoea and chest pain.37 Several prediction rules (eg, the Marburg Heart Score) exist to help GPs make decisions when a patient is suffering from dyspnoea and/or thoracic pain. However, before using any of these scores oriented towards a particular diagnosis, the GP should have some suspicion of a specific disease, for example, triggered by a gut feeling. In the case of dyspnoea and/or thoracic pain, this initial stage was unclear and should first be carefully studied. The sense of alarm was found here to have lower sensitivity than specificity, which is contrary to other clinical rules for which sensitivity is higher than specificity. The GP participants had no instructions on their decision making, particularly not on the use of a prediction rule.
The diagnostic approach to chest pain in general practice differs from the one in emergency departments and even from what is recommended in textbooks.14 15 38 A GP knows his/her patient and may observe a change in appearance or behaviour which will be weighed alongside chest pain, for instance.36 In addition, the pretest probability of diseases in general practice is different when compared with the hospital specialist context. Thompson et al described how the three classic symptoms of meningococcal disease in children and adolescents (rash, meningism and impaired consciousness) occur very late in the disease history. Leg pain, cold hands and feet, and abnormal skin colour may be the only symptoms observed at a first consultation with a GP.39 Textbooks describe pictures of fully developed diseases. GPs often face the very early manifestations of a disease or a clinical presentation that does not correspond to a disease which is well described in textbooks and yet may still imply a serious pathology.40 In this specific situation, an LR+ of 2.12 might be of more appropriate diagnostic value for GPs than for hospital specialists. The implication for clinical decision making is clear. A sense of alarm might help a GP to avoid diagnostic errors because it activates the diagnostic process, weighing up working hypotheses that might involve a serious outcome. He/she might initiate specific management to prevent serious health problems. The mismatch between the current situation and known patient or disease patterns triggers the sense of alarm: ‘something does not fit here’.18 41 The perception of alarm compels the physician to quit his/her routine-based reasoning and switch to analytical reasoning.42 The sense of alarm acts as a feedback mechanism at a very early stage in the diagnostic process, allowing the questioning of a possibly wrong direction of reasoning.
The sense of alarm might be a useful tool for GPs when facing the non-specific signs and symptoms of diseases related to dyspnoea and chest pain. Further studies with a larger sample of GPs, including the use of prediction rules, are needed to draw conclusions about the accuracy of the sense of alarm.
Contributors MB, EF, AD, TM, PVR and ES conceived the study, participated in its design and coordination and helped to draft the manuscript. AD performed the statistical analysis. All authors read and approved the final manuscript.
Funding This work was supported by the Agence Régionale de Santé ARS Bretagne.
Disclaimer This funding source had no role in the design of this study and will not have any role during its execution, analyses, interpretation of the data, or decision to submit results.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval This study was approved by the ethical committee of the University de Bretagne Occidentale.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information.