Original research
Diagnostic accuracy of heart auscultation for detecting valve disease: a systematic review
  1. Anne Herefoss Davidsen¹,
  2. Stian Andersen¹,
  3. Peder Andreas Halvorsen¹,
  4. Henrik Schirmer²,³,
  5. Eirik Reierth⁴,
  6. Hasse Melbye¹

  1. General Practice Research Unit, Department of Community Medicine, UiT The Arctic University, Tromso, Norway
  2. Department of Clinical Medicine, University of Oslo Faculty of Medicine, Lørenskog, Norway
  3. Department of Cardiology, Akershus University Hospital, Lorenskog, Norway
  4. Science and Health Library, UiT The Arctic University, Tromso, Troms, Norway

Correspondence to Ms Anne Herefoss Davidsen; anne.h.davidsen{at}uit.no

Abstract

Objective The objective of this study was to determine the diagnostic accuracy in detecting valvular heart disease (VHD) by heart auscultation, performed by medical doctors.

Design/methods A systematic literature search for diagnostic studies comparing heart auscultation to echocardiography or angiography, to evaluate VHD in adults, was performed in MEDLINE (1947–November 2021) and EMBASE (1947–November 2021). Two reviewers screened all references by title and abstract, to select studies to be included. Disagreements were resolved by consensus meetings. Reference lists of included studies were also screened. The results are presented as a narrative synthesis, and risk of bias was assessed using Quality Assessment of Diagnostic Accuracy Studies-2.

Main outcome measures Sensitivity, specificity and likelihood ratios (LRs).

Results We found 23 articles meeting the inclusion criteria. Auscultation was compared with full echocardiography in 15 of the articles; pulsed Doppler was used as reference standard in 2 articles, while aortography and ventriculography were used in 5 articles. One article used point-of-care ultrasound. The articles were published from 1967 to 2021. Sensitivity of auscultation ranged from 30% to 100%, and specificity ranged from 28% to 100%. LRs ranged from 1.35 to 26. Most of the included studies used cardiologists or internal medicine residents or specialists as auscultators, whereas two used general practitioners and two studied several different auscultators.

Conclusion Sensitivity, specificity and LRs of auscultation varied considerably across the different studies. There is a sparsity of data from general practice, where auscultation of the heart is usually one of the main methods for detecting VHD. Based on this review, the diagnostic utility of auscultation is unclear and medical doctors should not rely too much on auscultation alone. More research is needed on how auscultation, together with other clinical findings and history, can be used to distinguish patients with VHD.

PROSPERO registration number CRD42018091675.

  • Valvular heart disease
  • Adult cardiology
  • PRIMARY CARE
  • MEDICAL EDUCATION & TRAINING

Data availability statement

Data are available on reasonable request. Not applicable.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

  • Strengths of this systematic review include a broad search, explicit eligibility criteria, screening of studies in duplicate and quality assessment of the studies.

  • The review is limited by a non-comparative design of the included studies.

  • Half of the studies are at risk of selection bias, either because of a non-consecutive inclusion or because the method of inclusion was not fully described.

  • Most of the studies included patients already hospitalised or referred for echo.

Background

Since the 19th century, the stethoscope has been a low cost, accessible tool for diagnosing valvular heart disease (VHD). VHD includes stenosis and regurgitation of all the valves of the heart. A multinational survey done in Europe by the Euro Heart Survey program in 2001 concluded that VHD now is mostly degenerative in origin.1 In this hospital-based survey, aortic stenosis (AS) was the most frequent of the VHDs (43%), followed by mitral regurgitation (MR; 32%) and aortic regurgitation (AR; 13%).1 In a population-based study from the USA done in 2003 (n=11 911), the national prevalence of moderate or severe VHD was 2.5%, adjusted for age and sex.2 In this study, MR was the most prevalent VHD (1.7%), followed by AR (0.5%) and AS (0.4%). The adjusted relative risk of death related to VHD was found to be 1.36. In this review, we focus on AS, AR and MR, as other clinically significant VHDs are rare in developed countries today.3

Findings on auscultation related to VHD mainly focus on murmurs. Systolic murmurs are usually graded by using the Levine grading scale, from I to VI, where grade I is the faintest and often not heard until the examiner has listened for several cycles, and grade VI can be heard with the stethoscope not even touching the chest wall.4 5 Grades I and II are often regarded as innocent murmurs while grades III–VI are regarded as haemodynamically significant.5 6 This is, however, not always the case, as a soft systolic murmur is still likely to indicate severe valve disease if the patient is in a low output state clinically or, in the case of even asymptomatic AS, if the second sound is soft or absent.7 Diastolic murmurs are usually graded from 1 to 4, and as a general rule regarded as pathological independent of the grade of intensity.8 In addition to murmur intensity, several clinical findings can help differentiate between murmurs: site of maximum intensity, change in intensity with position, respiration or Valsalva’s manoeuvre, the length of the murmur, presence of the second heart sound, splitting of the heart sounds and more.6 7 To determine diagnostic accuracy of auscultation, different reference standards have been used over time, from findings during open heart surgery and angiography to transthoracic echocardiography.

Since VHD is likely to become a larger public health issue in the future as the ageing population grows,9 it is important to determine medical doctors’ auscultation proficiency. A study from 2018 used a case-based multiple-choice questionnaire to assess primary care physicians’ approach to murmurs and concluded that there was an ‘underuse of systematic auscultation’.10 The authors, being the Educational Committee of the European Society of Cardiology, expressed a need to ‘reinforce the importance of clinical examination’. In a study based on interviews with general practitioners (GPs) in the UK, Germany and France, the GPs in Germany and the UK reported auscultation in about two out of five patients, while GPs in France auscultated more than 90% of these elderly patients with symptoms that could be suggestive of VHD.11 Even cardiologists do not seem to give strong priority to heart auscultation. The new guideline for the management of VHD mentions the word ‘auscultation’ only once.12

The objective of this study was to present the diagnostic accuracy of heart auscultation for identifying VHD, as determined in clinical studies where medical doctors have examined adult patients.

Methods

Search strategy

A systematic literature search for diagnostic studies comparing heart auscultation with echocardiography or angiography to evaluate VHD was performed in Ovid MEDLINE and Epub Ahead of Print, In-Process, In-Data-Review & Other Non-Indexed Citations, Daily and Versions (1947 to November 2021) and Embase Classic + Embase (1947 to November 2021). The search was divided into two search strategies (A and B), to reduce noise. The first search did not include ‘echocardiography’, in order to capture studies done on auscultation before echocardiography was available, and was therefore restricted to studies mentioning ‘murmur’ (figure 1A). The second search included ‘echocardiography’ and the additional search criteria were broadened (figure 1B). The controlled vocabulary of MeSH-terms (MEDLINE) and EMTREE-index (EMBASE) was used when applicable, in addition to free-text word searches in abstract, title and keywords. The search was first performed on 14 February 2018 and updated on 25 November 2021. Full search details are provided in figure 1.

Figure 1

Complete search strategy.

Eligibility criteria

We included diagnostic studies (randomised controlled trials, prospective observational studies, case–control studies, cohort studies and cross-sectional studies) where medical doctors performed a physical examination including auscultation of the heart on adult patients in any clinical discipline (primary and secondary care) of any healthcare centre or hospital around the world. Only published papers in English or Scandinavian languages were included. Case studies, studies on children, evaluation of medications, studies on mechanical valves, murmurs caused by heart conditions other than valve disease (such as ventricular septal defects) and studies examining differences between handheld and standard echocardiography were excluded.

Two reviewers independently went through search A (SA and AHD) and B (HM and AHD) assessing title and abstract for eligibility. The same reviewers went through the full-text articles and decided which articles to include in the review (figure 2). Disagreements between reviewers were resolved during a consensus meeting between the two reviewers. If consensus was not achieved, one more person (cardiologist (HS) or GP (SA/HM)) was included in the consensus meeting.

Figure 2

Inclusion and exclusion flow chart.

Data extraction

We used a data extraction table to summarise the results. The first author (AHD) extracted the data relating to author, country, year and study design. The outcome measures were extracted in cooperation with HM. Where possible, we used the raw data from the studies and calculated outcome measures directly. If raw numbers were not given, we used the outcome measures calculated by the authors of the original study. Primary outcomes were information concerning diagnostic accuracy (sensitivity, specificity and likelihood ratios (LRs)).
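
The calculation of outcome measures from raw data can be sketched as follows. This is a minimal illustration using the standard definitions of sensitivity, specificity and likelihood ratios, not the authors' own worksheet; the class and function names are hypothetical, and the example counts are invented for illustration and do not come from any included study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Auscultation cross-tabulated against the reference standard."""
    tp: int  # murmur heard, VHD confirmed
    fp: int  # murmur heard, no VHD
    fn: int  # no murmur heard, VHD confirmed
    tn: int  # no murmur heard, no VHD


def accuracy(table: TwoByTwo) -> dict:
    """Return sensitivity, specificity and both likelihood ratios."""
    sensitivity = table.tp / (table.tp + table.fn)
    specificity = table.tn / (table.tn + table.fp)
    # With no false positives the positive LR is infinite.
    lr_positive = math.inf if table.fp == 0 else sensitivity / (1 - specificity)
    # With no true negatives the negative LR is infinite.
    lr_negative = math.inf if table.tn == 0 else (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts, chosen only to show the arithmetic.
result = accuracy(TwoByTwo(tp=40, fp=30, fn=10, tn=120))
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A study with no false positives yields an infinite positive LR, which is why such studies have to be set aside when an LR range is summarised.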

We defined ‘significant VHD’ as moderate or severe AR or MR, or mild to severe AS. We used these cut-offs in the calculations of the diagnostic accuracy of auscultation, whenever the data in the different studies made this possible. For most studies, we have used what the authors themselves defined as mild, moderate or severe VHD.

Quality assessment of the included studies was done by the first author (AHD) using ‘Quality Assessment of Diagnostic Accuracy Studies’ (QUADAS-2),13 the currently recommended tool for use in systematic reviews to evaluate the risk of bias and applicability of diagnostic accuracy studies. QUADAS-2 includes questions regarding patient inclusion, index test, reference standard, blinding of the examiner, and applicability to the target population.14

Synthesis

Meta-analysis was not considered appropriate for this review because of the wide variability of studies with respect to research design and study population. The results are presented as a narrative synthesis grouped by type of VHD (AS, AR, MR or ‘any VHD’). Some studies provided data on several VHDs; consequently, they may appear in more than one of these groups. We used the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) checklist when writing our report.15

Patient and public involvement

This systematic review did not involve patients or the public in the design, conduct, reporting or dissemination plans.

Results

The two searches resulted in 923 (A) and 1327 (B) articles, respectively. After screening the title and abstracts, 82 (A) and 66 (B) articles were selected for full-text reading. Among the 148 full-text articles, 39 were duplicates. Of the 109 articles read in full text, 22 met the inclusion criteria. Of those, 13 appeared in both searches, 4 were exclusively from search A and five from search B. Reference lists of the included studies were also screened, which resulted in one more eligible study (figure 2). Thus, a total of 23 articles (from the search and the reference lists together) met the inclusion criteria. The oldest of these was published in 1967 and the most recent in 2021. The majority of the studies (n=13) were performed in the USA. The rest were from the UK (three), Switzerland (two), Denmark (two), Canada, Argentina and Lithuania. The study characteristics are presented in table 1.
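
The screening counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative.

```python
# Counts as reported in the Results text for searches A and B.
selected_for_full_text = {"A": 82, "B": 66}
duplicates = 39
included_by_source = {"both searches": 13, "search A only": 4, "search B only": 5}
from_reference_lists = 1

# Number of included studies per country, as listed in the text.
studies_by_country = {
    "USA": 13, "UK": 3, "Switzerland": 2, "Denmark": 2,
    "Canada": 1, "Argentina": 1, "Lithuania": 1,
}

total_selected = sum(selected_for_full_text.values())
read_in_full = total_selected - duplicates
included_from_search = sum(included_by_source.values())
total_included = included_from_search + from_reference_lists

print(total_selected, read_in_full, included_from_search, total_included)
# The country breakdown should account for every included study.
print(sum(studies_by_country.values()) == total_included)
```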

Table 1

Presentation of included studies

Among the 23 studies, 4 used angiography with ventriculography as reference standard. Those studies were from 1985 to 1988, before echocardiography was firmly established in routine medical practice. One study also performed angiography but used findings during open heart surgery as the reference standard to compare with auscultation. Two of the studies used pulsed Doppler echocardiography (PDE) (published in 1988–1989). PDE measures local blood flow velocities at a specific region, such as the heart valves, without visualisation of the heart. Among the remaining studies, 15 used echocardiography, which has been the reference standard to evaluate heart valve disease since around 1995. Those studies were from 1996 and forward using two-dimensional images to guide the positioning of Doppler measurements. One study, from 2021, used a V-scan (handheld ultrasound device with colour Doppler only), and a focused examination instead of a full echocardiographic examination.

In at least 13 of the studies, the auscultators were cardiologists. In one study, a mixed sample of ancillary personnel, medical students, residents and specialists auscultated. Two of the studies included general internists. Among the rest of the studies, two included GPs and one included emergency department (ED) specialists, and four did not specify who auscultated.

Only three studies were done on a population without symptoms or findings, where echo was done solely as a part of the study and not for a clinical reason. Among those, two studies were from a general population, and one was based on 75 patients with connective tissue disorder (a risk factor for VHD) and 68 healthy volunteers. Nine studies were done on patients referred for echo, either for evaluation of a murmur or because of symptoms. Three studies recruited patients from hospital wards, but not necessarily with known heart disease or murmur. The sample in the remaining eight studies included patients with known heart disease (unstable angina pectoris or VHD).

Risk of bias

The risk of bias in the included studies was assessed as presented in table 2. For almost half of the studies, the enrolment of patients was either not fully described or was not done in a random or consecutive way. Almost all studies blinded the auscultator to the results of the reference standard, and more than half of the studies also blinded the interpreter of the reference standard to the results of auscultation. Most of the studies performed the two tests within an appropriate time interval, often on the same day or within the same week.

Table 2

QUADAS-2 risk of bias assessment of the included studies

Any valve disease

Core results

Eight studies gave numbers for ‘any valve disease’, and five of those did not calculate sensitivity and specificity for the different VHDs separately (table 3).

Table 3

Diagnostic accuracy of auscultation in unspecified VHD

Altogether sensitivity ranged from 16% to 91%, and specificity ranged from 59% to 100%. LR ranged from 1.45 to 11 (excluding one study with infinite LR). Two studies found LR<2, and except one study with LR=11 none of the studies found LR>5.

Factors affecting sensitivity and specificity

The lowest sensitivity (44%) was observed in a population-based study with GPs as auscultators. Apart from this it is not easy to point out any single factor affecting sensitivity and specificity. Auscultation does not seem to give more information when screening patients with connective tissue disorders, which is a risk factor for valve disease.16 Surprisingly, having more information, such as history, laboratory test results, ECG and chest radiographs, did not improve the ability to differentiate between innocent and pathological murmurs,17 neither did receiving special training nor using a more advanced stethoscope.18 Lesions causing diastolic murmurs seem to be particularly hard to diagnose.19

In the study by Draper et al (2019, UK), where all patients were referred for murmur evaluation, specificity was calculated against murmur judged to be pathological (with ‘flow murmur’ and no murmur treated as normal). Attenhofer et al (2000, Switzerland) also examined patients referred for murmur evaluation, and the examiner had to state if the murmur was ‘functional’, with normal cardiac anatomy, or ‘organic’. The numbers in table 3 are sensitivity and specificity for cardiac examination for ‘significant heart disease’, meaning that also here the functional murmurs are excluded. Both Draper et al and Attenhofer et al included some patients with other cardiac conditions than VHD, such as left ventricular hypertrophy (seven patients) or ventricular septal defect (four patients). Reichlin et al (2004, Switzerland) included patients with murmur presenting to the ED, where the ED physician graded the murmur and stated whether the murmur was ‘innocent’ or indicating VHD. Innocent murmur was regarded ‘normal’ in the calculation of sensitivity and specificity in table 3.

Limitations and comments

Roldan et al (1996, USA) studied the accuracy of the physical examination in asymptomatic subjects, both healthy volunteers and patients with connective tissue disorder, but no symptoms of VHD. In this population they found a 23% prevalence of valve abnormalities and a sensitivity of 70% for the physical examination in detecting these abnormalities. However, only 2/10 subjects with abnormalities not detected by physical examination had more than mild VHD.

Reichlin et al (2004, Switzerland) included patients presenting to the ED. In this study, the physician had access to chart records and other clinical examinations such as ECG and laboratory test results. This may have biased the decision whether a systolic murmur was present. Nevertheless, the authors concluded that the initial clinical evaluation done in the ED by an experienced physician is accurate in distinguishing innocent murmurs from VHD.

Kobal et al (2005, USA) included patients referred to echocardiography for any indication. Clinically significant VHD was defined as ≥mild AR, ≥moderate MR, ≥moderate tricuspid regurgitation, AS (valve area ≤1.5 cm²) and mitral stenosis (valve area ≤2 cm²). They found a sensitivity of 50% for the cardiologist to find a significant valvular lesion, and that lesions causing systolic murmurs were more frequently diagnosed than lesions producing diastolic murmurs. The study population had an average of 3.9 findings per patient (valvular and other lesions), and multiple findings could be a clinical confounder for the cardiologists.

For the results of Iversen et al (2006, Denmark), we have used the sensitivity and specificity for ‘any murmur’ in the ‘no extra training’ group, including two types of stethoscopes. The regurgitant lesions had the lowest sensitivities (MR 19% and AR 28%). However, for the collective group of ‘any murmur’ the sensitivity was 71%. They found no significant differences in accuracy of auscultation between the groups with regards to training and type of stethoscope.

The study of Gardezi et al (2018, UK) is one of two studies including patients from a more unselected population, in this case participants in a population study (OxVALVE) which included asymptomatic inhabitants 65 years and older with no previous history of VHD. This was also the only study among the ‘unspecified VHD’ studies where the auscultators were GPs. As many as 20 of the 36 patients with a significant VHD had no murmur. However, the negative predictive value of auscultation to exclude significant VHD was ‘reasonable’ (88%).
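
The sensitivity implied by these counts can be worked out directly, assuming the 36 patients with significant VHD form the full denominator and the 20 without a murmur are the false negatives:

```latex
\text{sensitivity} = \frac{TP}{TP + FN} = \frac{36 - 20}{36} = \frac{16}{36} \approx 0.44
```

This corresponds to the 44% sensitivity noted above for the population-based study with GPs as auscultators.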

In the study by Draper et al (2019, UK), the authors conclude that systematic auscultation or a point-of-care ultrasound (POCUS) scan could reduce the need for standard echocardiography in asymptomatic patients with a murmur.

Aortic stenosis

Core results

Altogether 13 studies included AS. Among these, we were able to calculate sensitivity and specificity of auscultation specifically for AS in eight studies, presented in table 4. The remaining five studies provided numbers for ‘any VHD’ only.

Table 4

Diagnostic accuracy of auscultation in aortic stenosis

Sensitivity ranged from 72% to 97%. Specificity ranged from 28% to 97%. LR ranged from 1.35 to 26, and mean LR was 6.2. One study found an LR higher than 10 (Mehta et al, 2014) and one study found an LR higher than 5 (Steeds et al)—the rest found LRs lower than 5. Two studies found LRs lower than 2.

Factors affecting sensitivity and specificity

Among the four studies where the participants were patients referred for echocardiography for any reason, three used cardiologists as auscultators. There was, however, no difference in sensitivity and specificity between the cardiologists and the one study using general internal medicine house staff. Parras et al (2015, Argentina) studied whether a grade II murmur (classified by a cardiologist) could predict a moderate or severe AS (in patients with known AS), and found a sensitivity of 98%, but a specificity of only 28%. Using murmur grade III or more gave a sensitivity of 95.2% and a specificity of 63.3%.
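
The trade-off between the two murmur-grade cut-offs can be expressed as a positive LR. The sketch below applies the standard formula, sensitivity / (1 − specificity), to the figures quoted in this paragraph; the function name is illustrative, and the computed values are derived here rather than taken from table 4.

```python
def positive_lr(sensitivity: float, specificity: float) -> float:
    """Positive likelihood ratio: sensitivity / (1 - specificity)."""
    if specificity >= 1.0:
        # No false positives: the ratio is unbounded.
        return float("inf")
    return sensitivity / (1.0 - specificity)


# Sensitivity and specificity as quoted in the text for Parras et al.
cutoffs = {
    "grade II cut-off": (0.98, 0.28),
    "grade III or more": (0.952, 0.633),
}

for label, (sens, spec) in cutoffs.items():
    print(f"{label}: LR+ = {positive_lr(sens, spec):.2f}")
# grade II cut-off: LR+ = 1.36
# grade III or more: LR+ = 2.59
```

Raising the cut-off costs little sensitivity but roughly doubles the positive LR, because the gain in specificity shrinks the denominator.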

The included studies have used different cut-off values for defining mild, moderate and severe AS (figure 3). Only McGee (2010) and Chorba et al (2021) used peak jet velocity. In our data from Chorba, we used their cut-off at 3 m/s as the limit for mild AS. For the data from McGee, we were able to use the cut-off at 2.5 m/s, today’s lower limit for mild AS. Valve area was used by several, but with different cut-offs for mild AS: Etchells et al (1998) used <1.2 cm², Kobal et al and Parras et al (2015) used <1.5 cm², and Attenhofer et al (2000) used <1.9 cm². Attenhofer also included those with a mean pressure gradient of >10 mm Hg. The measuring of peak gradient has been removed from the latest definition of AS, but some of the older studies included in this review used this measurement: Iversen et al (2008) had a cut-off at >50 mm Hg for moderate AS, while Etchells et al (1998) used 25 mm Hg as cut-off. Mehta et al (2014) did not specify how they defined moderate/severe AS, and neither did Steeds et al (2021).

Figure 3

The different cut-off values for diagnosing aortic stenosis (AS) in the included studies. ESC, European Society of Cardiology.

Limitations and comments

Steeds et al (2021, UK) studied the feasibility of AS screening in patients >65 years in a primary care setting (patients presenting for influenza vaccination). Participating GPs auscultated for murmur and described the murmur as ‘AS specific’ or ‘not AS specific’. The GP then did a targeted 2D echocardiography using a V-scan ultrasound device (GE Healthcare, Wauwatosa, Wisconsin, USA). Depending on the total evaluation, the GP decided whether to refer for a regular echocardiography, review the patient in their own practice or take no action. Only those referred by the GP had an ordinary echocardiography done; the rest had ‘no final diagnosis’. Among those with a murmur (30 patients), only 15 were referred: 13 of those had an abnormal V-scan and 2 had a normal V-scan. Among those without a murmur, 34 had an abnormal V-scan, but only 5 of those were referred for echocardiography, one of whom had an AS. The finding of an AS-specific murmur had the highest probability of an abnormal V-scan (n=5, 83.3%).

Aortic regurgitation

Core results

In total 13 studies included AR and among those, 7 gave specific numbers for AR (table 5). The first three included studies (Cohn et al (1967), Meyers et al (1985) and Grayburn et al (1986)) all studied AR alone. Sensitivity ranged from 81% to 100%; however, the latter was from a small study where only three patients had a moderate to severe AR (Meyers et al). Specificity in the three studies only including AR ranged from 78% to 82% (ie, around 20% false positives). Among all the studies, sensitivity ranged from 34% to 100%, and specificity ranged from 78% to 100%. LR ranged from 3.86 to 5.90 with a mean LR of 4.69 (excluding one study with infinite LR). Two studies found an LR higher than 5 (Meyers et al, 1985 and Rahko et al, 1989)—the rest found LRs <5, yet none of them <2.

Table 5

Diagnostic accuracy of auscultation in aortic regurgitation

Factors affecting sensitivity and specificity

Studies including overweight patients, and studies including mild AR, both demonstrated low sensitivities. Kinney (1988) and Rahko (1989), both from USA, studied regurgitant lesions (AR and MR) and found that regurgitant murmurs were frequently not present in regurgitant lesions. Kinney found several factors that were associated with false-negative auscultation: obesity, the absence of cardiomegaly and the presence of chronic obstructive pulmonary disease and coronary artery disease, among others. The sensitivity for detecting moderate or severe AR by auscultation was found to be 64%, in contrast to a sensitivity of 37% for all grades of AR.

Another study reporting low sensitivity only included patients who were overweight, with a body mass index ≥30 kg/m² (Kalinauskiene, 2019).

Mitral regurgitation

Core results

Twelve studies included auscultation for MR. Of these, 10 provided data on sensitivity and specificity for MR and 3 of those studied MR alone (table 6). Sensitivity ranged from 30% to 100%. Specificity ranged from 50% to 97%. LR ranged from 1.48 to 20 and mean LR was 4.5. Except the one study with LR 20 none of the included studies found LR>5. Two of the studies found LR<2.

Table 6

Diagnostic accuracy of auscultation in mitral regurgitation

Factors affecting sensitivity and specificity

In two of the studies, the sensitivity was particularly low: Kinney (1988) and Codispoti et al (2005). Both were retrospective chart reviews. Codispoti et al studied MR alone and around 20% of the auscultators were medical students or ancillary staff; the rest were residents, fellows or board-certified staff physicians (around 50%).

Discussion

We found that the diagnostic accuracy of auscultation to evaluate VHD varied considerably among the included studies. There was a sparsity of studies from general practice. In general, AS achieved sensitivities that were higher than the other VHDs. Retrospective chart reviews typically noted particularly low sensitivities, as did one study including only overweight participants. The GPs achieved lower sensitivities than cardiologists, but not for detection of AS.

Strengths and limitations

Only two databases were searched; however, these are the largest databases of medical literature, and it is unlikely that this caused a bias in our results. Only studies in English or Scandinavian languages were included, and we might have missed studies published in other languages. However, we have included studies from seven different countries (predominantly from North America or Europe).

We defined ‘significant VHD’ as moderate or severe AR or MR, or mild to severe AS. A possible weakness then is that milder degrees of VHD (present, but not clinically significant) would be defined as negative. For most studies, we have used what the authors themselves defined as mild, moderate or severe VHD, and the cut-offs for these definitions will vary depending on the guidelines at the time of publication. This is also a limitation in the ability to compare the different results.

When the Etchells study was published in October 1998, the area range for moderate AS was 0.8–1.2 cm², but in an international guideline published in November 1998 this was changed to 1.0–1.5 cm².20 No international body recommends >1.9 cm² as a cut-point for mild AS, as used by Attenhofer et al. Similarly, the discordance between a peak gradient of >50 mm Hg21 and >25 mm Hg22 as a cut-point for moderate AS is noteworthy. As pointed out earlier, these differences make the results of the different studies impossible to compare, and this is a major limitation to the interpretation of the studies.

Risk of bias

Only half of the studies practised a consecutive inclusion strategy, which is considered as the inclusion strategy with least risk of bias. In addition, the patient samples were often selected from patients where a murmur had been heard or in patients referred for echocardiography or with a known heart condition. This probably increases sensitivity for auscultation as the prevalence of more advanced disease is higher in such patient samples. Only two studies were done on an unselected population (Gardezi et al and Steeds et al).23 24

In almost all the included studies the auscultators had no information about the results of the echocardiography or ventriculography. Some of the studies specified that the auscultators also had no information about history and other examinations such as ECG and blood pressure. In six studies this was not the case. This is not necessarily a drawback, since this often is the case in real life—the medical doctor usually has access to more than just the result of the auscultation. However, in several of the studies, it was unclear if the auscultator had information about history and other examinations, and this makes interpretation of the results difficult.

Applicability

As the stethoscope turned 200 years old in 2016,25 it has been a subject of discussion whether it has become an obsolete device that medical doctors hang on to for sentimental reasons, given the rapid development of pocket size ultrasound devices. The editor-in-chief of the international journal Heart, Catherine Otto, says ‘it’s time to turn to more effective technology – ultrasound, not acoustic sound’ in an editorial from 2018.26 She commented on the study by Gardezi et al, the study in this review with the lowest diagnostic accuracy. She argues that one should start to teach POCUS to healthcare providers, instead of focusing on training in the nuances of heart sounds heard with the stethoscope. However, ultrasound tends to be more operator-dependent than the stethoscope,27 and is still definitely more expensive, thus less accessible for students, new doctors or in low-income countries. Many examinations or tests we do in our daily practice as medical doctors need to be put in a context together with the history, symptoms and other clinical findings. For the GP, there are two important questions when auscultating a patient: What should the GP do when finding a murmur, and when should VHD be suspected? The study answering these questions is still pending. In the meantime, several of the included studies in this review have found that auscultation together with certain other findings (such as intensity of murmur, duration of murmur, presence of the second heart sound, delayed carotid upstroke or radiation of murmur) increases the probability of VHD.17 22 28 29 It is also important to remember that some of the included studies, including the study by Gardezi et al, were done in an asymptomatic population (or the auscultator did not have access to information on symptoms). Most GPs will start with the history of the patients, and the presence of symptoms will lead the way to further diagnostic testing, including auscultation. A study from the UK published in 2014 showed that including atrial fibrillation or any cardiac symptom, in addition to murmur, would greatly increase the number of VHDs detected.30

Participants were selected from different populations in the different studies. The prevalence of VHD varied between 5% and 100%. Many of the studies reporting high sensitivity for auscultation emerge from patient populations with a high prevalence of VHDs. When screening a general population, or asymptomatic patients, the sensitivity declines.

In a paper published in BMJ Evidence-Based Medicine in 2013, with the title ‘Principles for high-quality, high-value testing’,31 the authors suggest a key number to decide if the sensitivity and specificity of a test are ‘good enough’: ‘…For a test to be useful, sensitivity+specificity should be at least 1.5 (halfway between 1, which is useless, and 2, which is perfect)…’. Auscultation falls below 1.5 in several of the studies in this review, but not all. In the study by Iversen from 2008 with almost 3000 patients, the sum of sensitivity and specificity is 1.62,21 and the results in the study by Draper (2019) sum to 1.91.32 Turning to the LRs, only a few of the included studies found LRs>5: two studies for AS, one for AR, one for MR and one study for any VHD. LRs of this size are accepted to generate ‘moderate shifts in pretest to posttest probability’ according to the rough guide in a paper published in JAMA in 1994 and the paper ‘Approach to the patient with a murmur’ published in 2022.33 34
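
To make these thresholds concrete, the following minimal Python sketch shows how the sum of sensitivity and specificity, the LRs and the resulting posttest probability relate to each other. It uses the standard definitions (positive LR = sensitivity/(1−specificity), posttest odds = pretest odds × LR). The function names and the input values (sensitivity 0.80, specificity 0.90, pretest probability 0.10) are hypothetical and chosen for illustration only; they are not taken from any of the included studies.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test, using the standard definitions."""
    # LR+ is undefined when specificity is exactly 1; report it as infinity.
    lr_pos = float("inf") if specificity == 1 else sensitivity / (1 - specificity)
    # LR- is undefined when specificity is exactly 0; report it as infinity.
    lr_neg = float("inf") if specificity == 0 else (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


def meets_sum_rule(sensitivity, specificity, threshold=1.5):
    """Check the 'sensitivity + specificity should be at least 1.5' rule of thumb."""
    return sensitivity + specificity >= threshold


# Hypothetical illustration values (assumptions, not results from this review).
sens, spec, pretest = 0.80, 0.90, 0.10

lr_pos, lr_neg = likelihood_ratios(sens, spec)
print("Meets sum rule:", meets_sum_rule(sens, spec))
print("LR+:", round(lr_pos, 2), "LR-:", round(lr_neg, 2))
print("Posttest probability after a positive finding:",
      round(posttest_probability(pretest, lr_pos), 2))
```

With these assumed inputs the sum is 1.70, the positive LR is about 8, and a positive finding raises the probability of disease from 10% to roughly 47%. The same calculation with a much lower pretest probability, as in a screening population, gives a far smaller posttest probability for the same LR.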

In many countries the GP is the ‘gatekeeper’ and the door-opener for further examinations in secondary healthcare. To restrict healthcare expenses, the ability of a GP to know when to refer a patient for echocardiography is important. A systematic review from 2018 examining overtesting and undertesting in primary care concluded that ‘echocardiograms are ordered particularly poorly’,35 with a consistent underuse especially in connection with heart failure and atrial fibrillation, and a consistent overuse for perioperative assessment and for murmurs when there are no other symptoms or signs of VHD. A study from Norway published in 2013 concluded that echocardiographic screening of a general population (mean age around 60 years) did not affect mortality or the risk of myocardial infarction and stroke.36 Some countries offer so-called ‘open access echocardiography’ as a diagnostic service, where GPs can refer patients with suspected heart failure or VHD. The echocardiograms are taken by ultrasound technicians. This has proven to substantially reduce referrals to the cardiology department,37 and a study from the Netherlands concluded that GPs treated more patients by themselves following the results from the open access echocardiography compared with those referred to a cardiologist for echocardiography.37 The study comparing auscultation habits in Germany, France and the UK concluded that rates of detection of valve disease among GPs need to be improved, and suggested a better availability of ‘focused echocardiography’.11

A study from the UK published in 2014 examined ‘appropriate use criteria’39 40 for transthoracic echocardiography in all Welsh hospitals (n=14).38 The authors found that only 6.5% of the echocardiography requests came from a GP, and they concluded that 87% of the requests from the GPs were ‘appropriate’.38

Conclusion

Sensitivity and specificity of auscultation varied considerably across the different studies. There is a sparsity of data from general practice, where auscultation of the heart is usually one of the main methods for detecting VHD.

Based on this systematic review, it is difficult to determine the diagnostic utility of auscultation as a clinical examination for VHDs. In general, medical doctors should not rely too much on auscultation alone. More research is needed on how auscultation, together with other clinical findings and history, can be used to identify patients with VHD. Future studies on the usefulness of auscultation should focus on general practice. These studies should include a broader range of examinations done in primary care, to clarify the role of auscultation in the appropriate selection of patients referred for echocardiography.

Data availability statement

Data are available on reasonable request. Not applicable.

Ethics statements

Patient consent for publication

Acknowledgments

We want to thank the rest of our colleagues at the General Practice Research Unit, Department of Community Medicine, UiT, The Arctic University of Norway, Tromsø, Norway, for useful feedback and good advice along the way.

References

Footnotes

  • Contributors AHD, SA and HM conceived the idea. AHD, SA, HS, ER and HM designed the search, and ER conducted the search. AHD, SA and HM screened search records. AHD and HM extracted data and assessed risk of bias of eligible studies. AHD conducted the analysis. AHD, HM and PH interpreted the data. AHD wrote the first draft of the manuscript. AHD, SA, PH, HS, ER and HM critically revised the manuscript. AHD, SA, PH, HS, ER and HM reviewed and approved the final version. AHD is the guarantor of this manuscript.

  • Funding SA has received funding from The Norwegian Medical Association General Practice Research Fund (grant number not applicable).

  • Competing interests AHD, SA, HS and HM take part in a patent application for an algorithm detecting heart disease from heart-sound recordings (United Kingdom Patent Application No. 2212073.7). HS: Grant from NOVARTIS, joint project with institution. HS: Consulting fees from NOVARTIS and AstraZeneca. HS: Lecture fee from Amgen and Pfizer. HS: Leadership/board member of Norwegian Council on Cardiovascular diseases and Norwegian Council on Dementia (both unpaid).

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.