Objective To determine the diagnostic accuracy of different methods of blood pressure (BP) measurement compared with reference standards for the diagnosis of hypertension in patients with obesity with a large arm circumference.
Design Systematic review with meta-analysis with hierarchical summary receiver operating characteristic models. Bland-Altman analyses where individual patient data were available. Methodological quality appraised using Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS2) criteria.
Data sources MEDLINE, EMBASE, Cochrane, DARE, Medion and Trip databases were searched.
Eligibility criteria Cross-sectional, randomised and cohort studies of diagnostic test accuracy that compared any non-invasive BP tests (upper arm, forearm, wrist, finger) with an appropriate reference standard (invasive BP, correctly fitting upper arm cuff, ambulatory BP monitoring) in primary care were included.
Results 4037 potentially relevant papers were identified. 20 studies involving 26 different comparisons met the inclusion criteria. Individual patient data were available from 4 studies. No studies satisfied all QUADAS2 criteria. Compared with the reference test of invasive BP, a correctly fitting upper arm BP cuff had a sensitivity of 0.87 (0.79 to 0.93) and a specificity of 0.85 (0.64 to 0.95); insufficient evidence was available for other comparisons to invasive BP. Compared with the reference test of a correctly fitting upper arm cuff, BP measurement at the wrist had a sensitivity of 0.92 (0.64 to 0.99) and a specificity of 0.92 (0.85 to 0.97). Measurement with an incorrectly fitting standard cuff had a sensitivity of 0.73 (0.67 to 0.78) and a specificity of 0.76 (0.69 to 0.82). Measurement at the forearm had a sensitivity of 0.84 (0.71 to 0.92) and a specificity of 0.75 (0.66 to 0.83). Bland-Altman analysis of individual patient data from 3 studies comparing wrist and upper arm BP showed a mean difference of 0.46 mm Hg for systolic BP measurement and 2.2 mm Hg for diastolic BP measurement.
Conclusions BP measurement with a correctly fitting upper arm cuff is sufficiently sensitive and specific to diagnose hypertension in patients with obesity with a large upper arm circumference. If a correctly fitting upper arm cuff cannot be applied, an incorrectly fitting standard size cuff should not be used and BP measurement at the wrist should be considered.
Strengths and limitations of this study
Study quality was assessed using Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS2).
Forest plots of sensitivity and specificity were created for each study and estimates of sensitivity and specificity in receiver operating characteristic space for each comparison were plotted.
Individual patient data were obtained and the Bland-Altman method used to plot differences in measurement against reference standard.
Insufficient evidence was available to compare forearm, wrist or finger blood pressure measurement with invasive blood pressure.
All included studies used body mass index (BMI) >30 as an indicator of obesity. Although BMI generally increases as adiposity increases, due to variation in body composition, it is not the most accurate measure of body adiposity.
Approximately 671 million individuals worldwide are obese and subsequently mean arm circumference has increased.1 ,2 As a result, healthcare professionals are increasingly faced with situations where blood pressure (BP) measurement with a standard (or even large sized) upper arm cuff is not possible.2 This is important because errors in BP measurement are greater when the cuff used is too small relative to the patient's arm circumference.3 In the USA, over 30% of the population have a large arm circumference as a consequence of obesity, a figure which can rise to over 60% in some clinics.4 ,5
Standard cuffs are ∼22–32 cm with large cuffs typically ranging from 32 to 42 cm.6 Where available cuffs are not large enough for an individual's upper arm, clinicians may use a variety of different methods for BP measurement including a correctly fitting extra large cuff, forearm BP, wrist BP, finger BP or ambulatory BP.7 However, there is considerable uncertainty as to the diagnostic accuracy of these different approaches for obese people and hence the optimum alternative test.2 Currently, all of the major hypertension guidelines recommend different approaches (table 1).
Understanding which alternative method to use is important because hypertension is common in patients with obesity and the consequences of missing the diagnosis are potentially severe at the individual and population level.11 This review aimed to determine the diagnostic accuracy of upper arm, forearm, wrist and finger BP measurement compared with the reference standards of invasive arterial, upper arm (correctly fitting cuff) and/or ambulatory BP measurement (ABPM) for the diagnosis of hypertension in patients with obesity with a large arm circumference.
Search methods for identification of studies
MEDLINE, EMBASE, Cochrane database, DARE, Medion and the TRIP database were searched. No language or publication status restrictions were applied. To increase the sensitivity of the search, methodology filters were not used as these have been found to miss relevant studies when searching for diagnostic accuracy studies.12 Reference lists from all primary studies and reviews identified were hand searched.
Studies comparing upper arm, forearm, wrist or finger BP with reference standards of intra-arterial, upper arm (correctly fitting cuff) or ABPM for the diagnosis of hypertension in adult (over 18 years) patients with obesity were included. For each index test, all cuff shapes and sizes were considered. Thresholds for hypertension of 140/90 mm Hg (clinic and intra-arterial measurement) and 135/85 mm Hg (ABPM) were used and both manual and automated BP measurements were considered. Obese adults were defined by either an upper arm circumference ≥35 cm, a body mass index (BMI) ≥30 or by direct measurement of percentage body fat (≥25% for men and ≥30% for women). Prespecified subgroups included those with arm circumference ≥40 cm or ≥50 cm or BMI≥35.
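The thresholds and obesity definitions above can be written out as a short sketch. This is an illustration only: the function names are hypothetical, and the rule that a reading at or above either the systolic or the diastolic limit counts as hypertensive is an assumption, since the review states the thresholds but not how the two components were combined.

```python
# Thresholds for hypertension quoted in the eligibility criteria (mm Hg).
THRESHOLDS = {
    "clinic": (140, 90),
    "intra_arterial": (140, 90),
    "abpm": (135, 85),
}


def is_hypertensive(systolic, diastolic, method="clinic"):
    """Classify one reading against the threshold for its measurement method.

    Assumption: a reading at or above either limit counts as hypertensive.
    """
    systolic_limit, diastolic_limit = THRESHOLDS[method]
    return systolic >= systolic_limit or diastolic >= diastolic_limit


def meets_obesity_definition(arm_cm=None, bmi=None, body_fat_pct=None, sex=None):
    """Return True if any of the review's three obesity criteria is met."""
    if arm_cm is not None and arm_cm >= 35:
        return True
    if bmi is not None and bmi >= 30:
        return True
    if body_fat_pct is not None and sex in ("male", "female"):
        limit = 25 if sex == "male" else 30
        return body_fat_pct >= limit
    return False
```

Under these assumptions, a reading of 138/88 mm Hg is below the clinic threshold but above the ABPM threshold, which is why the measurement method has to be carried alongside each reading.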
Included study designs were diagnostic cross-sectional, randomised and cohort studies. Studies were excluded if participants were receiving antihypertensive treatment at the time of comparison, were pregnant or were hospital inpatients. Studies from which data could not be extracted were included in the descriptive part of the review but excluded from subsequent analyses.
Selection and data extraction
To determine inclusion or exclusion of each potential study, GI and JH independently reviewed the results of the search by title/abstract and when required by the full text of the study. The resulting list of citations was then reviewed for inclusion and a third author (RJM) arbitrated the final selection decisions.
A standardised data extraction form was used to identify study characteristics and results for each included publication which were independently extracted by GI and JH. Study quality was assessed using Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS2).13 Where possible, data were extracted into 2×2 contingency tables. All authors were contacted directly if insufficient data were available in the published report of a given study and to request individual patient data.
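As an illustration of how a 2×2 contingency table yields the accuracy estimates reported later, the sketch below computes sensitivity and specificity with a 95% CI for a single study. The Wilson score interval is an assumption chosen for simplicity; the review does not state which interval was used for individual studies, and the pooled estimates came from hierarchical models rather than from this calculation. The counts in the usage note are invented.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity, each as (estimate, lower, upper).

    tp/fn: index test positive/negative among reference-positive patients.
    tn/fp: index test negative/positive among reference-negative patients.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": (sensitivity, *wilson_interval(tp, tp + fn)),
        "specificity": (specificity, *wilson_interval(tn, tn + fp)),
    }
```

For example, `accuracy_from_2x2(tp=45, fp=10, fn=5, tn=40)` gives a sensitivity of 0.90 and a specificity of 0.80, each with its interval.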
Statistical analysis and data synthesis
Forest plots of sensitivity and specificity were created for each study and estimates of sensitivity and specificity in receiver operating characteristic (ROC) space for each comparison were plotted. The diagnostic performance of each index test was ascertained and heterogeneity in such performance estimated (see below). The Metandi and Midas procedures in STATA V.13.0 were used to fit the hierarchical summary receiver operating curve (HSROC) models (R Harbord. METANDI: Stata module to perform meta-analysis of diagnostic accuracy. Statistical Software Components 2008; Stata Corporation. Stata Statistical Software Release V.13.0: Programming: Stata Corporation, 2001). Where there was a sufficient number of studies (n≥5) the HSROC model was used to derive inferences about diagnostic test accuracy and heterogeneity in test performance including the summary curve.14 Where individual patient data were available, the Bland-Altman method was used to plot differences in measurement against reference standard.15
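The Bland-Altman calculation applied to individual patient data can be sketched as follows. This is a minimal illustration, assuming paired readings from the index and reference methods and the conventional limits of agreement of the mean difference ± 1.96 SD; the function name and the sample values in the usage note are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(index_readings, reference_readings):
    """Return (mean difference, lower limit, upper limit) in mm Hg.

    Differences are index minus reference; limits of agreement are
    the mean difference plus or minus 1.96 sample standard deviations.
    """
    if len(index_readings) != len(reference_readings):
        raise ValueError("readings must be paired")
    if len(index_readings) < 2:
        raise ValueError("at least two pairs are needed")
    differences = [i - r for i, r in zip(index_readings, reference_readings)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread
```

With wrist readings of 150, 142, 138 and 160 mm Hg against upper arm readings of 148, 145, 136 and 158 mm Hg, the differences are 2, −3, 2 and 2, giving a mean difference of 0.75 mm Hg and limits of agreement of −4.15 to 5.65 mm Hg.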
Investigations of heterogeneity and sensitivity analyses
Heterogeneity was evaluated by visual inspection of Forest plots of sensitivity and specificity, ROC plots of the data and using the χ2 test. The influence of differences in the test characteristics (manual vs automated BP device), study population (primary care vs hospital outpatient) and methodological quality (scores on items of the QUADAS2 checklist) was investigated where it was appropriate to do so.
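The heterogeneity statistics in the results are all reported with 2 degrees of freedom. As a rough arithmetic check, and as an assumption rather than something the review states, the reported p values are numerically consistent with half the upper-tail probability of a χ2 distribution with 2 df, the form a likelihood ratio test takes when variance parameters lie on a boundary. The sketch below reproduces that calculation; for 2 df the upper tail has the closed form exp(−Q/2), so no statistics library is needed.

```python
import math


def chi2_upper_tail_df2(q):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-q / 2)


def halved_tail_p(q):
    """Half the 2-df upper tail (assumed form of the reported p value)."""
    return 0.5 * chi2_upper_tail_df2(q)


# (Q, reported p) pairs as given in the results section.
REPORTED = [(1.51, 0.24), (1.62, 0.22), (8.382, 0.008), (0.68, 0.36), (4.26, 0.06)]

for q, reported_p in REPORTED:
    print(f"Q={q}: computed p={halved_tail_p(q):.3f}, reported p={reported_p}")
```

Each computed value agrees with the reported one to the precision given, which supports reading the statistics as likelihood ratio tests rather than as a conventional Cochran Q test.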
Where there was evidence of differences between studies, sensitivity analyses were undertaken based on:
Measurement of obesity: indirect/direct methods, for example, bioelectrical impedance analysis (BIA) or skin fold thickness.
Risk of bias according to QUADAS2 criteria: low/high risk of bias.
Year of study: <25/≥25 years old.
Reporting bias was assessed using the test for funnel plot asymmetry.16
The design of the review was informed by a search of patient uncertainties in the UK Database of Uncertainties of the Effects of Treatment (DUETS) and patient research priorities identified by the James Lind Alliance.
Results of the search
A total of 4037 studies were identified (excluding duplicates). The full text of 164 papers was reviewed for eligibility (figure 1) and 37 studies were included (32 published in English). One was a randomised trial and one a case–control study;17 ,18 the remainder were cross-sectional studies. Twenty studies (26 comparisons) had extractable data, all of which were of cross-sectional design. There were no disagreements between authors in relation to the number of studies eligible for inclusion (κ=1.0). Individual patient data were available from four authors.
Methodological quality of included studies
Quality assessment of the 20 included studies and the 16 studies that could not contribute data found no studies that satisfied all the QUADAS2 criteria, as all had some degree of methodological weakness and/or lacked reporting clarity (figure 2). Blinding of reference tests, blinding of index tests and acceptable delay between tests were particularly poorly reported.
Six studies were found which used invasively measured BP as the reference standard and a further 17 using a properly fitting upper arm cuff. One study used ABPM as a reference standard. Details of included studies are shown in table 2 and results are presented by reference test below.
Reference test: invasively measured BP
Six studies were found with extractable data (eight data sets) comparing upper arm BP with invasive measurement in obesity yielding a pooled sensitivity of 0.87 (0.79 to 0.93) and a specificity of 0.85 (0.64 to 0.95).18 ,27 ,29 ,32–34 Only three studies were inside the 95% CIs for the summary receiver operating curve (SROC) summary point. The five outlying studies had a relatively small sample size (5–20 participants). Deeks funnel plot asymmetry test had a non-significant p value (p=0.53) for the slope coefficient suggesting no evidence of publication bias. Only one study was included for the subgroup analysis of those patients with BMI>35.27 This resulted in a sensitivity of 0.88 (0.47 to 1.00) and a specificity of 0.71 (0.40 to 0.90). There were insufficient data to perform any other subgroup analyses. Across the six studies there was no difference in reference threshold and all avoided verification bias.
There was no significant evidence of heterogeneity across the six studies (χ2: Q=1.51, df=2, p=0.24). Exclusion of the two studies that were >25 years old (1953 and 1965) and used older equipment resulted in a pooled sensitivity of 0.86 (0.77 to 0.92) and a specificity of 0.90 (0.70 to 0.92) and did not influence heterogeneity.22 ,30 The number of studies was too small to calculate a pooled analysis for other a priori stated potential sources of heterogeneity.
No studies with extractable data regarding forearm, wrist or finger BP in obese individuals compared with invasively measured BP were found.
Reference test: correctly fitting upper arm cuff
Sixteen studies (18 comparisons) were found with a reference standard of correctly fitting upper arm cuff compared with a standard cuff, forearm, wrist or finger BP measurement.
Upper arm BP
Five studies (six data sets) compared a standard sized (ie, too small) upper arm cuff with a correctly fitting upper arm cuff and found a pooled sensitivity of 0.73 (0.67 to 0.78) and a specificity of 0.76 (0.69 to 0.82).19–21 ,26 ,28 The majority of the studies were within the 95% CI of the SROC summary point. The two outliers had a relatively small sample size in comparison to the other studies (n=4 and n=31) and the Deeks funnel plot suggested no evidence of publication bias (p=0.34). Only one study was included in the subgroup analysis of those patients with BMI>35 with resultant wide CIs: sensitivity 1.00 (0.59 to 1.00) and specificity 0.67 (0.22 to 0.96).21 There were insufficient data to perform any other subgroup analyses.
Across the five studies, there was no difference in reference threshold and all studies avoided verification bias. There was no significant statistical evidence of heterogeneity across the six comparisons (χ2: Q=1.62, df=2, p=0.22) but there were insufficient data to carry out a pooled analysis for other potential sources of heterogeneity.
Forearm BP
Six studies compared BP measurement at the forearm at the level of the heart with a correctly fitting upper arm cuff and found a sensitivity of 0.84 (0.71 to 0.92) and a specificity of 0.75 (0.66 to 0.83).18 ,24 ,27 ,30 ,35 Four of the studies were within the 95% CI of the SROC summary point. Deeks funnel plot asymmetry test had a non-significant p value (p=0.50) for the slope coefficient suggesting no evidence of publication bias. There was statistical evidence of heterogeneity across the five studies (χ2: Q=8.382, df=2.00, p=0.008) which appeared largely due to two studies with much earlier publication dates.18 ,32 Excluding these studies resulted in a pooled sensitivity of 0.86 (0.77 to 0.92) and a specificity of 0.90 (0.70 to 0.92) and eliminated heterogeneity (χ2: Q=0.68, df=2.00, p=0.36). Further analyses for heterogeneity or subgroups were not possible.
Wrist BP
Five studies considered BP measured at the wrist held at the level of the heart compared with a properly fitting upper arm cuff yielding a pooled sensitivity of 0.92 (0.64 to 0.99) and a specificity of 0.92 (0.85 to 0.97).3 ,17 ,21 ,23 ,31 Funnel plot and χ2 tests suggested a low probability of publication bias (p=0.89) or heterogeneity (χ2: Q=4.26, df=2.00, p=0.06), and subgroup analyses were not possible.
Finger BP
One study with two extractable data sets compared finger BP measurement with that from a correctly fitted upper arm cuff and reported a similar specificity (0.57 and 0.61) but a markedly different sensitivity (0.74 and 0.91) from the two comparisons.31 With so few data, it was not possible to calculate a pooled summary estimate, assess for heterogeneity or consider subgroups or publication bias.
A summary ROC plot of all index tests compared with the reference test of a correctly fitting upper arm cuff is provided in figure 3. Visual inspection suggests that measurement at the wrist held at the level of the heart performs best in comparison to the reference.
Individual patient data
Bland-Altman analysis of the available individual patient data for three studies comparing wrist and upper arm BP measurement with a correctly fitting cuff showed a mean difference of 0.51 mm Hg for systolic BP (limits of agreement −21.7 to 22.6 mm Hg) and 1.96 mm Hg for diastolic BP (limits of agreement −14.1 to 18.6 mm Hg; table 3).17 ,21 ,23
Reference test: ambulatory BP
One study with extractable data compared upper arm BP measurement with that of ambulatory BP in obesity.25 This showed a sensitivity of 0.45 (0.29 to 0.62) and a specificity of 0.71 (0.60 to 0.81). Given that only a single study was identified, we were not able to assess for statistical heterogeneity or undertake subgroup analyses. No studies with extractable data regarding forearm, wrist or finger BP in obese individuals compared with ambulatory monitoring were found.
Summary of main results
This review has shown that measurement of BP in an obese individual using a correctly fitting upper arm BP cuff compared with the reference standard of direct arterial measurement provides similar results to those obtained in non-obese adults: sensitivity 0.87 (0.79 to 0.93) in obese versus 0.67 (0.30 to 0.93) in non-obese adults, and specificity 0.85 (0.64 to 0.95) in obese versus 0.95 (0.83 to 0.99) in non-obese adults (table 4).34 There is insufficient evidence to comment on how forearm, wrist and finger BP testing perform in relation to invasively measured BP in obesity.
Although CIs around the point estimates overlapped, inspection of SROC curves suggested BP measurement at the wrist is the alternative of choice should a correctly fitting upper arm cuff not be available. Bland-Altman analysis of available individual patient data demonstrated good agreement between wrist and upper arm BP measurement with a correctly fitting cuff for systolic (mean difference 0.51 mm Hg) and diastolic BP (mean difference 1.96 mm Hg). This falls within the ±3 mm Hg (British Hypertension Society (BHS) standard) and ±5 mm Hg (clinically relevant difference) margin of error. However, it is important to note that this analysis was based on a relatively small amount of data with wide limits of agreement and that care was taken to minimise the influence of arm–heart hydrostatic pressures by asking the patient to hold the wrist devices at the level of the heart. If this is not performed, then systematic errors in measurement are likely to occur.6
Limitations of the review
For a number of the studies included in this review, data were either missing or could not be extracted directly from the published paper in order to construct the required contingency tables. Authors were contacted directly to obtain these data, and individual patient data made available by three researchers were used to perform Bland-Altman analyses.
All included studies used BMI>30 as an indicator of obesity, which has limitations. Although BMI generally increases as adiposity increases, due to variation in body composition, it is not the most accurate measure of body adiposity.1 Those with a high proportion of muscle mass could have a high BMI. In a similar way, arm circumference as a measure of arm obesity may be confounded in the case of an athlete with a low percentage body fat and large, muscular arms. No studies used direct methods such as BIA, skin fold thickness or underwater weighing. In some populations, the percentage of patients without obesity with a large arm circumference can be ∼10% (Latman 2013, personal communication).
Some statistical heterogeneity was apparent (forearm vs correctly fitting upper arm cuff) and this could not be explained by reference threshold differences or partial verification. Instead, other factors are likely to have contributed, in particular publication date, suggesting the quality of equipment used may be important. However, it is possible other sources may have contributed including population characteristics (prevalence of hypertension) or test application (operator variability). Where heterogeneity was not detected, it is important to be aware of the limitations of tests for heterogeneity in diagnostic accuracy studies particularly when the number of included studies is small.36
Comparison with the literature
This review is the first, to the best of our knowledge, to summarise the international literature on indirect BP measurement in obesity. This is perhaps reflected in the variation between international guidelines in their recommendations for BP measurement in obesity.1 ,8–10 The American Heart Association currently advocate the use of forearm BP, the European Society of Cardiology advocate wrist BP measurement and the British Hypertension Society suggest contacting manufacturers in order to obtain an extra large upper arm cuff.1 ,8 ,9 The International Society of Hypertension offers no specific advice.10
Clinical and policy implications
On the basis of the review, compared with the reference test of invasive BP, a correctly fitting upper arm BP cuff is sufficiently sensitive and specific for the diagnosis of hypertension in patients with obesity. This also holds true for patients with a BMI>35. If a correctly fitting cuff cannot be fitted or is unavailable, then wrist BP measurement appears to be the next best alternative. BP measurement with an incorrectly fitting standard size cuff on the upper arm should be avoided as it is insufficiently sensitive and specific for a diagnosis of hypertension.
All included studies had some degree of methodological weakness and/or lacked clarity in their reporting. Blinding of reference tests, blinding of index tests and acceptable delay between tests were particularly poorly reported. For these reasons, it is important for any future studies to address these methodological considerations and follow the QUADAS2 guidance.13 On the basis of the paucity of individual patient data, there is now a need for a large diagnostic accuracy study of high methodological quality comparing wrist/forearm BP measurement with upper arm BP. This should aim to include patients with an arm circumference of at least 40 cm. To determine percentage arm adiposity more accurately, direct methods such as BIA and skin fold thickness could be employed.
This review set out to identify the diagnostic accuracy of non-invasive BP measurement techniques (upper arm, forearm, wrist, finger) for the diagnosis of hypertension in patients with obesity with a large arm circumference compared with three different reference standards: invasive BP, correctly fitting upper arm cuff and ABPM. In conclusion, in the absence of a correctly fitted upper arm cuff, measurement at the wrist appears appropriate, provided that the cuff is held at the level of the heart.
The authors would like to thank those authors who shared study data (Bennett, Berntsen, Bertrand, Chiolero, Cuckson, de Senarclens, Domiano, Doshi, Guagnano, Leblanc, Linfors, Nielsen, Pierin, Poncelet, Schell, Stergiou, Stolt, Stolt, Vinyoles).
Contributors GI and RJM conceived and designed the review. GI and JH screened the references and assessed risk of bias. GI, RS and RJM analysed the data. All authors revised and approved the final version of the review.
Funding GI is funded as an NIHR Clinical Lecturer in General Practice.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.