Objective Low testosterone level may be a reversible risk factor for functional disability and deterioration in patients with chronic obstructive pulmonary disease (COPD). We sought to systematically assess the endogenous testosterone levels and effect of testosterone therapy on exercise capacity and health-related quality of life (HRQoL) outcomes in COPD patients, as well as to inform guidelines and practice.
Design Systematic review and meta-analysis.
Data sources We searched PubMed, Scopus, Cochrane Library, CINAHL, Health Source Nursing and PsycINFO, and the reference lists of retrieved articles, for articles published before May 2012.
Inclusion criteria Observational studies on endogenous testosterone levels in people with chronic lung disease compared with controls, or randomised controlled trials (RCTs) on testosterone therapy for exercise capacity and/or HRQoL outcomes in COPD patients were eligible.
Data extraction and analysis Data on the mean difference in endogenous total testosterone (TT) values, and the mean difference in exercise capacity and HRQoL values were extracted and pooled using random effects meta-analysis.
Results Nine observational studies in 2918 men with COPD reported consistently lower levels of TT compared with controls (weighted mean difference was –3.21 nmol/L (95% CI −5.18 to −1.23)). Six RCTs in 287 participants yielded five studies on peak muscle strength and peak cardiorespiratory fitness outcomes (peak oxygen uptake (VO2) and workload) and three studies on HRQoL outcomes. Testosterone therapies significantly improved peak muscle strength (standardised mean difference (SMD) was 0.31 (95% CI 0.05 to 0.56)) and peak workload (SMD was 0.27 (95% CI 0.01 to 0.52)) compared with control conditions (all but one used placebo), but not peak VO2 (SMD was 0.21 (95% CI −0.15 to 0.56)) or HRQoL (SMD was –0.03 (95% CI −0.32 to 0.25)).
Conclusions Men with COPD have clinically relevant lower than normal TT levels. Limited evidence from short-term studies in predominately male COPD patients suggests that testosterone therapy improves exercise capacity outcomes, namely peak muscle strength and peak workload.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
To systematically assess the mean endogenous testosterone level in people with chronic lung disease compared with controls from observational studies.
To systematically assess the effect of testosterone therapy on exercise capacity and health-related quality of life outcomes in chronic obstructive pulmonary disease (COPD) patients from randomised controlled trials (RCTs).
Men with COPD have clinically relevant lower than normal total testosterone levels compared with controls.
Limited evidence from short-term RCT studies in predominately male COPD patients suggests that testosterone therapy improves exercise capacity outcomes, namely peak muscle strength and peak workload.
Strengths and limitations of this study
Key findings were based on a high-quality systematic review and meta-analysis level of evidence.
Since only a small number of studies conducted in specific populations were included, the findings of this review may not be relevant to other countries and key groups, which will require further research.
Chronic obstructive pulmonary disease (COPD) is currently ranked the fifth leading cause of global disability (health loss).1 Health status or health-related quality of life (HRQoL) is a clinically important measurement of disability among patients with COPD for prognostic studies and trials.2–4 Exercise capacity, one of the main determinants of HRQoL, is significantly impaired in COPD patients.3 ,5 Dyspnoea and fatigue due to skeletal muscle dysfunction, among other physiological abnormalities, are cardinal symptoms that limit exercise capacity in COPD patients.6 This is partly due to decreases in muscle strength and mass (often called ‘cachexia’), since they are characteristic features of skeletal muscle dysfunction contributing to exercise intolerance and consequential deterioration in HRQoL.5 ,6 Conversely, pulmonary rehabilitation (PR) including exercise (namely resistance training (RT)) leads to clinically relevant improvements in muscle strength and HRQoL,7 ,8 indicating that skeletal muscle dysfunction should be a primary therapeutic target for intervention in patients with COPD.
Since the testosterone level has been shown to be positively associated with muscle strength and cardiorespiratory fitness accounting for physical activity and muscle mass,9 ,10 a low testosterone level may be an independent risk factor for functional disability and deterioration in COPD. For example, the levels of testosterone and other androgenic hormones were decreased in male and female COPD patients compared with controls in a few studies.11–13 The potential mechanisms for this endocrine dysfunction most likely involve hypoxaemia, hypercapnia, systemic inflammation and the use of glucocorticoids.14
Thus, it is important to reliably establish whether the mean endogenous testosterone level is decreased in patients with COPD, because this condition is reversible with testosterone supplementation therapy. Indeed, a small but promising body of randomised controlled trial (RCT) evidence suggests that testosterone therapy improves exercise capacity and HRQoL without increasing serious adverse events.15–17 While it is difficult to explain this apparent therapeutic benefit, increased cardiac output,18 haemoglobin and haematocrit,19 baroreflex sensitivity20 and exercise tolerance due to improvements in peak oxygen uptake (peak VO2) and muscle strength20 are all plausible mechanisms.
However, our initial analysis of the available published literature indicates an absence of a systematic review of relevant studies on endogenous testosterone levels and testosterone therapy in patients with COPD. We therefore sought to systematically review previous research to assess the mean endogenous testosterone level in people with chronic lung disease compared with controls, and the effects of testosterone therapies on exercise capacity and HRQoL outcomes in COPD patients, to inform guidelines and practice.
We searched the PubMed, Scopus, Cochrane Library, CINAHL, Health Source Nursing and PsycINFO electronic databases for articles published before May 2012. Search syntaxes were developed in consultation with an experienced university research librarian, taking into account a broad range of terms and phrases used in definitions of testosterone and COPD (full electronic search strategies for the PubMed, Scopus and Cochrane Library databases are given in online supplementary appendix pages 1 and 2). Reference lists of potentially eligible articles were searched by hand to identify additional studies missed by our search strategy.
One reviewer (EA) identified potentially relevant studies for inclusion by screening titles and/or abstracts of all citations identified with our database searches. A second screening was performed on the full text of these articles. Observational studies in adult populations that reported endogenous testosterone levels in men and/or women (separately) with chronic lung disease (cases) compared with controls, or RCTs that reported the effects of testosterone treatment on exercise capacity and HRQoL outcomes in COPD patients were eligible. There were no language restrictions for articles.
Data extraction and quality assessment of included studies were performed and/or verified independently by three reviewers (EA, BC and SS). Discrepancies were resolved through discussion. Authors of relevant studies were contacted, where possible, for data that could not be extracted from the published articles.
For methodology and quality assessment, quality checklists were developed to identify potential sources of bias (tables in online supplementary appendix pages 3 and 4). Quality items for observational studies reviewed were (each worth 1 numerical point) as follows: (1) COPD or chronic lung disease was reported to have been clinically diagnosed or categorised according to the WHO International Statistical Classification of Diseases and Related Health Problems (ICD) system, (2) endogenous testosterone level was measured by radioimmunoassay or liquid chromatography-tandem mass spectrometry, (3) the study population was representative of the clinical setting or community (ie, demographic characteristics of cases and hospital controls were typical and community cases or controls were randomly selected) and (4) there was adequate adjustment or exclusion or matching for covariates known to be associated with COPD and hypogonadism in men (each worth 0.2 numerical point): (a) age, (b) socioeconomic or partner status, (c) central or general obesity, (d) smoking status, (e) alcohol intake, (f) physical activity, (g) depression or anxiety (or medications), (h) metabolic syndrome or cardiovascular disease (or medications), (i) systemic inflammation (or glucocorticoids) and (j) sleep apnoea (or treatments).
Quality items for RCT studies reviewed were (each worth 1.0 numerical point) as follows: (1) study eligibility criteria were adequately described, (2) randomisation methodology was adequate (ie, evidence suggesting that a ‘random’ method was used to generate and implement the random allocation sequence), (3) allocation concealment was adequate (ie, evidence to suggest that a robust method was used for concealing the sequence of treatment allocation, eg, independent IT or telephone service or sealed opaque envelopes only opened in front of the participant), (4) between-group prognostic indicators were balanced (ie, evidence showing that groups were similar at the outset for these prognostic indicators), (5) care providers were blinded to treatment allocation, (6) between-group drop-out rates were balanced, (7) intention-to-treat analysis was included and (8) adverse events were reported.
Our quality checklist scales were designed based on criteria for assessment of observational studies21 and RCTs21 ,22 and allowed summed scores to range from 0 to 5 points and from 0 to 8 points, respectively, reflecting lowest to highest quality. Studies were considered ‘better quality’ if they received a score of 3 or higher for observational studies and of 5 or higher for RCTs, since that meant that they had most of our quality items.
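The observational checklist above can be read as a simple additive score. The following Python sketch is illustrative only: the function and field names are hypothetical, and it assumes that items 1 to 3 contribute 1 point each while each of the ten covariates under item 4 contributes 0.2 points, which is the reading that yields the stated 0 to 5 range.

```python
# Illustrative scoring of the observational-study checklist.
# Names are hypothetical; only the point values are taken from the text.

COVARIATES = (
    "age", "socioeconomic_or_partner_status", "obesity", "smoking",
    "alcohol", "physical_activity", "depression_or_anxiety",
    "metabolic_or_cardiovascular", "systemic_inflammation", "sleep_apnoea",
)


def observational_quality_score(clinical_diagnosis, valid_assay,
                                representative, adjusted_covariates):
    """Return a 0-5 score: items 1-3 score 1 point each, covariates 0.2 each."""
    covariates = set(adjusted_covariates)
    unknown = covariates - set(COVARIATES)
    if unknown:
        raise ValueError(f"unknown covariates: {sorted(unknown)}")
    main_points = sum(bool(x) for x in (clinical_diagnosis, valid_assay, representative))
    return round(main_points + 0.2 * len(covariates), 1)


def is_better_quality(score, threshold=3.0):
    """Observational studies scoring 3 or higher were classed as 'better quality'."""
    return score >= threshold


example = observational_quality_score(True, True, False, ["age", "smoking"])
print(example, is_better_quality(example))
```

A study meeting two of the three main items and adjusting for two covariates would score 2.4 under this reading and fall below the 'better quality' threshold.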
The primary outcomes were the mean difference in endogenous total testosterone (TT) values between the case and control groups for observational studies (the most frequently reported testosterone outcome in relevant studies), and the mean difference in exercise capacity and HRQoL values after intervention (post-treatment) between the treatment and control groups for RCTs. Where necessary for observational studies, we estimated the mean and variance from the median, range and sample size using methods which have been shown to be reasonably robust in non-extreme circumstances.23 Where necessary for RCTs, the post-treatment means were derived from the within-group changes and the control group SD carried forward from the baseline values.24 Standardised mean differences (SMDs) were calculated using Glass’ Delta method. Exercise capacity outcomes included any assessment of cardiorespiratory fitness and peripheral skeletal muscle strength. Where multiple cardiorespiratory fitness outcomes were reported, we first chose peak VO2 measures and then prioritised peak workload (power output) laboratory assessments of cardiorespiratory fitness over field tests. Where multiple muscle strength outcomes were reported, we prioritised peak isometric measures over peak dynamic measures and knee extension over other joint movements. HRQoL outcomes included any patient-reported assessment of health status or functional disability. Where multiple HRQoL outcomes/scales were reported, we first chose summed score scales, and then prioritised subscales that measure ‘fatigue’ symptoms, and the most frequently reported HRQoL outcome in the other studies reviewed.
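As a worked illustration of the standardisation step, Glass's delta divides the between-group difference in means by the SD of the control group only. The sketch below uses hypothetical numbers and helper names; it assumes, as described above, that a missing post-treatment mean is reconstructed as baseline plus within-group change and that the control group baseline SD is carried forward.

```python
def post_treatment_mean(baseline_mean, within_group_change):
    """Reconstruct a post-treatment mean from a baseline mean and a reported change."""
    return baseline_mean + within_group_change


def glass_delta(mean_treatment, mean_control, sd_control):
    """Glass's delta: mean difference standardised by the control group SD."""
    if sd_control <= 0:
        raise ValueError("control SD must be positive")
    return (mean_treatment - mean_control) / sd_control


# Hypothetical trial: strength in arbitrary units, not data from any included study.
treatment_post = post_treatment_mean(baseline_mean=100.0, within_group_change=12.0)
control_post = post_treatment_mean(baseline_mean=98.0, within_group_change=4.0)
control_baseline_sd = 20.0  # carried forward from baseline

print(glass_delta(treatment_post, control_post, control_baseline_sd))
```

Using the control SD alone avoids contaminating the standardiser with any change in variability caused by the treatment, at the cost of a noisier estimate when the control group is small.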
Three reviewers (EA, BC and SS) independently collated and/or verified extracted data to present a descriptive synthesis of important study characteristics and a quantitative synthesis of effect estimates.
The secondary outcomes were data about adverse events reported in RCTs for a descriptive synthesis.
We pooled and weighted studies first using random effects meta-analysis models, and second using fixed effects models for verification.25 Where necessary, we standardised laboratory values for endogenous TT levels between observational studies using the International System of Units (SI units), expressed in nanomoles per litre (nmol/L). These studies were then pooled to estimate the inverse variance weighted mean difference (WMD), including the DerSimonian and Laird 95% CI, between cases and controls. Where papers presented medians without means, we estimated the missing mean as being equal to the median for meta-analysis.23
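The pooling step can be sketched in a few lines. This is not the authors' code (the analyses were run with Stata's 'metan' command); it is a minimal Python illustration of the DerSimonian and Laird random effects estimator, with a unit conversion that assumes a testosterone molar mass of about 288.42 g/mol. All input values are hypothetical.

```python
import math

# Assumption: testosterone molar mass ~288.42 g/mol, so 1 ng/dL ~ 0.0347 nmol/L.
NG_DL_TO_NMOL_L = 10.0 / 288.42


def ng_dl_to_nmol_l(value_ng_dl):
    """Convert a total testosterone value from ng/dL to nmol/L."""
    return value_ng_dl * NG_DL_TO_NMOL_L


def dersimonian_laird(effects, variances, z_crit=1.96):
    """Pool study effects with DerSimonian-Laird random effects weights.

    Returns (pooled effect, (lower, upper) 95% CI, tau squared).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, (pooled - z_crit * se, pooled + z_crit * se), tau2


# Hypothetical mean differences (nmol/L) and their variances.
pooled, ci, tau2 = dersimonian_laird([0.0, 4.0], [1.0, 1.0])
print(pooled, ci, tau2)
```

When the between-study variance estimate (tau squared) is zero, the random effects weights reduce to the fixed effects inverse variance weights, which is why the two models can give identical results for homogeneous sets of studies.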
In examining the effects of testosterone treatment on exercise capacity and HRQoL outcomes, the SMD from each RCT was pooled to produce an overall estimate of effect, and associated 95% CI, between the treatment and control groups. For each meta-analysis model, the degree of heterogeneity in SMDs was assessed by visual inspection, the I2 statistic (moderate being <50%26) and the χ2 test of goodness of fit.27 Where evidence of heterogeneity was observed, we checked data extracted from individual outlier studies, qualitatively investigated reasons for their different results and explored the effects of study exclusion in sensitivity analyses.
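For readers unfamiliar with the heterogeneity statistics named above, the following sketch shows how Cochran's Q and the I2 statistic relate. It is a minimal illustration with hypothetical inputs, not the authors' implementation, and it omits the χ2 p value, which would require the χ2 distribution with k−1 degrees of freedom.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed effects mean."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))


def i_squared(q, k):
    """I2 (%): share of total variation attributed to between-study heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100.0


# Hypothetical effects and variances for two studies.
q = cochran_q([0.0, 4.0], [1.0, 1.0])
print(q, i_squared(q, k=2))
```

I2 is truncated at zero, so any set of studies whose Q does not exceed its degrees of freedom is reported as I2=0%.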
We also used sensitivity analysis to investigate the robustness of the meta-analysis models. We variously excluded RCTs in men and women, placebo only (rather than placebo with exercise) controlled trials, longer duration trials (≥12 weeks) and studies of lower quality (score <3 for observational studies; score <5 for RCTs). We also repeated the meta-analysis models using different cardiorespiratory fitness outcomes. Publication bias, which reflects the tendency for smaller studies to be published in the literature only when the findings are positive, was assessed visually using funnel plots.28 All calculations were performed in Stata V.12 (StataCorp, College Station, Texas, USA) using the ‘metan’ and ‘metafunnel’ commands. A two-tailed p value <0.05 was considered statistically significant throughout the analyses.
Figure 1 presents a flow chart summarising the identification of potentially relevant studies, as well as those included and excluded (see online supplementary appendix page 8). Our search strategy identified 906 citations after the duplicates were removed. Of these, 865 citations were excluded after the first screening of titles and/or abstracts for inclusion and exclusion criteria, leaving 41 citations for a second full text screening. Hand searching the reference lists of these articles identified two additional potentially relevant citations. After further assessment of these 43 citations, 28 were excluded for reasons listed in figure 1, leaving 15 for final inclusion in the systematic review. Most studies were excluded for inadequate predictor or outcome variables, or for not having a control group (list of excluded citations; see online supplementary appendix pages 5–7).
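The study flow counts reported above can be checked with simple arithmetic. The variable names in this short Python sketch are illustrative; the numbers are those given in the text, and the split of the 15 included citations into nine observational studies and six RCTs is taken from the descriptive synthesis.

```python
identified = 906            # citations after duplicates were removed
excluded_first_screen = 865
full_text_screened = identified - excluded_first_screen   # 41
hand_searched = 2
assessed = full_text_screened + hand_searched             # 43
excluded_second_screen = 28
included = assessed - excluded_second_screen              # 15

observational, rcts = 9, 6
print(full_text_screened, assessed, included, included == observational + rcts)
```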
Descriptive data synthesis
Table 1 presents the study characteristics of nine observational studies included for review, which were published between 1981 and 2011. Studies were conducted in Scotland,29 ,30 Sweden,31 the USA,32 Taiwan,33 Greece,12 Turkey,13 Norway34 and Belgium.35 The degree of severity of airflow limitation in COPD cases ranged from mild-to-very severe, assessed according to the Global Initiative for Chronic Obstructive Lung Disease criteria36 in four studies,12 ,13 ,33 ,35 and by various spirometry criteria in four29 ,30 ,32 ,34 of the five remaining studies. Control participants were recruited from primary care settings in six studies.12 ,29 ,30 ,32 ,33 ,35 The sample sizes ranged from 16 to 213, resulting in a total of 2918 participants across studies. The mean age of the samples ranged from 50 to 71 years. All the observational studies were conducted in men. The mean quality scores ranged from 2.2 to 4, and four studies received a score of 3 or higher.12 ,13 ,32 ,34
Table 2 presents the study characteristics of six RCTs included for review, which were published between 2003 and 2011. Studies were conducted in the USA,37 The Netherlands,38 Brazil,39 France,16 Canada40 and Norway.41 Major inclusion criteria were stable COPD or chronic respiratory failure in all studies, various spirometry criteria in all but one study,16 low TT in only one study37 and low body mass index in only two studies.16 ,39 Major exclusion criteria were a range of chronic conditions in all studies, prostatic conditions in four studies16 ,37 ,39 ,40 and elevated haemoglobin in one study.37 The sample sizes ranged from 16 to 122, resulting in a total of 287 participants across studies. The mean age of the samples ranged from 66 to 69 years. All but two studies16 ,40 were conducted in men only. The baseline mean TT levels ranged from 9.6 to 21.6 nmol/L for men, and from 0.42 to 0.45 nmol/L for women as reported in one study.16 The testosterone therapies used were oral testosterone undecanoate in one study,16 oral stanozolol after a baseline intramuscular injection of testosterone in another study39 and intramuscular injections (testosterone enanthate37 ,41 and nandrolone decanoate38) in all the remaining studies. Four studies investigated the combined effects of testosterone therapy with RT37 or PR.16 ,39 ,40 All but one study16 used placebo control conditions. Trial durations ranged from 8 to 27 weeks. Primary outcomes were peak muscle strength in five studies (from four citations16 ,37 ,38 ,40), peak VO2 in five studies (from four citations37–40), peak workload in five studies (from four citations16 ,37 ,38 ,40), 6 min walking test (6MWT) in four studies16 ,39–41 and HRQoL in three studies.16 ,38 ,40 The mean quality scores ranged from 4.5 to 6, and all but one study40 received a score of 5 or higher.
Quantitative data synthesis
Effect of COPD exposure on an endogenous TT level
Figure 2 presents WMD on an endogenous TT level between the case and control groups for observational studies (see online supplementary appendix page 9). Men with COPD had significantly lower levels of TT compared with controls (pooled WMD was –3.21 nmol/L (−5.18 to −1.23)). There was a high degree of heterogeneity between studies (I2=81.9%, p<0.001) that was mostly a result of variation in the degree of difference rather than an unfavourable direction towards the null. The sensitivity analyses presented in table 3 show that the pooled WMD was substantially changed after exclusion of lower quality studies (increased to –3.68 (−7.00 to −0.36)) and one large sample size study34 (increased to –3.56 (−5.63 to −1.49)). Finally, for the one study33 which provided unadjusted mean differences and mean differences adjusted for age, waist circumference and smoking status, a model using unadjusted rather than adjusted values decreased the pooled WMD to −2.95 (−4.63 to −1.27). In addition, a funnel plot was produced which showed only slight evidence of publication bias, since WMD in TT was small for two of the largest studies (−0.60 nmol/L33 and −1.10 nmol/L34) (figure 3; see online supplementary appendix page 10).
Effect of testosterone therapy on exercise capacity and HRQoL outcomes
Figure 4 presents SMD in peak muscle strength outcomes after testosterone therapy between the treatment and control groups for RCTs (see online supplementary appendix page 11). Testosterone therapies significantly improved standardised peak muscle strength outcomes compared with control conditions (pooled SMD was 0.31 (0.05 to 0.56)), and there was little evidence of statistical heterogeneity between studies (I2=0%, p=0.839). The sensitivity analyses presented in table 4 show that the pooled SMD was similar after exclusion of one lower quality study40 (0.31 (0.04 to 0.57)), but was substantially changed after exclusion of two placebo only controlled studies (no longer statistically significant: 0.30 (−0.01 to 0.62)), and the two studies in men and women that were also the two longer duration studies16 ,40 (decreased to 0.21 (−0.18 to 0.60)). In addition, a funnel plot was produced which showed only slight evidence of publication bias, since SMD in peak muscle strength outcomes was consistent in all but one treatment arm in one study37 (figure 5; see online supplementary appendix page 12).
Figure 6 presents SMD in peak VO2 outcomes after testosterone therapy between the treatment and control groups for RCTs (see online supplementary appendix page 13). Testosterone therapies consistently failed to show significant improvements in standardised peak VO2 outcomes compared with control conditions (pooled SMD was 0.21 (−0.15 to 0.56); I2=4.8%, p=0.379). The sensitivity analyses presented in table 5 show that this null effect was similar after exclusion of one lower quality study40 (0.13 (−0.27 to 0.54)), two placebo only controlled studies (0.03 (−0.60 to 0.66)), one study in men and women40 (0.13 (−0.27 to 0.54)), and two longer duration studies39 ,40 (0.27 (−0.12 to 0.67)), and in the model using 6MWT outcomes (0.10 (−0.34 to 0.53)). Conversely, testosterone therapies significantly improved cardiorespiratory fitness in the model using peak workload rather than peak VO2 outcomes (pooled SMD was 0.27 (0.01 to 0.52)), and there was little evidence of statistical heterogeneity between studies (I2=0%, p=0.741).
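The link between the reported 95% CIs and statistical significance can be made explicit by back-calculating an approximate standard error, z statistic and two-sided p value from each interval. The sketch below assumes a normal approximation and works from the rounded values reported in the text, so the results are approximate; the function name is illustrative.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p value from an estimate and its 95% CI."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return se, z, p


reported = {
    "peak muscle strength": (0.31, 0.05, 0.56),
    "peak VO2": (0.21, -0.15, 0.56),
    "peak workload": (0.27, 0.01, 0.52),
}

for outcome, (smd, lo, hi) in reported.items():
    se, z, p = z_and_p_from_ci(smd, lo, hi)
    print(f"{outcome}: SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions the intervals for peak muscle strength and peak workload, which exclude zero, correspond to p values below 0.05, whereas the interval for peak VO2 spans zero and does not.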
Figure 7 presents SMD in HRQoL outcomes after testosterone therapy between the treatment and control groups for RCTs (see online supplementary appendix page 14). Testosterone therapies consistently failed to show better standardised HRQoL outcomes compared with control conditions (pooled SMD was –0.03 (−0.32 to 0.25); I2=0%, p=0.934). The sensitivity analyses showed that this null effect was comparable in the fixed effects model (−0.03 (−0.32 to 0.25)) and after exclusion of one lower quality study40 (−0.04 (−0.34 to 0.25)).
Two RCTs showed that testosterone therapy was associated with more serious adverse events compared with the control group. One study reported an increased number of exacerbations during short-term, but not long-term, follow-up,16 and another study reported that two of three COPD patients with respiratory failure in the treatment group had died.38 Conversely, one study reported that more patients died of respiratory failure in the control group.39 Four studies showed that testosterone therapies decreased gonadotrophin levels compared with controls, as can be expected.16 ,37 ,39 ,41 Compared with controls, testosterone therapy was associated with a decrease in the sex hormone-binding globulin level in two studies,16 ,41 and a decrease in the oestradiol level in men in another study.16 Finally, a few studies showed that testosterone therapy was associated with relative increases in haemoglobin or haematocrit,16 ,37 ,38 and in creatinine, aspartate aminotransferase and lactate dehydrogenase values.38
Summary of evidence
We have established that men with COPD have significantly lower levels of endogenous TT compared with controls (WMD was −3.21 nmol/L (−5.18 to −1.23)). The size of the mean difference in TT level, which ranks men with COPD in the second quartile (below average) compared with age-matched population norms,9 is likely to be clinically relevant. For instance, comparable or greater differences in TT levels between cases and controls have been reported in studies on risk of type 2 diabetes (WMD was −2.66 nmol/L (−3.45 to −1.86)),42 metabolic syndrome (WMD was −2.64 nmol/L (−2.95 to −2.32))43 and clinically significant depression (median difference was –1.21 nmol/L, p<0.001 for Mann-Whitney test).44 These comorbidities have been shown to adversely affect COPD prognosis,45–47 and would further complicate COPD management. As the effect of COPD exposure on TT level increased in size after exclusion of lower quality studies and one large sample size study, future higher quality studies will most likely strengthen rather than weaken this evidence base. Collectively, our results and the existing literature indicate that testosterone deficiency should be considered in men with COPD.
On the basis of the limited short-term RCT evidence in predominately male COPD patients, our results suggest that testosterone therapy significantly improves several exercise capacity outcomes. The size of the effect of testosterone therapy that can be expected in practice is small to moderate, but comparable to exercise or PR therapies alone.7 ,8 The effect of testosterone therapy on standardised muscle strength outcomes remained robust after exclusion of one lower quality study, but weakened after exclusion of two placebo only studies. This supports the hypothesis that testosterone therapy with exercise is more effective than testosterone therapy alone for functional improvements.48 In addition, our results suggest that the mechanism for improvement in cardiorespiratory fitness assessed by peak workload is most likely explained by better exercise tolerance due to testosterone-induced increases in muscle strength rather than changes in VO2.
Several limitations require careful consideration. First, since only a small number of studies conducted in specific populations were included, the findings of this review may not be relevant to other countries and key groups, which will require further research. In particular, most of the RCTs were conducted in COPD patients without cardiovascular disease and/or diabetes or endocrine disease, which are highly prevalent in this population group.46 Second, we replaced missing data points with estimates in some instances, which introduced further uncertainty. This includes estimating the mean from the median and range and carrying forward the preintervention SD of control groups where the postintervention statistic was not available. Third, because only a few RCTs targeted COPD patients who would have theoretically benefited most from testosterone therapy, such as those with low testosterone or body weight,16 ,37 ,39 our estimated effect size for improvement in standardised exercise capacity may have been underestimated. Finally, the reviewer-level limitations include incomplete retrieval of information for several of the 28 citations excluded, as well as the possible existence of other relevant studies not identified with our search strategy, either of which could introduce selection bias. However, the results and conclusions reported in most of the excluded studies were in line with those reported here, so substantial selection bias is unlikely.
Nevertheless, our systematic analysis of the existing literature revealed that there is an absence of sufficient RCT evidence to draw firm conclusions about the long-term benefits and risks of testosterone therapies for exercise capacity and HRQoL outcomes in male or female COPD patients, or about the pharmacological dosing for specific testosterone therapies needed for effectiveness. Reliable information on the efficacy and safety, as well as cost-effectiveness, of specific testosterone therapies is required to inform clinical practice guidelines for COPD. In addition, future high-quality epidemiological research is needed to determine which subgroups of COPD patients are most vulnerable to testosterone deficiency, and to reliably establish whether women with COPD likewise present with significantly lower levels of TT than controls.
Men with COPD have clinically relevant lower than normal endogenous TT levels, and we believe that our meta-analytic results are sufficiently reliable to recommend that clinicians should consider testosterone deficiency in these patients. Although our results also suggest that testosterone therapy improves several exercise capacity outcomes, there is an absence of sufficient RCT evidence to draw firm conclusions about the long-term benefits and risks of testosterone therapy for exercise capacity and HRQoL outcomes in male or female COPD patients.
We are grateful to Mr Geoffrey Lattimore for his work on developing and conducting the electronic database searches.
Contributors EA is the guarantor of the paper, taking responsibility for the integrity of the work as a whole, from inception to published article. He conceived and designed the review, identified studies for inclusion, extracted and interpreted the data, and also drafted the article. PF analysed and interpreted the data, and also revised the article. BC and SS extracted and interpreted the data, and also revised the article. GW interpreted the data and also revised the article. All authors approved the final completed article.
Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests EA has entered financial agreements to speak at events for Eli Lilly Australia Pty Ltd (Lilly). BC has received speaking fees and/or conference support from GSK, Novartis and Boehringer Ingelheim. GW has received speaking fees and research support from Bayer; he is on International and National advisory boards and has received research support from Lilly; he has also received consulting fees and research support from Lawley pharmaceuticals.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.