Strengths and limitations of this study
Network meta-analysis enabled us to integrate evidence from direct comparisons (treatments compared head to head within a randomised trial) and indirect comparisons (treatments compared by combining the results of randomised trials with common comparators).
This network meta-analysis only included randomised controlled trials and the risk of bias in each included study had been comprehensively assessed by using the Cochrane Collaboration's Risk of Bias tool, which strengthens the robustness of evidence synthesis.
The number of available randomised controlled trials was small, which is a limitation of the study.
Prostate cancer is a worldwide major public health issue.1 Nearly 75% of diagnosed cases, however, occur in developed countries,2 where it is typically the most common cancer in men.3 ,4 In the UK, about 40 000 men are diagnosed with prostate cancer and 10 000 men die from it every year.3 In the USA, there are 240 000 new diagnoses of prostate cancer, with 34 000 associated deaths every year.5 Most patients with prostate cancer are diagnosed at an early stage,6 ,7 and many diagnoses are made in asymptomatic men.8–10
The main treatment options for localised prostate cancer include radical prostatectomy, external beam radiotherapy and observational management (ie, regular testing of clinical, biochemical or radiological markers or as prompted by occurrence of symptoms).8 As some of these treatments are associated with substantial risk of side effects, it is important to try to resolve the current uncertainty about the optimal treatment options.
Some randomised trials have compared the efficacy and safety of two or three treatments. For example, the SPCG-4 trial in Europe and the PIVOT study in the USA compared radical prostatectomy with observational management.11 ,12 The UK Prostate Testing for Cancer and Treatment (ProtecT) trial is evaluating treatment effectiveness of active monitoring, radical prostatectomy and external beam radiotherapy for clinically localised prostate cancer in men aged 50–69 years identified through population-based prostate-specific antigen (PSA) testing.13 The recruitment phase for the ProtecT trial, which began in 1999, has been completed, but outcomes will not be available until a minimum follow-up period has been accrued.
It is unlikely that any single trial will compare all available treatment options. We therefore performed a network meta-analysis based on a systematic review of completed randomised trials comparing different interventions for patients with localised prostate cancer. The network meta-analysis allowed us to integrate evidence from direct comparisons (treatments compared head to head within a randomised trial) and indirect comparisons (treatments compared by combining the results of randomised trials with common comparators).14–16 Our objective was to apply the established methodology used in network meta-analysis to an area of clinical practice where no such previous studies existed. In doing so, our aims were to summarise existing evidence; ‘map out’ current gaps in comparative evidence to help motivate the design and conduct of future comparative studies and develop an approach ‘primed’ for subsequent updating and incorporation of future trial evidence.
We sought completed randomised trials in men with localised prostate cancer that had compared two or more of the following interventions (as primary treatment, with or without the same adjuvant therapy in all arms): prostatectomy, radiotherapy including brachytherapy, cryotherapy, high-intensity focused ultrasound (HIFU) and observational management. Observational management is characterised by testing of clinical, biochemical or radiological markers of disease progression at regular intervals (typically every 6 months) or as prompted by the occurrence of new symptoms, possibly leading to either radical or palliative treatment. We opted to use the term ‘observational management’ in preference to active surveillance or active monitoring because the latter terms typically aim to keep men in a window of curability so that only those who require it undergo radical treatment.
Eligible trials had to have reported any of the following efficacy and safety outcomes: all-cause mortality, prostate cancer mortality and gastrointestinal (GI) or genitourinary (GU) toxicity. Studies comparing treatment combinations or sequences (eg, per protocol management by surgery with subsequent radiotherapy) were excluded.
Identification of studies
We adopted the search strategy of a systematic review that supported the development of clinical guidelines on the diagnosis and treatment of prostate cancer by the UK's National Institute for Health and Clinical Excellence (NICE) in 2008.8 Studies had been identified by searching MEDLINE (in 2006) and scanning the reference lists of papers. We retrieved all relevant randomised trials identified in the NICE guidelines and implemented the same search strategies to update the collection of trials. We restricted the search to the period from January 2005 to September 2012. No language limits were placed on the searches (see online supplementary appendix 1 for full search strategies).
Two reviewers (TX and RMT) independently screened all the titles and abstracts of the studies retrieved by the searches for potentially eligible trials, and then independently assessed the full articles of these trials to confirm whether they met the eligibility criteria. The results were checked and discussed by TX and RMT to agree on a final list of included studies. Using a structured and piloted data collection form, all relevant data in each included paper were extracted by two reviewers independently (TX and RMT/YW). The data extracted were cross-checked and unresolved discrepancies were referred to a third reviewer; where necessary, problems were discussed in a panel meeting (TX, RMT, YW, JPTH and GL) while DEN acted as a clinical expert advisor.
For each included study, we extracted characteristics of participants and interventions, outcomes reported and collected, sample size (randomised and analysed) in each arm, numerical results, losses to follow-up and details of patients excluded from the analyses.17 To inform the appropriateness of including studies in the meta-analysis and facilitate assessment of the strength of the evidence we assessed the risk of bias in each included study using the Cochrane Collaboration's Risk of Bias tool.18 Two reviewers (TX and either RMT or YW) completed this independently and agreed on final assessments. The tool assesses risk of bias arising from inadequacies in processes of generation of the random allocation sequence, concealment of the allocation sequence and blinding and from incomplete outcome data and selective outcome reporting.
We analysed all-cause mortality and cancer-related mortality at 5 years, and late GI and late GU toxicity at 3 years. The choice of these follow-up times was pragmatic, as they were the ones most frequently reported in the included trials. Once these time points had been chosen, we extracted the outcome data from the time nearest to these targeted measurement times. Late GI and late GU toxicity were defined as scores ≥2 measured by the Radiation Therapy Oncology Group (RTOG) questionnaire scale at 3 years follow-up.19 We did not include biochemical or clinical failure as outcomes, because operational definitions of these outcomes tend to be specific to different radical treatment modalities.20
Initially, we compared each pair of treatments using direct evidence alone, for each outcome. Separate meta-analyses were performed for each pair-wise comparison of interventions: a random-effects model was fitted within each comparison,21 with a common between-study heterogeneity variance assumed across comparisons to allow for heterogeneity even when only a single study was available. Results are reported as ORs with 95% CIs, for every comparison evaluated directly in one or more studies.
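As an illustration of the effect measure used for the direct comparisons, the sketch below computes an OR and a Wald-type 95% interval from a single two-arm trial. It is a simplified frequentist calculation, not the random-effects model fitted in this study, and the function name and the event counts are hypothetical.

```python
import math


def odds_ratio_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of arm A versus arm B with a Wald interval on the log scale.

    Assumes all four cells of the 2x2 table are non-zero.
    """
    a, b = events_a, total_a - events_a   # events / non-events in arm A
    c, d = events_b, total_b - events_b   # events / non-events in arm B
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


# Hypothetical trial: 10/100 deaths in arm A, 20/100 deaths in arm B.
or_ab, lower_ab, upper_ab = odds_ratio_ci(10, 100, 20, 100)
print(f"OR {or_ab:.2f} (95% CI {lower_ab:.2f} to {upper_ab:.2f})")
```

An interval that includes 1, as in most comparisons reported here, indicates that the data are compatible with no difference between the two arms.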
Next, we fitted a network meta-analysis model for each outcome separately,22 combining direct evidence for each comparison (eg, from studies comparing interventions A with B) with indirect evidence (eg, from studies comparing A with C and studies comparing B with C), for all pair-wise comparisons simultaneously. The model accounts explicitly for the binary nature of each outcome using a binomial likelihood function; allows for heterogeneity of treatment effects between trials of the same comparison (assuming the same amount of heterogeneity for each comparison, irrespective of how many trials address it) and enforces an underlying relationship between direct and indirect evidence for a particular comparison, assuming these are consistent between the two sources. For each ‘loop’ of treatment comparisons from three or more independent sources and for each outcome, we computed the difference between estimates from direct and indirect evidence on the log OR scale.23 This provides a measure of inconsistency between the different sources. We did not implement more sophisticated methods for testing or adjusting for inconsistency due to the small number of loops in the network.
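The comparison of direct and indirect evidence within a loop can be sketched as follows. This is a simple frequentist analogue (an adjusted indirect comparison through a common comparator C), given only to show the arithmetic on the log OR scale; it is not the Bayesian model used in the analysis, and the function names and numerical inputs are hypothetical.

```python
import math


def indirect_estimate(log_or_ac, se_ac, log_or_bc, se_bc):
    """Indirect log OR for A versus B through the common comparator C."""
    log_or_ab = log_or_ac - log_or_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)   # variances add
    return log_or_ab, se_ab


def inconsistency(log_or_direct, se_direct, log_or_indirect, se_indirect, z=1.96):
    """Difference between direct and indirect log ORs with a 95% interval."""
    diff = log_or_direct - log_or_indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    return diff, (diff - z * se, diff + z * se)


# Hypothetical inputs on the log OR scale.
ind, ind_se = indirect_estimate(0.5, 0.3, 0.2, 0.4)
diff, (lower, upper) = inconsistency(0.1, 0.35, ind, ind_se)
print(f"indirect log OR {ind:.2f} (SE {ind_se:.2f}); "
      f"inconsistency {diff:.2f} ({lower:.2f} to {upper:.2f})")
```

Because the variances of the contributing estimates add, the interval around the inconsistency estimate is wide whenever few trials inform a loop, which is why such checks have little power in a sparse network.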
Results are reported as ORs with 95% credible intervals for all pair-wise comparisons of interventions. All analyses were performed within a Bayesian framework, using Markov chain Monte Carlo methods in WinBUGS (MRC Biostatistics Unit, Cambridge, UK).24 Informative prior distributions were used for the heterogeneity variance, from a published set of distributions for heterogeneity expected in meta-analyses examining particular intervention and outcome types,25 since heterogeneity is imprecisely estimated when the number of studies is small. For all-cause mortality, a log-normal (−3.93, 1.51²) distribution was used. For GI and GU toxicity, a log-normal (−2.01, 1.64²) distribution was used. For cancer-related mortality, a log-normal (−2.89, 1.91²) distribution was used. Vague N (0, 10⁴) priors were used for all other model parameters. Results were based on 100 000 iterations, following a burn-in of 20 000 iterations.
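To give a sense of what these informative priors imply, the sketch below summarises a log-normal prior for the heterogeneity variance by its median and central 95% range. It assumes that the second parameter of each prior is a squared SD on the log scale (so the SDs are 1.51, 1.64 and 1.91); this reading of the notation is an assumption, and the function name is hypothetical. The precision is also shown because BUGS-family software parameterises the log-normal distribution by a precision rather than a variance.

```python
import math


def lognormal_summary(mu, sd, z=1.96):
    """Median, central 95% range and precision of a log-normal(mu, sd^2) prior."""
    median = math.exp(mu)
    lower = math.exp(mu - z * sd)
    upper = math.exp(mu + z * sd)
    precision = 1.0 / sd ** 2   # value a precision-based parameterisation would take
    return median, lower, upper, precision


# Prior parameters as read from the text (SD interpretation is an assumption).
priors = {
    "all-cause mortality": (-3.93, 1.51),
    "GI and GU toxicity": (-2.01, 1.64),
    "cancer-related mortality": (-2.89, 1.91),
}

for outcome, (mu, sd) in priors.items():
    median, lower, upper, precision = lognormal_summary(mu, sd)
    print(f"{outcome}: median variance {median:.3f} "
          f"(95% range {lower:.4f} to {upper:.2f}), precision {precision:.3f}")
```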
For each outcome, we estimated the probability that each intervention is superior to all others, the second best, the third best and so on, from the rank orderings of the treatments at each iteration of the Markov chain. These ranking probabilities were used to calculate a summary numerical value: the Surface Under the Cumulative RAnking curve (SUCRA).26 SUCRA values are expressed as percentages; if an intervention is certainly the best, its SUCRA value would be 100%, and if an intervention is certainly the worst, its SUCRA value would be 0%. If all interventions are equivalent, we would expect all SUCRA values to be near 50%. We also report the median ranks and 95% credible intervals for each intervention.
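The ranking summaries described above can be sketched in a few lines. The example assumes that each Markov chain iteration provides one effect value per treatment and that lower values are better (for example, log ORs for an adverse outcome against a common reference); the function names and the three draws are hypothetical and far fewer than would be used in practice.

```python
def rank_probabilities(draws):
    """Return probs[j][k] = share of iterations in which treatment j has rank k + 1.

    `draws` is a list of iterations, each a list with one effect value per
    treatment; rank 1 goes to the smallest (best) value.
    """
    n_treat = len(draws[0])
    counts = [[0] * n_treat for _ in range(n_treat)]
    for draw in draws:
        order = sorted(range(n_treat), key=lambda j: draw[j])
        for rank, j in enumerate(order):
            counts[j][rank] += 1
    return [[c / len(draws) for c in row] for row in counts]


def sucra(rank_probs):
    """SUCRA: mean of the cumulative ranking probabilities over ranks 1..a-1."""
    cumulative, total = 0.0, 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (len(rank_probs) - 1)


# Three hypothetical iterations for three treatments.
draws = [[0.0, -0.5, 0.3], [0.0, -0.2, -0.4], [0.0, -0.1, 0.2]]
for j, probs in enumerate(rank_probabilities(draws)):
    print(f"treatment {j}: SUCRA {100 * sucra(probs):.0f}%")
```

A treatment that is ranked first in every iteration has cumulative probabilities of 1 at every rank and therefore a SUCRA of 100%, while equal probabilities across all ranks give 50%.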
Included studies and interventions
The NICE systematic review8 had identified 20 reports relating to 14 randomised trials.27–46 Our updated searches retrieved 1740 studies and identified 39 reports of relevant randomised trials, of which 30 had not been included in the NICE review (figure 1).47–76 One of these reports was the sole report of a trial providing data only on acute toxicity,41 one paper reported only clinical failure39 and one paper reported biochemical failure, biochemical disease-free survival and quality of life57; these three studies were then excluded since they did not report the outcomes of interest to us. In addition to the remaining 47 full papers from peer-reviewed journals, we identified and included in the analysis data from a conference abstract, describing a randomised trial comparing external beam radiotherapy versus watchful waiting,77 and reporting data on long-term mortality not previously reported in full-text-related publications.78 ,79
Our searches also identified 16 relevant systematic reviews.80–95 We scrutinised the reference lists of all these as well as any further systematic reviews identified by the NICE review, and found no further relevant randomised trials.
The 48 identified reports described 21 randomised trials comparing the effectiveness of different treatments for localised prostate cancer.27–38 ,40 ,42–56 ,58–77 Seventeen trials reported all-cause mortality, 16 trials reported cancer-related mortality, 16 trials reported GI toxicity and 15 trials reported GU toxicity. The characteristics of included studies are summarised in online supplementary appendix 2.
The risk of bias assessments for the included trials are illustrated in figure 2. Most of the evidence was of moderate-to-good quality. About half of the studies did not report adequate information about allocation sequence generation and allocation sequence concealment. Unblinded designs were used in all trials included; we judged this unlikely to cause bias for objectively measured outcomes such as mortality, but it could generate bias in the reporting and assessment of patient-reported toxicity outcomes. The small number of studies precluded the investigation of potential reporting biases across studies (eg, using funnel plots). Our searches were appropriate, but the possibility of publication bias cannot be excluded. It is unclear, however, whether reporting biases would tend to favour any particular treatment (see online supplementary appendix 3 for details of bias assessments for included trials).
We categorised the interventions into the following eight categories: observational management, prostatectomy, conventional radiotherapy (refers to two-dimensional external beam radiation therapy), conventional radiotherapy hypofractionated (refers to less than 20 fractions), conformal low dose (LD) radiotherapy (refers to less than 68 Gy), conformal high dose (HD) radiotherapy (refers to more than 74 Gy), conformal LD radiotherapy hypofractionated and cryotherapy. Twenty trials had two intervention arms. One trial compared three interventions55; since two of the three interventions were very similar and both met our definition of conformal LD radiotherapy hypofractionated, we combined the data from these two arms and regarded the trial as a two-treatment comparison (conformal LD radiotherapy hypofractionated vs conformal HD radiotherapy). None of the reviewed studies assessed brachytherapy or HIFU. Figure 3 illustrates the full network of comparisons. There were two closed loops of comparisons, one connecting prostatectomy, observational management and radiotherapy modalities; and the other connecting different radiotherapy modalities.23 No inconsistency was detected in our estimates of the difference between direct and indirect evidence; however, precision was very low. Cryotherapy only had a single link to the network.
All-cause mortality was reported in 17 trials, covering all the eight interventions of interest. There is no evidence of superiority of any treatment for all-cause mortality. For each pair-wise comparison of interventions, the 95% intervals for ORs were wide and included 1. The lower-left triangle of results in table 1 presents ORs estimated from direct evidence alone, while the upper-right triangle of results presents ORs estimated from the network meta-analysis. The intervals are slightly narrower when based on indirect as well as direct evidence rather than direct evidence alone. The SUCRA values presented in table 2 summarise the ranking information for all interventions. With respect to all-cause mortality, the highest SUCRA values are 69% for conformal LD radiotherapy hypofractionated and 63% for conformal HD radiotherapy, indicating that these are most likely to be among the best treatments for this outcome. However, there is very high uncertainty in the rankings of the interventions, as indicated by wide 95% credible intervals.
Cancer-related mortality was reported in 16 trials, covering eight of the interventions. This was a rare outcome in most treatment groups, as expected for patients with localised prostate cancer with a 5-year end point. OR estimates had wide 95% credible intervals, particularly in comparisons for which only indirect evidence was available, and there was no evidence of superiority for any of the comparator treatments (table 3). Based on direct comparisons alone, conformal HD radiotherapy was superior to conventional radiotherapy (OR 0.21 (95% interval 0.03 to 0.97)) and prostatectomy was superior to observational management (OR 0.60 (95% interval 0.37 to 0.98)).
GI and GU toxicity
Late GI toxicity was reported in 16 trials and late GU toxicity was reported in 15 trials. There was evidence that cryotherapy resulted in fewer adverse GI events than radiotherapy treatments (estimated ORs comparing cryotherapy against the five radiotherapy options ranged from 0.12 to 0.24, while all but one of the respective 95% credible intervals excluded 1). The SUCRA value of 99% for cryotherapy and the median rank of 1 (95% interval 1, 2) suggest that cryotherapy is almost certainly superior among the six treatments included in the network meta-analysis in relation to adverse GI events (tables 2 and 4). There was also evidence that GI toxicity was more likely with conformal HD radiotherapy than with conformal LD radiotherapy. Interpretation of such findings for toxicity should be more cautious than for the other outcomes, due to a concern that lack of blinding could have led to a risk of detection bias. For GU toxicity, there was no evidence favouring one intervention over another (table 5), although cryotherapy tended to receive better rankings than the five radiotherapy treatments (table 2), and the OR estimates favour cryotherapy, but the 95% intervals all included 1.
Using network meta-analysis, we were able to combine simultaneously all relevant evidence on treating patients with localised prostate cancer, even in the absence of direct comparative evidence for some treatment pairs, encompassing four efficacy and safety outcomes. Based on data from 21 trials including 7350 patients randomly assigned among eight different intervention regimes for localised prostate cancer, we found substantial uncertainty about the relative efficacy and safety of different interventions in respect of the studied outcomes.
Assumptions of consistency between direct and indirect evidence were tested to justify the joint synthesis of all studies; however, these tests had little power due to the relatively small number of trials available in most direct comparisons. Instead we must rely on judgements about the similarity of studies included in the analysis in aspects such as patient groups, outcome measures and study methodology. Although we defined the population of interest as patients with localised prostate cancer, there was heterogeneity between individual study populations in terms of the severity of disease. Some of the trials were conducted several decades ago, when surgery and radiology techniques may have been different, and we observed that stage migration has occurred in men diagnosed with prostate cancer due to emerging biomarker and image technologies. Furthermore, some of the trials used adjuvant therapy, although this was applied in all the arms within the trial.
Two further limitations warrant mention. Literature searches were completed in September 2012. However, the results of one of the most important randomised trials, the ProtecT study,13 have not been published so far, and to our knowledge no other new relevant randomised controlled trials (RCTs) have been reported after this systematic review. Our choices of measurements may have favoured some treatments over others: for example, the RTOG scale had been used to define the late GI and late GU toxicity in the included studies, but it does not measure incontinence, which could be the most common adverse event postprostatectomy.96
Methodologically, we used informative prior distributions based on external evidence for heterogeneity variances to increase precision in their estimation and improve estimation of treatment differences. Data-based informative priors have previously been considered by Lu and Ades,97 who used them for the between-study correlation structure. To our knowledge, our paper is the first application of network meta-analysis incorporating data-based informative priors for between-study heterogeneity.
Our findings have implications for research funding prioritisation and study design, and for clinical practice. The study identified particular ‘weak links’ in the network of comparative treatment options, which might be prioritised for future investment in RCTs. This is particularly the case for studies comparing HIFU (which currently is bereft of any comparative evidence) or brachytherapy against other treatment options, and also for trials examining the comparative efficacy and safety of prostatectomy versus conformal radiotherapy modalities. For clinicians, and for men diagnosed with prostate cancer, our findings highlight that the optimal treatment options may be different in respect of different outcomes: patients need to be given appropriate information about the uncertainty surrounding treatment choice currently, and be allowed to opt for ‘trade-offs’ between efficacy and safety outcomes as they judge appropriate.98 Observational studies have consistently shown that radical prostatectomy has better cause-specific mortality outcomes compared with radiotherapy.99–103
In conclusion, clinically important information from high-quality randomised trials is still needed to inform decision-making regarding primary treatment options for men with localised prostate cancer. The findings of this study highlight the importance of informed patient choice and shared decision-making about treatment modality and acceptable trade-offs between multiple outcomes. The upcoming results of the ProtecT study,13 which is evaluating effectiveness of multiple therapies in men with PSA-detected localised prostate cancer, together with other treatment studies in progress, will hopefully contribute to the evidence base. It is, however, unlikely that evidential uncertainty about all relevant and important outcomes will be resolved by these trials, and an updated network meta-analysis incorporating new evidence may be useful to synthesise the new with the existing evidence. We demonstrate a high degree of uncertainty about treatment superiority in the management of localised prostate cancer. Clinicians and patients need to grapple with this uncertainty in the context of shared decision-making.