Objectives We appraised the methodological and reporting quality of randomised controlled clinical trials (RCTs) evaluating the efficacy and safety of Chinese herbal medicine (CHM) in patients with rheumatoid arthritis (RA).
Design For this systematic review, electronic databases were searched from inception until June 2015. The search was limited to humans and non-case report studies, but was not limited by language, year of publication or type of publication. Two independent reviewers selected RCTs, evaluating CHM in RA (herbals and decoctions). Descriptive statistics were used to report on risk of bias and their adherence to reporting standards. Multivariable logistic regression analysis was performed to determine study characteristics associated with high or unclear risk of bias.
Results Out of 2342 unique citations, we selected 119 RCTs including 18 919 patients: 10 108 patients received CHM alone and 6550 received one of 11 treatment combinations. A high risk of bias was observed across all domains: 21% had a high risk for selection bias (11% from sequence generation and 30% from allocation concealment), 85% for performance bias, 89% for detection bias, 4% for attrition bias and 40% for reporting bias. In multivariable analysis, fewer authors were associated with selection bias (allocation concealment), performance bias and attrition bias, and earlier year of publication and funding source not reported or disclosed were associated with selection bias (sequence generation). Studies published in non-English language were associated with reporting bias. Poor adherence to recommended reporting standards (<60% of the studies not providing sufficient information) was observed in 11 of the 23 sections evaluated.
Limitations Study quality and data extraction were performed by one reviewer and cross-checked by a second reviewer. Translation to English was performed by one reviewer in 85% of the included studies.
Conclusions Studies evaluating CHM often fail to meet expected methodological criteria, and high-quality evidence is lacking.
- Chinese herbal medicine
- Quality of randomized controlled trials
- Traditional Chinese medicine
- Systematic review
- Rheumatoid arthritis
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
It is the first study appraising methodological quality and adherence to reporting standards specifically in randomised controlled trials (RCTs) evaluating Chinese herbal medicines in patients with rheumatoid arthritis. No other studies have analysed the characteristics associated with a high risk of bias or poor adherence to reporting standards.
For methodological quality, we used the risk of bias tool, an instrument endorsed by the Cochrane Collaboration to facilitate improved appraisal of evidence. The tool has shown significant correlations with other appraisal tools.
For reporting adequacy, we used the Consolidated Standards of Reporting Trials (CONSORT) statement which is a minimum set of recommendations for reporting RCTs. It was developed by a team of international experts and is widely used worldwide and encouraged by the leading journals for the reporting of trials (over 50% of the core medical journals listed in the Abridged Index Medicus on PubMed—eg, BMJ, BMJ Open, New England Journal of Medicine and Journal of the American Medical Association).
We did not search the Chinese medical databases.
More than 85% of the studies were evaluated by the translator and cross-checked by another reviewer using the translation from the first reviewer.
Rheumatoid arthritis (RA) is one of the leading causes of disability worldwide, affecting 1% of the world population. According to the Global Burden of Disease 2010 study, years lived with disability from RA increased from 2 566 000 in 1990 to 3 776 000 in 2010.1 With ageing populations throughout the world, and declines in mortality, the number of people living with RA will increase substantially over coming decades.
Chinese herbal medicines (CHMs) have been used to treat RA for many years, mostly in Asia. The use of CHMs is generally based on experience and is influenced by a holistic concept of health.2 In traditional Chinese medicine (TCM), RA belongs to the category of ‘bi syndrome’; that is, it is believed to be caused by attacks of wind, cold, dampness or heat, which cause disharmony between bodily systems.3 ,4 Patients are classified as having a particular TCM syndrome according to their symptoms and then treated with CHMs to decrease inflammation by restoring the affected system or to ameliorate the side effects of disease-modifying antirheumatic drugs.5 TCM guidelines have been developed by the Chinese government for the diagnosis and treatment of these syndromes.6–8
There are numerous CHM preparations for the treatment of RA, including decoctions, whole plants, plant extracts and patented formulas,9–11 all of which can be given as a single herb or a mixture of herbs.12 Patented CHMs are often offered as chemical preparations, tablets and capsules with simple and convenient dosing schedules and reduced aftertaste.2 ,12 This has increased the acceptance of CHM in western countries. In 2002, a national survey conducted in the USA reported that almost 20% of adults had used herbal therapies in the past year.13 In the UK, a systematic review of 89 surveys on the use of complementary medicine showed that >50% of respondents with a chronic condition reported using this type of medicine during their lifetime.14 Furthermore, the use rate of herbal therapies in patients with arthritis in US primary care settings has been reported to be as high as 90%.15
After the introduction of evidence-based medicine in China, several randomised controlled trials (RCTs) were conducted to evaluate the efficacy of CHMs for RA.10 ,12 ,16 ,17 However, the methodology used to conduct these trials was inconsistent, and the results were conflicting. Quality of reporting is intrinsically linked to the methodological quality of RCTs. Criteria for standardisation facilitate complete and transparent reporting and help to improve critical appraisal and interpretation of an RCT. Since the development of the Consolidated Standards of Reporting Trials (CONSORT) statement in 1996, most high-impact journals have endorsed its use to improve reporting of RCTs.18 In 2006, the CONSORT extension for reporting herbal medicines was developed, and in 2007, a draft of the extension for reporting TCM was released.19
To date, no systematic review has explored the study characteristics associated with methodological quality in controlled trials (randomised or not) evaluating the efficacy, effectiveness or safety of CHMs in the treatment of RA. The objectives of our study were to appraise the methodological quality of these studies by ascertaining potential risk of bias, to identify publication factors associated with methodological flaws and to determine the quality of reporting according to CONSORT recommendations.
We report our methods and results according to the Preferred Reporting Items for Systematic Review and Meta-Analysis statement.
We searched electronic databases (Medline, EMBASE, Cochrane Library and Web of Science) from inception through June 2015 for studies evaluating the use of CHM, including herbals or decoctions (eg, ‘tang’), in patients with RA (search terms are listed in online supplementary appendix 1). Our search was restricted to human studies and excluded case reports but was not limited by language, year of publication or type of publication. We also searched the reference lists of potentially relevant citations (controlled trials and reviews, although reviews themselves were later excluded from the analysis) to identify additional studies that were not published or otherwise found. EndNote X6 and DistillerSR were used to manage the records retrieved.
Study selection and eligibility criteria
Two reviewers (XP and PN) independently screened the titles and abstracts of all citations obtained by our searches. They resolved any disagreements through discussion and consensus. When no consensus was reached, a third party acted as an adjudicator (MAL-O). We included any RCT evaluating the efficacy, effectiveness or safety of CHMs in adult patients (age ≥18 years) with RA. All types of CHMs were considered: (1) patented medicines (pharmaceutical preparation or formulations) made from herbs (eg, tablets, liquids, granules, plasters, injections and capsules), (2) herbal decoctions (eg, ‘tang’) and (3) plants (whole or extracts). Any type of drug and placebo comparison and any follow-up duration were considered for inclusion. We excluded retrieved studies that were published before the year 2000, because the guidelines currently most used for reporting clinical trials (ie, the CONSORT statement) were published in 1996, and we conservatively considered 4 years to be enough time for these guidelines to be disseminated and implemented.18 We also excluded studies published only as abstracts, studies with a non-RA control group and subanalyses of parent studies.
Data collection and outcome measures
One author (XP) extracted the data, which were then cross-checked by another author (MAL-O). A standardised extraction form was used to collect information about the characteristics of the RCTs and their participants, types of interventions, reported outcomes and sources of funding. Our primary outcome measures were the methodological and reporting quality of the RCTs.
Methodological quality in individual studies
The quality of each selected RCT was evaluated independently by two reviewers using the Cochrane risk of bias tool20 for RCTs published in English. RCTs published in non-English languages were translated and evaluated by one reviewer (XP) and cross-checked by another (MAL-O). In brief, each RCT was evaluated for its potential bias in five domains: selection, performance, detection, attrition and reporting. These domains specifically evaluate how the random sequence was generated, methods of allocation concealment, blinding of participants and personnel, blinding of the outcome assessment, how incomplete outcome data was handled and if there was evidence of selective outcome reporting. Each potential source of bias was graded as low, unclear or high and a justification for each judgement was provided.
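The per-domain grading described above lends itself to a simple tabulation. The following is a minimal sketch in Python, not the procedure the reviewers actually followed: the function name `summarise_risk_of_bias`, the domain keys and the four example trials are all hypothetical and serve only to illustrate how low, unclear and high judgements could be summarised as percentages for one domain.

```python
from collections import Counter

# The three grades used for each potential source of bias.
LEVELS = ("low", "unclear", "high")


def summarise_risk_of_bias(trials, domain):
    """Return the percentage of trials graded low, unclear or high in one domain.

    `trials` is a list of dicts mapping a domain name to a grade.
    This data layout is an assumption made for illustration.
    """
    judgements = [trial[domain] for trial in trials]
    if not judgements:
        raise ValueError("at least one trial is required")
    counts = Counter(judgements)
    unexpected = set(counts) - set(LEVELS)
    if unexpected:
        raise ValueError(f"unexpected grades: {sorted(unexpected)}")
    total = len(judgements)
    return {level: 100.0 * counts[level] / total for level in LEVELS}


# Hypothetical judgements for four trials (not data from the review).
hypothetical_trials = [
    {"performance": "high", "attrition": "unclear"},
    {"performance": "high", "attrition": "low"},
    {"performance": "low", "attrition": "unclear"},
    {"performance": "high", "attrition": "high"},
]

performance_summary = summarise_risk_of_bias(hypothetical_trials, "performance")
attrition_summary = summarise_risk_of_bias(hypothetical_trials, "attrition")
print(performance_summary)
print(attrition_summary)
```

Keeping the three grades separate at this stage leaves open whether unclear and high judgements are later analysed together or apart.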
Quality of reporting
We examined how closely the RCTs adhered to reporting standards using the CONSORT statement for TCM,21 which focuses only on CHM (ie, acupuncture, moxibustion, cupping and massage are not considered). It is a 23-item checklist, and the major recommendations for transparent reporting are (1) title and abstract should reflect the unique aspects of TCM, (2) rationale of formulation selection should be described, (3) diagnostic criteria should be specified for TCM and conventional medicine, (4) detailed information on the treatment and control interventions should be included, (5) the outcome in TCM terms should be included and (6) the ethics approval number and trial registration number should be included.
Summary measures and synthesis of results
Descriptive statistics were used to report RCT and participant characteristics, as well as the methodological quality of the RCTs. Risk of bias assessment was summarised per domain. Bivariate analysis was used to compare RCT characteristics according to the risk of bias judgement. Univariate and multivariable logistic regressions were performed to determine the factors associated with high or unclear risk of bias in the five domains in RCTs. We combined the unclear and high risk of bias categories for the analyses. Evidence suggests that the magnitude of treatment effects may be similar in studies appraised as having high or unclear risk of bias, but not for studies assessed as low risk of bias.22 The variables tested as predictors were year of publication, sample size, number of authors, publication language (English or non-English), reporting or disclosing of funding (yes or no) and setting (academic or non-academic). Variables with a univariate p<0.15 were initially included in a multivariable logistic regression model and reduced using the stepwise selection method. Associations were described using ORs and their associated 95% CIs. We categorised each CONSORT TCM checklist item as reported or not reported.21 ,23 We then summarised our findings in five sections: title/abstract, introduction, methods, results and discussion. Subgroup analyses were performed to compare rates of potential bias (low risk or unclear/high risk, unclear vs high risk) and adherence to reporting guidelines according to (1) publication before or after the Cochrane risk of bias tool was released and (2) publication before or after the CONSORT statement for TCM was released. SAS V.9.3 (SAS Institute, Cary, North Carolina, USA) was used to carry out the computations for all analyses. Apart from the univariate analysis, p<0.05 was considered statistically significant.
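To make the reporting of ORs with 95% CIs concrete, the sketch below computes an unadjusted OR and a Wald-type interval from a 2×2 table in Python. It is an illustration under stated assumptions rather than the authors' analysis, which was run as logistic regression in SAS: the function name `odds_ratio_with_ci` and the cell counts are hypothetical, the outcome is taken to be the combined high or unclear category, and no adjustment for covariates is made.

```python
import math


def odds_ratio_with_ci(exposed_events, exposed_non_events,
                       unexposed_events, unexposed_non_events, z=1.96):
    """Unadjusted odds ratio with a Wald-type confidence interval.

    "Events" are trials judged high or unclear risk of bias in one domain;
    "exposed" trials have the characteristic of interest (for example,
    funding source not reported). z=1.96 gives an approximate 95% interval.
    """
    cells = (exposed_events, exposed_non_events,
             unexposed_events, unexposed_non_events)
    if any(count <= 0 for count in cells):
        raise ValueError("all four cells must be positive for a Wald interval")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the review.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 10, 20, 20)
print(f"OR {estimate:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

An interval that excludes 1 corresponds to an association at roughly the 0.05 level in this unadjusted setting; a multivariable model would be needed to reproduce adjusted estimates.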
Out of 2342 unique citations, 232 full-text articles were retrieved to assess for eligibility. Of those, 119 RCTs were included5 ,24–141 evaluating 18 919 patients. Figure 1 shows the number of studies screened, assessed for eligibility and included in the review, with reasons for exclusion at each stage.
Characteristics of the included studies
Table 1 shows the aggregated characteristics of the included RCTs, and online supplementary table S1 shows the characteristics of the individual studies. Two-thirds of the RCTs were published (2000–2007) before the CONSORT statements for TCM and herbal interventions were released. One RCT was conducted in the USA, one in Korea and the rest in China. Most were single-centre studies and were not indexed in the Web of Knowledge database but were indexed in the China National Knowledge Infrastructure database (CNKI).
Outcomes reported varied across RCTs. More than half of the RCTs assessed efficacy (eg, disease activity score in either 28 joints143 or 44 joints;144 20%, 30%, 50% or 70% improvement, according to American College of Rheumatology criteria,145 in tender or swollen joint counts, physician global assessment or grip strength; or good to moderate improvement according to European League Against Rheumatism criteria),146 laboratory outcomes (eg, erythrocyte sedimentation rate, C reactive protein, rheumatoid factor or anticyclic citrullinated peptides) or adverse events. However, <56% of RCTs reported on patient-reported outcomes (eg, pain, patient global assessment, health assessment questionnaire or morning stiffness), and only 3% reported radiographic outcomes (eg, Sharp score, erosion, joint narrowing, marked radiographic progression or no progression). Thirty-one RCTs (26%) reported improvement of symptoms but did not provide additional details on the type of symptoms or how these symptoms were assessed.
Characteristics of the participants and interventions
Most participants included in the RCTs met the 1987 American College of Rheumatology (ACR) diagnostic criteria for RA.147 One RCT included only patients meeting the 2010 ACR classification criteria,148 and in another, the patients included met either the 1987 or 2010 criteria. In addition, 52.5% of the participants met the criteria for one or more of the traditional TCM ‘pathological factors’ or syndromes (ie, feng (wind), shi (damp), tan (phlegm), re (heat) and qi or yin deficiency). The most common pathological factors reported are listed in table 2. A total of 10 108 patients received a single CHM and 6550 received one of 11 treatment combinations. Of those receiving combination treatment, 5061 patients received combinations that included CHMs (with disease-modifying antirheumatic drugs, non-steroidal anti-inflammatory drugs, steroids or antibiotics). More than half of the CHMs were individualised preparations targeting pain relief and improvement in joint function. In the control groups, 1402 patients received disease-modifying antirheumatic drugs alone (ie, methotrexate, leflunomide, sulfasalazine or etanercept), 644 received non-steroidal anti-inflammatory drugs alone and 165 received an inert placebo. In 35 studies, patients were described as having active disease; two RCTs included patients with refractory disease, one included patients with early RA, three included patients at intermediate stages of RA and one included patients with RA and anaemia. Discontinuation rates were not reported in 68 RCTs; in those that did report them, they ranged from 0% to 55%.
Online supplementary figure S1 summarises the results across RCTs. A high or unclear risk of bias was observed across all domains. When evaluating selection bias, we found that 29% of the RCTs did not report sufficient detail to evaluate methods of random sequence generation and 31% did not report sufficient detail to evaluate allocation concealment (both judged unclear). In addition, 21% were judged to have high risk for selection bias (11% from sequence generation, 30% from allocation concealment). Risk of performance bias (not blinding participants or personnel) was judged to be high in 85% of the RCTs, and detection bias (blinding of assessment of primary outcome) was judged to be high in 89% of the RCTs. More than two-thirds of the RCTs did not provide sufficient detail to judge the risk for attrition bias and lacked data on withdrawal rates, power calculation and how missing data was handled. Of the remaining RCTs, which provided enough detail to evaluate attrition bias, 4% were judged to be of high risk. Risk of reporting bias was high in 40% of the RCTs, and 86% of the RCTs did not report the source of funding or include a disclosure statement.
RCT determinants associated with high risk of bias
Characteristics observed in RCTs according to risk of bias are shown in online supplementary table S2. In the univariate analysis, earlier year of publication, fewer authors, funding source not reported or disclosed, publication in a language other than English, authors from non-academic settings and no power calculation reported were associated with high or unclear risk of bias in various domains (table 3). After adjustment for covariates, the following associations remained in the multivariable analysis: (1) earlier year of publication and funding source not reported or disclosed were associated with high or unclear risk of selection bias (sequence generation); (2) fewer authors were associated with high or unclear risk of selection bias (allocation concealment), performance bias and attrition bias and (3) publication in a language other than English and funding source not reported or disclosed were associated with high or unclear risk of reporting bias. Logistic regression analysis could not be performed for detection bias owing to the small number of RCTs in the low risk-of-bias group (n=2).
Adherence to CONSORT standards for TCM
Rates of adherence to CONSORT standards for TCM are shown in table 4. Most of the RCTs (98%) stated the objective adequately. However, most RCTs (94%) did not include the recommended information (specification of interventions, name of disease and study design) in the title or abstract. In the introduction sections, more than half (61%) did not provide the three names of the compound formulation (Chinese, Latin and English) as recommended by the WHO. The methods sections of most RCTs were poorly reported (table 4). Most RCTs (ranging from 85% to 98%) failed to describe in enough detail the interventions, type of study design, calculation of sample size or methods of randomisation and blinding. In the results sections, many RCTs (78%) did not indicate how participants moved through the study over time or provide a flow diagram as recommended, and 93% of RCTs did not include intention-to-treat analysis. The discussion sections were compromised in 8–83% of the RCTs owing to a lack of general interpretation of the results and conflict of interest information.
When comparing the risk of bias from the 90 RCTs published before the Cochrane risk of bias tool was released in 2008 (year of publication 2000–2008) with that of the 29 RCTs published later, we observed improvement in RCTs published after 2008 in the following domains: selection bias (22% compared with 59%, p=0.0004), attrition bias (18% compared with 52%, p=0.0005) and reporting bias (41% compared with 66%, p=0.02). When comparing adherence to reporting guidelines from the 80 RCTs published before the CONSORT-TCM statement was released in 2007 (publication date 2000–2007) with that of the 39 RCTs published later, we observed improvement in RCTs published after 2007 in most of the items except in reporting sufficient details about the objectives of the study, design, power calculation and methods to avoid bias (data not shown). We also evaluated determinants associated with high risk of bias, compared with unclear risk of bias. In most domains there were no differences. We observed differences in allocation concealment, but the characteristics associated with high risk of bias were the same as those observed with the main comparison high/unclear versus low risk of bias. There were some characteristics not observed in the main comparison associated with detection bias (see online supplementary table S3).
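Comparisons of proportions between two groups of trials, such as those published before and after a guideline was released, can be tested on a 2×2 table. The source does not state which test produced the p values above, so the sketch below should be read as one possible approach rather than the authors' method: it implements a two-sided Fisher exact test in Python using only the standard library. The function name `fisher_exact_two_sided` and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p value for the 2x2 table [[a, b], [c, d]].

    Rows could be "published earlier" and "published later"; columns could be
    "low risk of bias" and "high or unclear risk of bias". The p value sums
    the probabilities of all tables, with the same margins, that are no more
    likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    smallest = max(0, col1 - row2)
    largest = min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    p_value = sum(
        table_probability(x)
        for x in range(smallest, largest + 1)
        if table_probability(x) <= observed * tolerance
    )
    return min(1.0, p_value)


# Hypothetical counts, not taken from the review.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

An exact test of this kind is often preferred over a chi-squared approximation when one group is small, as is the case for the later-published trials here.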
Our results indicate that serious methodology and reporting flaws still exist in clinical trials evaluating the effect of CHM in RA. We found that the potential for selection bias was high; two-thirds of the RCTs in our analysis lacked sufficient detail on how the random sequence was generated. Similarly, most RCTs were not blinded (eg, for participants, personnel or outcome assessment), thus increasing the potential for performance and detection bias. Risk of attrition bias was high or unclear in more than two-thirds of the RCTs. Reporting bias was also judged to be high or unclear in half of the included RCTs owing to a lack of study protocol and/or reporting fewer than the minimum number of outcome measures recommended to be included in RA clinical trials.149 Furthermore, we found that RCTs that were older, had fewer authors, did not report or disclose funding, were published in a language other than English or were written by authors from non-academic settings were likely to have a high or unclear risk of bias in at least one of the evaluated domains.
We also found that adherence to reporting standards remains a concern. Title and abstract, introduction and methods were even more problematic. Owing to these weaknesses, readers are not provided with clear and transparent information on the interventions or methods to assess bias in selecting, blinding and evaluating participants. Other areas of concern were lack of appropriate general interpretation of the results in the context of current evidence and incomplete descriptions of conflicts of interest.
It is important to differentiate between the two main concepts reported in this review: methodological quality and quality of reporting. One of the main components of evidence-based medicine is the use of available literature to improve decision-making. Stronger inferences can be drawn from studies in which measures have been taken before, during and after the intervention to prevent random and systematic bias. Although RCTs are ranked high in the hierarchy of evidence, not all RCTs share the same quality, which can lead to biased results. Quality of reporting is a separate concern. A lack of complete and transparent reporting of the processes and findings of an RCT is commonly linked to methodological flaws. Poor reporting leaves out critical information needed to judge study safeguards against bias.
A few systematic reviews evaluating the efficacy of CHM and reporting on the methodological quality of the studies have shown similar results. However, none of these studies evaluated CHMs exclusively (ie, excluding other TCM) in patients with RA. Nonetheless, a review evaluating the characteristics of 89 studies published between 2000 and 2003 and indexed in the CNKI reported a lack of unified diagnostic and evaluating standards.150 Another review evaluating 20 RCTs published between 2000 and 2010 showed that methodological quality according to the Jadad scale was generally low, with an average quality score of 1.2 out of 5.151 Contrastingly, a Cochrane review evaluating herbal medicines in general (including three CHMs) showed that the quality of the studies has improved since 2000, but the risk of bias across different domains was variable.152 Furthermore, an overview of 31 reviews published between 1999 and 2009 evaluating several TCM approaches (including CHMs) for multiple diseases showed that methodological quality improved over the years, although many issues remained, specifically a high risk of selection (inadequate randomisation methods), attrition (small sample sizes and high withdrawal rates) and reporting (selective reporting of outcomes) biases.153
To the best of our knowledge, the current study is the first appraising methodological quality and adherence to reporting standards specifically in RCTs evaluating CHM in patients with RA. No other studies have analysed the characteristics associated with a high risk of bias or poor adherence to reporting standards. For methodological quality, we used the risk of bias tool, an instrument endorsed by the Cochrane Collaboration to facilitate improved appraisal of evidence. The tool has shown significant correlations with other appraisal tools.154 ,155 For example, the risk of bias tool was shown to accurately identify trials that may have overestimated treatment effects. Studies shown to have a high or unclear risk of bias according to the risk of bias tool have larger effect estimates than studies shown to have a low risk of bias.22
Our study has certain limitations. First, as with any systematic review, it was constrained by the available data. For example, a protocol or trial registration was not found for most of the RCTs included in our review. This is concerning because the validity of the conclusion of an RCT is largely based on adherence to prespecified methods (including outcomes of interest). Trial registration could improve transparency and help identify gaps in knowledge, prevent unnecessary duplication in clinical trials and improve adherence to international quality standards.156 Second, we included studies with substantial variation in characteristics (eg, RA diagnosis, TCM syndromes and CHM descriptions). However, this allowed us to evaluate a larger number of RCTs. Third, we did not search the CNKI, Chinese Medical Current Contents (CMCC) and Wanfang Data databases because they are not widely available outside Asia. Searching them could have increased the number of publications included in this review, but we felt that there was no clear reason to consider that the quality of these studies would have been better.157 Finally, independent quality assessment for non-English language articles could not be performed. Only one reviewer could translate Chinese articles; therefore, more than 85% of the studies were evaluated by the translator and cross-checked by another reviewer using the translation from the first reviewer.
In summary, our results indicate that trials of CHM for the treatment of RA often fail to meet expected methodological criteria, and high-quality evidence is lacking. Because clinical trials are just below systematic reviews in the hierarchy of evidence and are used to endorse recommendations by health organisations, more attention is needed to improve the methodological robustness of these studies. Future clinical trials evaluating CHMs in RA should be designed, conducted and reported according to current specifications and principles.
We are grateful to Yimin Geng from the Research Medical Library of The University of Texas MD Anderson Cancer Center for helping with the terms included in the search strategies for the electronic databases and to Ms. Pratibha Nayak for her contributions during the selection of the studies.
Twitter Follow Maria Suarez-Almazor @msalmazor
Contributors MES-A had full access to all of the data in the study and takes responsibility for the integrity and the accuracy of the data analysis. MES-A and MAL-O conceptualised and designed the study. GP and Geng were responsible for the search strategy. XP, Nayak and MAL-O were responsible for selection of the studies. XP and MAL-O contributed to quality appraisal and data extraction. JS, XP, MAL-O and MES-A analysed and interpreted the data. XP, MAL-O, JS and MES-A drafted the manuscript. XP, MAL-O, JS, GP and MES-A critically revised the manuscript for important intellectual content. MES-A provided administrative, technical or material support and supervised the study.
Funding The statistical analysis in this research (through the Biostatistics Resource Group) was supported in part by a Cancer Center Support Grant from the National Cancer Institute (P30CA016672) to the University of Texas MD Anderson Cancer Center.
Competing interests We have read and understood the BMJ Open policy on declaration of interests and declare the following interests: (1) MES-A was the recipient of a K24 career award from the National Institute for Musculoskeletal and Skin Disorders; (2) XP's work is supported by the Shanghai Municipal Education Commission and the Shanghai Shuguang Hospital, Shanghai University of TCM and (3) MAL-O is the recipient of a career award from the Rheumatology Research Foundation and has received a consulting fee from Complete HEOR Solutions outside the scope of the submitted work. All authors have completed the Unified Competing Interest form at http://www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.