Objective Phase IV trials are often used to investigate drug safety after approval. However, little is known about the characteristics of contemporary phase IV clinical trials and whether these studies are of sufficient quality to advance medical knowledge in pharmacovigilance. We aimed to determine the fundamental characteristics of phase IV clinical trials that evaluated drug safety using the ClinicalTrials.gov registry data.
Methods A data set of 19 359 phase IV clinical studies registered in ClinicalTrials.gov was downloaded. The characteristics of the phase IV trials focusing on safety only were compared with those evaluating both safety and efficacy. We also compared the characteristics of the phase IV trials in three major therapeutic areas (cardiovascular diseases, mental health and oncology). Multivariable logistic regression was used to evaluate factors associated with the use of blinding and randomisation.
Results A total of 4722 phase IV trials were identified, including 330 focusing on drug safety alone and 4392 evaluating both safety and efficacy. Most of the phase IV trials evaluating drug safety (75.9%) had enrolment <300 with 96.5% <3000. Among these trials, 8.2% were terminated or withdrawn. Factors associated with the use of blinding and randomisation included the intervention model, clinical specialty and lead sponsor.
Conclusions Phase IV trials evaluating drug safety in the ClinicalTrials.gov registry were dominated by small trials that might not have sufficient power to detect less common adverse events. An adequate sample size should be emphasised for phase IV trials with safety surveillance as main task.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
We provided a comprehensive descriptive assessment of the current portfolio of phase IV clinical trials evaluating drug safety in the ClinicalTrials.gov registry.
We employed logistic regression models to determine the factors associated with the use of blinding and randomisation in phase IV clinical trials which evaluated drug safety.
We followed a strict analysis process that was widely used in analysing the data from ClinicalTrials.gov to arrive at convincing results.
Some clinical trials were not registered in ClinicalTrials.gov.
There were some unavoidable missing data for certain data fields which might introduce some bias into the results.
Adverse drug reactions are a major global health concern, accounting for more than 2 million injuries, hospitalisations and deaths each year in the USA alone,1 and are associated with billions of US dollars in costs every year in developed countries.2 Although rigorous premarketing studies are required for all new drugs,3,4 the safety profile of a drug at the time of regulatory approval is often incomplete because of characteristics of phase I–III trials such as limited sample sizes, short duration and strict inclusion/exclusion criteria.5 Approximately 20% of drugs acquired new black box warnings postmarketing, and 4% of drugs were ultimately withdrawn for safety reasons.6 In 2007, the Food and Drug Administration was authorised by the Food and Drug Administration Amendments Act (FDAAA)7 to require postmarketing clinical trials to address safety concerns regarding a given drug. Compared with premarketing phase I–III trials, phase IV studies evaluate drug safety in a real-world setting, which may provide evidence to ensure or further refine the safety of approved drugs.5,8,9 However, little is known about the characteristics of contemporary phase IV clinical trials and whether these studies are of sufficient quality to advance medical knowledge in pharmacovigilance.
ClinicalTrials.gov is a public trial registry established by the National Library of Medicine on behalf of the National Institutes of Health (NIH) and was first launched in February 2000.10 Since 2005, the International Committee of Medical Journal Editors has implemented a policy requiring the registration of clinical trials as a prerequisite for publication.11 In addition, as of 2007, sponsors or their designees are obliged by FDAAA to register trials and report key data elements and basic trial results at ClinicalTrials.gov.12 Hence, the ClinicalTrials.gov registry is considered to be the most comprehensive source for clinical trial information worldwide.13–15 Harnessing this expansive resource will enable us to gain a deeper understanding of postmarketing drug safety surveillance.
The objective of our study is to examine the characteristics of registered phase IV clinical trials regarding drug safety and identify areas which require greater attention. We focus on data elements that are desirable for generating reliable evidence from clinical trials, including sample size and factors associated with the use of randomisation and blinding.
Our analysis was restricted to phase IV clinical trials registered with ClinicalTrials.gov between 2004 and 2014. A data set of 19 359 phase IV clinical studies registered with ClinicalTrials.gov was downloaded and locked from the website on 18 March 2015. A database was designed to facilitate analysis.15 ,16
Two authors (XZ and YZ) selected the eligible studies and summarised their results independently. Figure 1 shows the complete process of selection. Our analysis was restricted to phase IV clinical trials registered between 1 January 2004 and 31 December 2014 (n=18 642) according to the first date submitted to ClinicalTrials.gov. Interventional studies using drugs were identified by searching the sections of ‘study type’ and ‘intervention’ on ClinicalTrials.gov. Observational studies (n=981), expanded-access studies (n=10) and other studies that investigated ‘medical devices’, ‘vaccines’ or other products were removed (n=5878). On ClinicalTrials.gov, the ‘End point Classification’ section indicated the primary end point of the study, such as bio-equivalence, pharmacokinetics, safety and efficacy, and others. Additionally, based on the information in the ‘Primary Purpose’ section, studies could be divided into different groups: ‘Treatment’, ‘Prevention’, ‘Diagnostic’, ‘Supportive Care’, ‘Screening’, ‘Health Services Research’, ‘Basic Science’, ‘Educational/Counseling/Training’ and missing. We further identified studies whose purposes were ‘Treatment’ and primary end points were ‘Safety study’ or ‘Safety/efficacy study’ using ‘Primary Purpose’ and ‘End point Classification’ sections. Finally, 4722 eligible phase IV trials assessing drug safety alone or both safety and efficacy were included in our analysis.
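The selection steps above can be read as a single filter applied to each registry record. The following is a minimal sketch in Python, not the authors' actual database code: the dictionary keys (`phase`, `first_received`, `study_type`, `intervention_type`, `primary_purpose`, `endpoint`) and the label strings are hypothetical stand-ins for the ClinicalTrials.gov fields named in the text.

```python
from datetime import date

# Hypothetical labels for the two end point classifications that were retained.
SAFETY_ENDPOINTS = {"Safety Study", "Safety/Efficacy Study"}

def is_eligible(record):
    """Return True if a record passes every selection step described in the text."""
    return (
        record["phase"] == "Phase 4"
        # Registered (first submitted) between 1 January 2004 and 31 December 2014.
        and date(2004, 1, 1) <= record["first_received"] <= date(2014, 12, 31)
        # Drops observational and expanded-access studies.
        and record["study_type"] == "Interventional"
        # Drops devices, vaccines and other non-drug products.
        and record["intervention_type"] == "Drug"
        and record["primary_purpose"] == "Treatment"
        and record["endpoint"] in SAFETY_ENDPOINTS
    )

example = {
    "phase": "Phase 4",
    "first_received": date(2010, 6, 1),
    "study_type": "Interventional",
    "intervention_type": "Drug",
    "primary_purpose": "Treatment",
    "endpoint": "Safety Study",
}
```

Applying such a filter to the downloaded records would, under these assumptions, reproduce the flow summarised in figure 1.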
The included trials were then categorised by clinical specialty (mental health, oncology and cardiovascular diseases) using the information in the ‘Conditions’ section and the classification of studies provided by ClinicalTrials.gov, matched by the NCT number of each study.
Trial data were reported by the trial sponsors or investigators, as required by the ClinicalTrials.gov registry.17 Each record contained a set of data elements describing the study's conditions, enrolment, study design, eligibility criteria, location, sponsor and other protocol information.
The methods of defining derived variables have been described previously15,18 and are briefly summarised below. All trials were divided into six groups by funding source according to the information in the ‘Sponsor_Collaborators’ and ‘Funded_By’ sections: NIH, industry, US federal (excluding NIH), university/college, hospital and other sources. The funding source was defined as the NIH if the lead sponsor or any collaborators were from the NIH, and the lead sponsor was not from industry. It was defined as industry if the lead sponsor was from industry or if any collaborators were from industry and none from the NIH. It was defined as US federal if the sponsors were from US federal sources only and none of the collaborators were from industry or the NIH. The funding source was defined as ‘hospital’ if the lead sponsor was from a hospital or similar institution and no collaborators were from industry, the NIH or a US federal institution. It was defined as ‘university/college’ if the lead sponsor was from a university, college or similar institution and no collaborators were from industry, the NIH, a US federal institution or a hospital. For the remaining studies, the funding source was defined as other sources. The start dates of trials were obtained from the ‘Start_Date’ section. Information on the appointment of a data monitoring committee (DMC) has been available only since April 2007 and is not a required field.18 Thus, the DMC information was not considered in our study. The classifications of other variables were based on the information in the corresponding fields from ClinicalTrials.gov.
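The funding-source rules form an ordered decision list, which can be sketched as follows. This is an illustrative Python reading of the rules, not the authors' implementation: the function name, the category labels (`"NIH"`, `"Industry"`, `"US Federal"`, `"Hospital"`, `"University"`) and the interpretation of "sponsors from US federal sources only" as a US federal lead sponsor are all assumptions.

```python
def funding_source(lead, collaborators):
    """Classify funding source from the lead sponsor and collaborator categories.

    Rules are checked in the order given in the text; the category labels
    are hypothetical.
    """
    collabs = set(collaborators)
    everyone = collabs | {lead}

    # NIH: NIH is lead or collaborator, and the lead sponsor is not industry.
    if "NIH" in everyone and lead != "Industry":
        return "NIH"
    # Industry: industry lead, or an industry collaborator with no NIH involvement.
    if lead == "Industry" or ("Industry" in collabs and "NIH" not in everyone):
        return "Industry"
    # From here on, neither the NIH nor industry is involved at all.
    if lead == "US Federal":  # assumption: "US federal only" refers to the lead sponsor
        return "US federal"
    if lead == "Hospital" and "US Federal" not in collabs:
        return "Hospital"
    if lead == "University" and not collabs & {"US Federal", "Hospital"}:
        return "University/college"
    return "Other"
```

For example, `funding_source("University", ["NIH"])` yields `"NIH"`, whereas `funding_source("Industry", ["NIH"])` yields `"Industry"`, because an industry lead sponsor takes precedence over NIH collaboration.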
When a data field was incomplete, the ClinicalTrials.gov website was searched to find the missing information for the trial. If the information was not available on the website either, the field was recorded as NA (not applicable) or missing. For studies reporting an interventional model of single group and the number of groups as 1, we inferred the value of allocation as non-randomised and the value of blinding as open if the information was missing.15 In addition, the allocation or blinding was reported as ‘Uncertain’ if single-arm trials were registered as randomised or blinded.
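The handling of allocation and blinding for single-arm trials can be sketched as a small function. This is a hedged illustration only: the function name and the label strings (`"Single Group"`, `"Randomized"`, `"Non-Randomized"`, `"Open Label"`) are hypothetical, and `None` is used here to represent a missing field.

```python
def resolve_design(intervention_model, n_arms, allocation=None, blinding=None):
    """Apply the single-arm inference rules; None represents a missing field."""
    single_arm = intervention_model == "Single Group" and n_arms == 1
    if single_arm:
        if allocation is None:
            allocation = "Non-Randomized"   # inferred for single-arm trials
        elif allocation == "Randomized":
            allocation = "Uncertain"        # single-arm trial registered as randomised
        if blinding is None:
            blinding = "Open Label"         # inferred for single-arm trials
        elif blinding != "Open Label":
            blinding = "Uncertain"          # single-arm trial registered as blinded
    # Multi-arm trials are returned unchanged; missing fields stay missing.
    return allocation, blinding
```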
The characteristics of the trials were assessed overall, by two end point classifications (safety only and safety/efficacy) and by three clinical specialties (mental health, oncology and cardiovascular diseases). The assessments included the study status, enrolment, intervention model, funding source and so on. The registration timeline of a trial was determined by comparing the date first received by ClinicalTrials.gov with the start date of the trial.
According to the binomial and Poisson distributions, if an adverse event (AE) has a probability of occurrence of 1%, 0.5% or 0.1%, the enrolment should be at least about 300, 600 or 3000, respectively (table 1), for the investigators to have a 95% chance of observing at least one case.19 Hence, we divided the included trials into five categories: sample size <300, 300–599, 600–2999, 3000 or above, and missing. Frequencies and percentages were provided for categorical characteristics; medians and IQRs were provided for continuous characteristics.
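The enrolment thresholds follow from requiring that the probability of seeing at least one event, 1 − (1 − p)^n under the binomial model, reaches 95%. The sketch below, with hypothetical function names, computes the smallest such n and assigns a trial to the enrolment categories used in this study.

```python
import math

def min_enrolment(p, confidence=0.95):
    """Smallest n such that P(at least one AE among n participants) >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

def enrolment_category(n):
    """Map an enrolment count (or None if missing) to the five study categories."""
    if n is None:
        return "missing"
    if n < 300:
        return "<300"
    if n < 600:
        return "300-599"
    if n < 3000:
        return "600-2999"
    return ">=3000"

for p in (0.01, 0.005, 0.001):
    print(p, min_enrolment(p))  # 299, 598 and 2995, respectively
```

The exact values (299, 598 and 2995) round to the 300, 600 and 3000 thresholds, in line with the familiar approximation n ≈ 3/p.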
Logistic regression analysis was used to evaluate factors associated with the use of randomisation and blinding. A full model containing eight characteristics was developed and adjusted ORs with Wald 95% CIs were calculated for these factors. The factors assessed included funding source, primary purpose, number of participants, trial specialty (yes/no), trial start year before or after the publication of FDAAA in 2007 and end point classification (safety/efficacy study or safety study). Single-arm trials or studies with any of the data elements missing were excluded from the regression analysis.
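For readers less familiar with the reported quantities, an adjusted OR and its Wald 95% CI are obtained by exponentiating a fitted logistic regression coefficient and its Wald limits. The sketch below is a generic illustration with a hypothetical function name and made-up inputs; it does not reproduce the SAS analysis.

```python
import math

def wald_or_ci(beta, se, z=1.959964):
    """Return (OR, lower, upper) for a logistic coefficient `beta` with standard error `se`.

    z is the standard normal quantile for a two-sided 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Illustrative inputs only (not taken from the study): beta = ln 2, se = 0.1.
odds_ratio, lower, upper = wald_or_ci(math.log(2), 0.1)
```

Because the interval is symmetric on the log scale, the OR is the geometric mean of its two limits, which offers a quick consistency check on any reported Wald interval.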
SAS V.9.2 (SAS Institute) was used for all statistical analyses.
From 1 January 2004 to 31 December 2014, 18 642 phase IV trials were registered at ClinicalTrials.gov. Of these trials, 4722 phase IV trials related to drug safety were included in our study. Figure 1 shows the search process. The number of trials evaluating safety alone was 330, far fewer than the number of trials evaluating both safety and efficacy (n=4392). A total of 594 trials (12.6%) focused on mental health, 251 trials (5.3%) on oncology and 601 trials (12.7%) on cardiovascular diseases.
The basic characteristics of all 4722 included trials registered with ClinicalTrials.gov are shown in table 2. The median number of participants per trial was 104.0 (IQR: 48.0–258.0). About 72.7% of these phase IV trials used randomisation and 44.4% used blinding (including double-blind and single-blind). We also noted that 8.3% (n=391) of these phase IV trials were ‘terminated’ or ‘withdrawn’, meaning that they were stopped before completion. Most of these terminated or withdrawn studies were small (median enrolment: 35.5; IQR: 11.0–104.3). The most common research sites in these phase IV trials were in North America, Asia and the Pacific, and Europe, which accounted for 34.4%, 28.2% and 26.5%, respectively.
Of the phase IV trials evaluating drug safety alone, 68.5% had an enrolment of <300 patients, and only 3.9% (n=13) enrolled more than 3000. The median number of participants per trial was 104.0 (IQR: 45.0–392.0). The typical sample size of the phase IV trials assessing both safety and efficacy was similar, with a median enrolment of 103.0 (IQR: 48.0–251.5). Compared with studies evaluating both safety and efficacy, phase IV trials focused on drug safety only had a larger proportion of studies using single group assignment (41.8% vs 25.9%) and a smaller proportion using randomisation (56.7% vs 74.0%). However, the difference in the proportion of studies using blinding was relatively small between trials focusing on safety only and those assessing safety/efficacy (34.0% vs 42.8%).
Table 3 shows the characteristics of the phase IV trials in three major therapeutic areas (cardiovascular, oncology and mental health). Cardiovascular trials were the largest of these three categories (n=601, 12.7%). Cardiovascular trials also had higher enrolment (median: 163.0; IQR: 70.0–400.0) than oncology trials (median: 100.0; IQR: 48.0–200.0) and mental health trials (median: 88.0; IQR: 40.0–226.0). Randomisation was less common in oncology trials than in cardiovascular and mental health trials (43.0% vs 81.4% for cardiovascular and 67.5% for mental health). The pattern for blinding was similar (17.5% for oncology trials vs 46.2% for cardiovascular trials and 57.2% for mental health trials). Women-only trials were most common in oncology, at 13.5% compared with 1.3% for cardiovascular trials and 2.3% for mental health trials. It was noteworthy that nearly two-thirds of mental health trials (65.0%) excluded elderly patients. Geographical differences were also apparent. Mental health trials had the largest proportion of studies with at least one North American research site (52.9%), whereas oncology trials had the largest proportion of studies with at least one Asia and Pacific research site (42.2%). The NIH sponsored more mental health trials (8.9% vs 1.0% for cardiovascular trials and 0.4% for oncology trials).
Table 4 shows the results of the regression analyses, which examined the trial characteristics related to the use of blinding and randomisation. A total of 1276 single-arm trials and 78 studies with any of the data elements missing were excluded from the regression analysis. Hence, there were 3361 trials which were considered in the regression model. Of these trials, 1950 (58.02%) were blinded and 3234 (96.22%) were randomised. Clinical specialty was associated with the use of blinding and randomisation. Oncology trials were less likely to use blinding (adjusted OR: 0.33; 95% CI 0.18 to 0.63) and randomisation (adjusted OR: 0.42; 95% CI 0.28 to 0.63). Mental health trials were more likely to implement blinding (adjusted OR: 3.35; 95% CI 2.56 to 4.38). Compared with the trials in which industry was the lead sponsor, the trials funded by universities or similar institutions were more likely to use blinding (adjusted OR: 1.32; 95% CI 1.08 to 1.60).
This study provided a descriptive assessment of the current portfolio of phase IV clinical trials evaluating drug safety. The characteristics of phase IV trials with different end point classifications and clinical specialties were compared. We also analysed the factors associated with trial quality. Thus, this study presented a unique opportunity to evaluate the landscape of phase IV trials related to drug safety and to identify areas of relative strength or weakness.
Small sample size was the greatest concern in phase IV trials involving the safety surveillance of an approved drug. Small phase IV trials might be used to evaluate the effectiveness of a given drug in a special patient subgroup, or in special situations.5 However, our study included only phase IV trials with ‘safety’ as an end point and most of these trials (77.6%) had an enrolment of <300. In the phase IV trials with safety as the primary end point, the median sample size was only 104. Thus, these small trials might not have sufficient power to detect AEs, especially less common AEs.19 Paying greater attention to the quality of phase IV trials may facilitate postmarketing drug safety surveillance. For trials with safety assessment as their primary purpose, the sample size should be estimated according to the probability of occurrence expected for each AE. For example, to observe an AE with an occurrence probability of 0.15%, the China Food and Drug Administration requires that the enrolment of phase IV trials focusing on drug safety should be more than 2000.20 For phase IV trials evaluating both efficacy and safety, the sample size should be calculated separately for the efficacy and safety end points, and the study size should be determined by the larger of the two.
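To make the power concern concrete, the chance that a trial of a given size observes at least one case of an AE can be computed directly from the binomial model. The sketch below uses a hypothetical function name and assumes independent participants with a common event probability.

```python
def prob_at_least_one(n, p):
    """Probability of observing at least one AE among n participants, each with risk p."""
    return 1 - (1 - p) ** n

# A trial at the median enrolment of 104:
print(round(prob_at_least_one(104, 0.01), 3))   # AE risk of 1%
print(round(prob_at_least_one(104, 0.001), 3))  # AE risk of 0.1%
```

Under these assumptions, a trial of 104 participants has roughly a 65% chance of observing an AE that occurs in 1% of patients, but only about a 10% chance for an AE that occurs in 0.1% of patients.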
Phase IV clinical trials can have various designs, and single-arm, non-randomised or open-label studies are accepted. Where randomisation and blinding are feasible in studies with control arms, they can reduce bias and make evidence more reliable. Among the phase IV clinical trials with controls, trials sponsored by a university or college were more likely to use blinding than those sponsored by industry. Methodological differences were also evident among therapeutic areas. Oncology trials were less likely to use randomisation and blinding, which was consistent with the results of previous studies.15 One possible reason is that some oncology trials are conducted to investigate individualised or personalised treatment, for which randomisation or blinding is not feasible. Owing to the limited information on ClinicalTrials.gov, it is difficult to check whether all the phase IV trials with controls are appropriately designed. However, researchers should adopt randomisation and blinding when they are feasible.
Compared with prior analyses assessing the overall quality of the clinical trials landscape,15 our results showed some interesting findings. First, the Asia and Pacific area played a more important role in phase IV trials. Of the phase IV trials, 30.5% included sites in the Asia and Pacific area, a marked increase over prior analyses of all clinical trials (13.5%).15 Including diverse populations could provide more information and help clinicians to ensure or refine the safety of approved drugs. Second, the percentage of terminated or withdrawn phase IV trials was relatively high (8.6%). Califf et al15 reported that 3.3% of all interventional clinical trials registered from October 2007 through September 2010 were terminated or withdrawn. We further analysed the conditions, end points and locations of the terminated or withdrawn phase IV trials but did not find any special characteristics other than small size (median: 38.0; IQR: 12.0–116.5). Third, the largest proportion of phase IV trials was funded by industry. Industry could use phase IV trials to expand the label of an approved drug or look for a completely new indication, which might be a potential explanation for the numerous small phase IV trials. However, the identification and characterisation of the risks associated with the prescription and use of medications are also essential and should be based on appropriate designs and sufficiently large sample sizes.
There are some inevitable limitations in this study. First, some clinical trials were not registered in the ClinicalTrials.gov registry, and these studies were not included in our analysis. However, ClinicalTrials.gov still accounts for more than 80% of all clinical studies in the WHO portal,15 so our analysis is broadly representative. Second, there were some missing data for certain data fields, which may introduce some bias into the results. Third, as described in the ‘Methods’ section, we used the end point classification field from the ClinicalTrials.gov registry to identify phase IV trials related to drug safety; however, we did not perform additional manual screening to specify the primary end point for trials evaluating both safety and efficacy.
We found that the phase IV trial enterprise related to drug safety in ClinicalTrials.gov was dominated by small trials with significant heterogeneity in quality. These findings raise questions about the capacity of phase IV trials to supply sufficient high-quality evidence for safe medication. Adequate sample size should be emphasised for phase IV trials with safety as the primary end point.
The authors gratefully acknowledge the valuable advice on revision from the reviewers. The authors also thank Jian Lu, PhD, for his assistance in designing the study. The authors acknowledge Meijing Wu and American Journal Experts, LLC for their professional copyediting service.
XZ and YZ contributed equally.
Contributors XZ and YZ contributed equally in conceiving this project, facilitating protocol, analysing data and drafting this manuscript. XY led the development of performance-based incentives and revised the manuscript critically. TZ and XG gave their time and effort to modify the programmes. JH provided expertise for the overall design of the study, and revised and approved the manuscript.
Funding This study was sponsored by the National Natural Science Foundation of China (number 81502895, 81373105), a grant from the key discipline for construction of evidence-based public health in Shanghai (number 12GWZX0602) and the Fourth Round of Three-year Action Plan on Public Health Discipline and Talent Programme: Evidence-based Public Health and Health Economics (number 15GWZK0901).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The analysed data set was uploaded to the Datadryad.org website. The title of the data set used in this revision is “phase IV clinical studies received by ClinicalTrials.gov between 2004 and 2014”. URL: http://datadryad.org/review?doi=doi:10.5061/dryad.3t6sc.