Objectives Pharmacoepidemiological studies are an important hypothesis-testing tool in the evaluation of postmarketing drug safety. Despite the potential to produce robust value-added data, interpretation of findings can be hindered due to well-recognised methodological limitations of these studies. Therefore, assessment of their quality is essential to evaluating their credibility. The objective of this review was to evaluate the suitability and relevance of available tools for the assessment of pharmacoepidemiological safety studies.
Design We created an a priori assessment framework consisting of reporting elements (REs) and quality assessment attributes (QAAs). A comprehensive literature search identified distinct assessment tools and the prespecified elements and attributes were evaluated.
Primary and secondary outcome measures The primary outcome measure was the percentage representation of each domain, RE and QAA for the quality assessment tools.
Results A total of 61 tools were reviewed. Most tools were not designed to evaluate pharmacoepidemiological safety studies. More than 50% of the reviewed tools considered REs under the research aims, analytical approach, outcome definition and ascertainment, study population and exposure definition and ascertainment domains. REs under the discussion and interpretation, results and study team domains were considered in less than 40% of the tools. Except for the data source domain, quality attributes were considered in less than 50% of the tools.
Conclusions Many tools failed to include critical assessment elements relevant to observational pharmacoepidemiological safety studies and did not distinguish between REs and QAAs. Further, there is a lack of consideration of the relative weights of different domains and elements. The development of a quality assessment tool would facilitate consistent, objective and evidence-based assessments of pharmacoepidemiological safety studies.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
This article reviews the suitability and relevance of available tools for the assessment of the quality of pharmacoepidemiological safety studies.
In the context of regulatory safety-related decision making, quality assessment (ie, assessment of the risk of bias) informs the evaluation of available evidence and enhances its appropriate use in assessing the balance between the benefits and risks of drugs.
The development of a consolidated reporting and quality assessment tool would enhance the consistent, transparent and objective evaluation of pharmacoepidemiological safety studies. If a tool is developed, it is important to determine if there is a need for tools tailored for specific study designs or if one tool that consolidates these considerations might be helpful.
Key findings from our review of quality assessment tools include:
Many available quality assessment tools do not include critical assessment elements that are specifically relevant to pharmacoepidemiological safety studies.
Most tools do not distinguish between reporting elements (REs) and quality assessment attributes (QAAs).
There is a lack of reported considerations on the relative weights to assign to different domains and elements with respect to assessing the quality of these studies.
Strengths and limitations of this study
A priori creation of a pharmacoepidemiological safety study assessment framework.
Comprehensive review of the literature.
Importance for safety-related regulatory decision making.
Potential to leverage other comparable efforts in the comparative effectiveness research arena.
The purpose and scope of the reviewed tools varied greatly.
Each tool was reviewed by one reviewer.
Several sources of evidence on drug safety issues inform Food and Drug Administration (FDA) postmarketing safety-related regulatory decisions, including spontaneous case reports, registries, observational pharmacoepidemiological studies, randomised controlled trials (RCTs), meta-analyses and other sources. Despite the well-known strengths of RCTs in the assessment of drug efficacy, specific issues related to the design, methodology and transparency of experimental studies may limit their ability to fully characterise the safety profile of drugs after marketing approval.1–4 Pharmacoepidemiological studies, typically observational in nature, represent an important hypothesis-testing mechanism in the evaluation of drug safety issues suspected at the time of approval and for new signals emerging in the postmarket period. In contrast to RCTs, such studies, which typically employ broader inclusion criteria and leverage claims or electronic medical record data, might better reflect the real-life experience of patients. Furthermore, pharmacoepidemiological studies afford the ability to investigate rare drug-related adverse effects, examine risks in patient subpopulations, and assess long-term adverse events. Recent health-related legislation will increase the availability and adoption of electronic healthcare data for such studies.5 ,6
Despite the potential of pharmacoepidemiological safety studies to produce robust value-added data, the interpretation of findings from such studies can sometimes be challenging because of their well-recognised methodological limitations, including various sources of bias and confounding.7 These limitations also apply to the increasing number of comparative effectiveness epidemiological studies.8–10 The Institute of Medicine's recently published report highlights the importance of evaluating the quality of evidence and the significant scientific disagreements that have ensued over the quality of studies.11 Therefore, assessment of the quality of individual studies is essential to evaluating their credibility. Transparency in reporting on the design, conduct, analysis and results of these studies is a prerequisite for the assessment of the quality of the evidence; it is first necessary to understand the relevant aspects of the study design, conduct and analysis, along with the underlying assumptions and rationale behind the key scientific decisions undertaken by the study team to adequately evaluate the credibility of the study.12
The FDA recently published a draft guidance on the design, conduct and reporting of pharmacoepidemiological safety studies using electronic healthcare data that is designed both to enhance the transparency of reporting of such studies and to encourage critical scientific thinking regarding their design and conduct.13 In the future, this guidance may improve the credibility of submitted pharmacoepidemiological studies by shedding light on the pertinent aspects of studies needed to inform the evaluation of the internal and external validity of their findings. However, even if the bar for transparency and reporting of these studies is raised, it will still be necessary to evaluate the contribution of these studies to the available evidence on an emerging drug safety issue. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) evidence assessment framework, used by clinical guideline developers, appropriately separates the initial processes of quality assessment and the weighing of evidence in the formulation of guideline recommendations.14 ,15 In the context of regulatory decision making concerning safety issues, the use of quality assessment tools to assess the risk of bias may add a measure of objectivity to the scientific judgment of the available evidence and improve the quality of decision making. The benefits may not only extend to improving decision making by regulators but also by journal editors and researchers as well as potentially improving the quality of performed studies by stimulating consideration of key aspects of these studies during the development of the study approach and protocol.
Although many checklists and scales for the assessment of epidemiological studies exist,16–18 most are not specifically designed to evaluate pharmacoepidemiological safety studies. Importantly, although the principles of epidemiology apply across different fields, there are unique challenges in the design, conduct and evaluation of epidemiological studies of unintended drug harms that warrant consideration of developing a specific validated assessment tool (eg, confounding by indication is an important challenge that is unique to epidemiological studies of drug effects). Recent articles have suggested the need to develop tools for assessing the quality of these studies.19–22 A recent publication found that systematic reviewers and meta-analysts are misusing reporting tools like Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) due to the dearth of validated assessment instruments.23
The main objective of this study is to critically evaluate the suitability and relevance of available tools for the assessment of pharmacoepidemiological safety studies. The ultimate goal is to stimulate discussion in the scientific community about the need for specific tools to facilitate the transparent, objective and consistent evaluation of study quality to inform safety-related regulatory decision making.
For purposes of this paper, quality assessment tools are defined as qualitative checklists and/or quantitative scales designed to facilitate assessment of the quality of epidemiological studies.
A priori quality assessment framework
To examine the utility of individual quality assessment instruments for the evaluation of pharmacoepidemiological safety studies, we created an a priori assessment framework consisting of domains that include reporting elements (REs) and quality assessment attributes (QAAs) (table 1). Based on the expert opinion of FDA epidemiologists, concepts drawn from the FDA draft guidance on such studies, and key findings from seminal reviews, tools and documents,13 ,24–28 we established the domains pertaining to the design, conduct and analysis of pharmacoepidemiological safety studies. Within each domain we listed critical elements that need to be considered for assessing the validity and interpretation of findings from such studies. We made a distinction between the REs and QAAs for each domain. This is an important distinction as some guidelines are strictly developed to discern and evaluate reporting whereas other tools are developed to evaluate quality, which requires assessment of reporting. The selected elements and attributes presented in this table are not intended to represent an all-inclusive list of factors, but rather to represent critical aspects impacting the internal and external validity of pharmacoepidemiological safety studies. Of note, although the QAAs necessarily involve some subjectivity, their inclusion in an assessment tool would facilitate the consistent and objective consideration and evaluation of key quality attributes across individual studies. As the GRADE framework developers have emphasised, although quality assessment is fundamentally subjective,29 developing a transparent, consistent approach to assessment of quality is important, especially in the regulatory and clinical arena as patients, healthcare professionals and sponsors benefit from consistent and transparent assessment of available evidence for use in decision making.
A comprehensive literature search of quality assessment checklists and scales was performed in MEDLINE, EMBASE and Web of Science. Search terms included: ‘assessment’, ‘tools’, ‘quality’, ‘medical research’, ‘evidence based research’, ‘evidence based medicine’, ‘meta-analysis’, ‘randomised controlled trials’, ‘biological product’, ‘drug’, ‘pharmaceutical preparation’, ‘biological therapy’, ‘bias’ and ‘epidemiology’. A total of 54 references were retrieved from this search. Two independent reviewers identified potentially relevant abstracts (n=26) from the initial literature review (inter-rater reliability >0.85). Inclusion criteria included quality assessment tools or relevant reviews developed to evaluate RCTs, observational studies or meta-analyses. Exclusion criteria consisted of clinical assessment tools, general articles or guidance on quality assessment and instruments or reviews focused strictly on reporting and not addressing quality assessment. After reviewing each paper, 10 relevant tools and 3 seminal reviews were identified; 13 were excluded based on the exclusion criteria above.
The most recent, relevant review articles and some individual tools for assessing the quality of epidemiological studies were identified.25–27 The 2007 Sanderson review,25 the most recent, comprehensive review of instruments for assessing quality of epidemiological studies, served as the starting point.
We also performed Google Scholar searches to identify relevant tools that might not be captured in the aforementioned search strategies. The searches, limited to the first 50 hits, used the following terms: ‘tool quality bias’; ‘quality assessment’; ‘pharmacoepidemiology’; ‘quality assessment epidemiology’; ‘tool quality assessment study’; and ‘scale quality assessment observational studies’. Furthermore, we identified and included a European Medicines Agency (EMA) methodological checklist24 because, although it is a reporting checklist, it includes domains and considerations designed to inform safety evaluations made at a drug regulatory agency.
The assessment tools identified were reviewed by investigators based on a priori assessment criteria shown in table 1. The percentage of tools assessing the prespecified elements and attributes within domains (percent representation) was tabulated. During our review, we documented which tools employed some method of validation.
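The percent representation described above can be sketched as a simple tally. The following is a minimal illustration only: the tool names, the item labels and the data are hypothetical and are not drawn from the review, and the function name `percent_representation` is our own.

```python
# Hypothetical toy data: each tool maps to the set of framework items it addresses.
# Tool names and item labels are illustrative only, not taken from the review.
tools = {
    "tool_a": {"exposure:RE", "outcome:RE", "outcome:QAA"},
    "tool_b": {"exposure:RE"},
    "tool_c": {"outcome:RE", "exposure:QAA"},
    "tool_d": set(),
}


def percent_representation(tools, item):
    """Return the percentage of tools that address the given framework item."""
    n_addressing = sum(1 for items in tools.values() if item in items)
    return 100 * n_addressing / len(tools)


# Two of the four toy tools address the exposure reporting element.
exposure_re = percent_representation(tools, "exposure:RE")
# One of the four toy tools addresses the outcome quality attribute.
outcome_qaa = percent_representation(tools, "outcome:QAA")
```

In this sketch a tool either addresses an item or does not; any finer grading of partial coverage would need an explicit rule.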
Overall, out of 104 tools identified, a total of 96 distinct assessment tools, including 82 from the Sanderson review, 6 from the initial literature review, 7 from the Google search and 1 regulatory checklist (ENCePP checklist), were considered for review (figure 1). Out of these, 61 were selected for the in-depth review.14 ,24 ,30–87 Tools exclusively focused on RCTs, tools focused on clinical assessments, and tools that did not include an explicit assessment framework were excluded (n=35).
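As a consistency check, the selection counts reported above can be reproduced with simple arithmetic. This is a sketch using only the numbers stated in the text; the variable names are our own, and the interpretation of the 8 tools that were identified but not distinct (for example, as duplicates) is an assumption, since the text does not say.

```python
# Counts as reported in the text; the dictionary keys are illustrative labels.
sources = {
    "sanderson_review": 82,
    "initial_literature_review": 6,
    "google_search": 7,
    "regulatory_checklist": 1,
}

identified = 104                       # tools identified overall
distinct = sum(sources.values())       # distinct tools considered for review
excluded = 35                          # RCT-only, clinical, or no explicit framework
reviewed = distinct - excluded         # tools selected for the in-depth review

# Identified but not distinct; assumed (not stated in the text) to be duplicates.
not_distinct = identified - distinct
```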
Representation of a priori assessment domains and elements within tools
The proportion of reviewed tools that included REs and QAAs according to each a priori defined domain within the framework is shown in figure 1. Table 1 depicts the detailed results of our review of the domains, elements and attributes. We highlighted the representation of select RE and QAA under each domain that may have important implications for the assessment of a pharmacoepidemiological safety study. RE and QAA related to research aims were addressed in 69% (42/61) and 34% (21/61) of the tools, respectively. Regarding the domain assessing study population and data sources, 84% (51/61) of the tools included RE and 57% (35/61) included QAA (table 1).
In all, 61% (37/61) of the tools included RE and 31% (19/61) included QAA under the exposure definition and ascertainment domain (table 1). With respect to outcome definition and ascertainment domain, 69% (42/61) of the tools included RE and 36% (22/61) included QAA (table 1). Out of the 61 reviewed tools, 85% (52/61) and 49% (30/61) included RE and QAA under the analytic approach domain (table 1). Under the results domain, only 36% (22/61) and 7% (4/61) of tools included RE and QAA, respectively (table 1).
Of the 61 reviewed tools, 36% (22/61) and 20% (12/61) of tools included RE and QAA under the discussion and interpretation domain (table 1). Overall, 7% (4/61) of the tools addressed the description of the study team (RE) and the independence of team and funding sources (QAA).
More than half of the reviewed instruments considered REs for domains including research aims, analytical approach, outcome definition and ascertainment, study population and exposure definition and ascertainment. Domains related to the discussion and interpretation, results and study team were considered in less than 40% of the reviewed tools. With the exception of the study population/data sources domain, QAA research domains were considered in less than half of the assessment tools, with less than 10% considering results and study team domains. Many did not address all pertinent domains.
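The domain-level percentages reported in the results can be recomputed from the stated counts. The sketch below uses only the counts given in the text for the seven domains where RE and QAA counts are reported separately; the study team domain is left out because a single combined count (4/61) is given for it. The helper name `pct` and the rounding to whole percentages are our assumptions.

```python
N_TOOLS = 61  # tools included in the in-depth review

# (RE count, QAA count) per domain, as reported in the text.
counts = {
    "research aims": (42, 21),
    "study population and data sources": (51, 35),
    "exposure definition and ascertainment": (37, 19),
    "outcome definition and ascertainment": (42, 22),
    "analytic approach": (52, 30),
    "results": (22, 4),
    "discussion and interpretation": (22, 12),
}


def pct(count, total=N_TOOLS):
    """Percentage of tools, rounded to the nearest whole number."""
    return round(100 * count / total)


percentages = {
    domain: (pct(re_count), pct(qaa_count))
    for domain, (re_count, qaa_count) in counts.items()
}

for domain, (re_pct, qaa_pct) in percentages.items():
    print(f"{domain}: RE {re_pct}%, QAA {qaa_pct}%")
```

Under these assumptions the recomputed values agree with the percentages quoted for each domain.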
Most reviewed checklists and scales were not specifically designed to assess epidemiological studies of drug-related harms. Although the EMA framework was designed to increase transparency of pharmacoepidemiological studies, it focuses on reporting versus quality assessment. Our review constitutes the only recent comprehensive review of available assessment tools to determine if any are appropriate and sufficient for the evaluation of pharmacoepidemiological studies. Only a small number of the reviewed instruments employed some method of validation.30–36 Most of the tools did not differentiate between REs and QAAs whereas others stratified by these aspects. A small number included distinct assessment criteria for different epidemiological study designs (eg, case−control, cohort). Tools focused more on RE than QAA. Figure 1 displays the percentage of checklists and scales that included criteria on the assessment domains and elements.
Figure 1 The proportion of reviewed tools that included REs and QAAs according to each a priori defined domain within the framework.
Based on our review, there is no specific tool that is adequately designed for the robust evaluation of pharmacoepidemiological studies of drug safety. No single tool considered all the selected domains and elements and most tools failed to address critical evaluation elements. Making a distinction between RE and QAA is important as even if an element of the study is mentioned in the final report, one must then determine if this was appropriate for the specific study in the context of the drug safety question. Only a few instruments specifically made this distinction as we did in our a priori framework. Additionally, important RE and QAA were lacking in most of the checklists and scales we reviewed which highlights the need for a tool focused on the evaluation of epidemiological studies designed to evaluate drug-related harms; this need has been previously identified by others.88
Quality attributes related to exposure definition and ascertainment were considered in less than half of the assessment tools, with less than 15% including RE and QAA pertaining to the comparator group, despite the fact that the selection of a comparator is critical for drug safety and effectiveness trials and epidemiological studies, as the choice of suboptimal comparators can provide misleading results.89 ,90 Only 30% of the instruments included quality assessment elements pertaining to the validity and appropriateness of the operational aspects of exposure ascertainment, and only 36% addressed quality attributes of validation of outcome ascertainment approaches. These are important facets of pharmacoepidemiological safety studies, as misclassification of exposure or outcome may lead to false-negative findings regarding the association between a drug and an adverse event.
Only about 40% of the checklists and scales included QAA on approaches to handle confounding and biases. As observational studies are not randomised, the approaches to handle confounding and bias are of paramount importance.7 ,91 ,92 This is an important limitation of most tools because there are often uncertainties regarding results from pharmacoepidemiological studies due to the limitations of electronic healthcare data and complex nature of the practice of medicine.92 Only a small percentage of tools (28% RE; 18% QAA) included elements on the consideration of study findings in the context of the design, conduct, limitations and statistical power despite the fact that these elements are essential in assessing implications of study findings.
Some of the tools we reviewed were designed as ‘all-purpose’ assessment instruments for evaluation of clinical trials and observational studies, while others focused on a particular study design (eg, case−control, cohort). It may be useful to create one consolidated, validated tool for evaluating observational pharmacoepidemiological safety studies focused on general reporting and quality attributes; tools for the specific study designs, that is, case−control and cohort studies, may also be useful due to some of the unique aspects of these designs. With such a tool, regulatory agencies, clinical guideline developers and clinicians could consistently evaluate studies for decision making. The creation of this instrument could be led by an independent group of expert methodologists, perhaps with input from multiple stakeholders, including regulators and professional organisations.
Although we did not address weighing of importance of different domains and elements based on their relative impact on study contribution to the available streams of evidence, this may be an important consideration in the formulation of an assessment tool. Also, it is not clear if numerical scores are helpful in assessing the quality of epidemiological studies, as when numerical scores were used to evaluate systematic reviews or meta-analyses of such studies, they did not produce valid results.93 The appropriate tradeoff between the utility of a checklist or scale for review and the comprehensiveness of the evaluation elements has yet to be determined. This is complicated by the lack of validation of most of the available tools. Before these issues can be addressed, it is first necessary to engage in a broader discussion of the utility of such assessment tools in the evaluation of pharmacoepidemiological safety studies. It is worth noting that critical assessment elements of pharmacoepidemiological studies focused on effectiveness may be different than those focused on safety; however, pharmacoepidemiological comparative effectiveness studies focus on both comparative safety and benefits associated with drugs and thus such elements are not mutually exclusive.94 Thus, it is important to consider the potential to leverage current efforts to create a validated assessment tool (GRACE checklist95) for observational comparative effectiveness pharmacoepidemiological studies.
Our review has some limitations. The purpose and scope of the checklists and scales we reviewed varied greatly. Although we conducted a comprehensive review, there may be tools that we were unable to access or that were published after our search. If a tool addressed some aspects of a reporting or quality assessment element, we counted this as full representation, even if not all the important sub-elements were included. Each checklist or scale was reviewed by one study team member; repeating the evaluations via a second reviewer was deemed unnecessary at this stage as the primary goal of the review was to obtain a broad understanding of the utility of available assessment tools in evaluating pharmacoepidemiological safety studies based on a preliminary assessment framework. Some factors that may be increasingly relevant in future studies, such as electronic health records with linkages to other data sources like outpatient claims, health information exchanges or personal health records, were not included in our framework but may be included in a future validated instrument. Guidelines and checklists published after the time period of our review have included some elements that may be important for future linked studies, which may leverage the increasing availability of these data sources.96
In the evaluation of many emerging safety issues, pharmacoepidemiological safety studies are discussed and may influence safety-related decision making. However, often the quality-driven contribution of such studies is not discussed in a consistent way. The development of an assessment tool based on expert input may facilitate consistent, evidence-based quality assessment of such studies and the subsequent determination of their value based on evaluating the impact of bias on the robustness of a study's results, and the interpretation of its findings, within the context of the specific drug safety issue. The framework we developed may serve as a foundation for future development of such an instrument. Efforts to improve the evaluation of the contribution of pharmacoepidemiological safety studies would be consistent with the FDA's focus on strengthening regulatory safety science.97 If, after further consideration and discussions with stakeholders, development of a tool to evaluate epidemiological data for drug safety is pursued, it would be necessary to first determine the scope of the assessment tool as well as steps for its comprehensive validation. Further, relevant aspects of the design and analysis of pharmacoepidemiology studies should be considered (we refer the reader to some helpful references).98–100 Importantly, such a tool would be intended to complement, and not replace, the clinical, methodological and statistical expertise necessary to complete a robust evaluation and determination of the contribution of a specific pharmacoepidemiological safety study to the available evidence for regulatory decision making.
The authors would like to thank Gerald Dal Pan and Judy Staffa for their critical evaluation of the paper and thoughtful suggestions.
Contributors AGAN, TAH, SPP and DSI were each involved in substantial contributions to the conception and design, acquisition of data, or analysis and interpretation of the data; drafting the article or revising it critically for important intellectual content; and final approval of the version to be published.
Disclaimer The views expressed in this article are those of the authors and do not necessarily represent those of the Food and Drug Administration.
Competing interests None.
Ethics approval As this study involved a review of existing assessment tools, a formal ethics review was not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.