Quality indicators for responsible use of medicines: a systematic review
  1. Kenji Fujita,
  2. Rebekah J Moles,
  3. Timothy F Chen
  1. School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
  1. Correspondence to Kenji Fujita; kfuj2522{at}


Objective All healthcare systems require valid ways to evaluate service delivery. The objective of this study was to identify existing content validated quality indicators (QIs) for responsible use of medicines (RUM) and classify them using multiple frameworks to identify gaps in current quality measurements.

Design Systematic review without meta-analysis.

Setting All care settings.

Search strategy CINAHL, Embase, Global Health, International Pharmaceutical Abstracts, MEDLINE, PubMed and Web of Science databases were searched up to April 2018. An internet search was also conducted. Articles were included if they described medication-related QIs developed using consensus methods. Government agency websites listing QIs for RUM were also included.

Analysis Several multidimensional frameworks were selected to assess the scope of QI coverage. These included Donabedian’s framework (structure, process and outcome), the Anatomical Therapeutic Chemical (ATC) classification system and a validated classification for causes of drug-related problems (c-DRPs; drug selection, drug form, dose selection, treatment duration, drug use process, logistics, monitoring, adverse drug reactions and others).

Results 2431 content validated QIs were identified from 131 articles and 5 websites. Using Donabedian’s framework, the majority of QIs were process indicators. Based on the ATC code, the largest number of QIs pertained to medicines for the nervous system (ATC code: N), followed by anti-infectives for systemic use (J) and medicines for the cardiovascular system (C). The most common c-DRPs pertained to ‘drug selection’, followed by ‘monitoring’ and ‘drug use process’.

Conclusions This study was the first systematic review classifying QIs for RUM using multiple frameworks. The list of the identified QIs can be used as a database for evaluating the achievement of RUM. Although many QIs were identified, this approach allowed for the identification of gaps in quality measurement of RUM. In order to more effectively evaluate the extent to which RUM has been achieved, further development of QIs may be required.

  • quality in health care
  • quality of care
  • quality indicators
  • performance measures
  • quality assurance
  • quality measurement

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • A comprehensive literature search was undertaken across seven databases and government agency websites without restriction of disease categories and care settings.

  • The classification of quality indicators (QIs) was based on multiple frameworks (eg, Donabedian’s framework, the Anatomical Therapeutic Chemical classification system and a validated classification for causes of drug-related problems) for maximum understanding and profiling of the included QIs.

  • Only content validated QIs that were developed using consensus methods were included, and therefore valid QIs might have been excluded during the screening process.

  • Although 5% of the review processes were verified by multiple authors to check for accuracy, most of the classification was undertaken by one author.


Responsible use of medicines (RUM) is an essential element in achieving quality of care for patients and the community. According to the WHO, RUM implies that the activities, capabilities and existing resources of health system stakeholders are aligned to ensure patients receive the right medicines at the right time, use them appropriately and benefit from them.1 RUM, however, is not easily achievable, and if medicines are used inappropriately, negative consequences for patients, society or both may occur. It is reported that worldwide more than 50% of all medicines are prescribed, dispensed or sold inappropriately, while 50% of patients fail to take them correctly.2 In addition, it has been reported that one-third of preventable drug-related admissions are associated with medication non-adherence, 31% are related to prescribing problems and 22% are related to monitoring problems.3 The frequency of these medication errors varies depending on the specific medicine. For example, previous systematic reviews have found that preventable drug-related admissions to hospital accounted for 3.7% of all admissions, and that four groups of drugs (antiplatelets, diuretics, non-steroidal anti-inflammatory drugs and anticoagulants) accounted for more than 50% of the drug groups associated with those preventable drug-related hospitalisations.3 From the economic perspective, the global cost associated with medication errors has been estimated at US$42 billion annually, or almost 1% of total global health expenditure.4 Given the health concerns and the economic burden associated with medication errors, the achievement of RUM underpinned by an evidence-based approach has become increasingly important worldwide.

One critical element for any healthcare system or organisation is how to measure and evaluate RUM. A widely used method to do this is the use of quality indicators (QIs).5 6 QIs are explicitly defined and measurable items referring to the structures, processes or outcomes of care, and are usually described with a denominator and a numerator.7 The denominator is the total number of cases in the intended population, and the numerator is the number of cases that fulfil a predetermined criterion; the calculated QI score indicates the quality of care.8 QIs can be used to monitor the quality of care provided by healthcare professionals in a single institution, to promote quality improvement activities, to make comparisons over time between institutions or to support consumers to choose healthcare providers.5 For QIs to be useful, they must be developed with scientific rigour, and all quality dimensions of care must be measured to capture a comprehensive landscape of healthcare quality.5
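As a minimal sketch of the numerator/denominator definition above, the following Python snippet computes a QI score as a simple proportion. The class name, the indicator label and the counts are hypothetical and are used only for illustration; they are not taken from any of the reviewed QI sets.

```python
from dataclasses import dataclass


@dataclass
class QualityIndicator:
    """Hypothetical container for one QI; not a structure used in the review."""
    name: str
    numerator: int    # cases that fulfil the predetermined criterion
    denominator: int  # all cases in the intended population

    def score(self) -> float:
        """Return the QI score as a proportion between 0 and 1."""
        if self.denominator <= 0:
            raise ValueError("denominator must be positive")
        if not 0 <= self.numerator <= self.denominator:
            raise ValueError("numerator must lie between 0 and the denominator")
        return self.numerator / self.denominator


# Illustrative counts only (assumed, not reported in the review).
qi = QualityIndicator("example medication indicator", numerator=45, denominator=50)
print(f"{qi.name}: {qi.score():.0%}")
```

Whether a high score is desirable depends on how the indicator is worded, so a score computed this way still needs to be read against the indicator definition.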

To achieve RUM using QIs, it is first necessary to identify existing QIs for RUM, independent of disease categories and care settings. Additionally, in light of the concept of RUM, multifaceted assessment is required to gain a full understanding of the breadth of coverage by QIs. To our knowledge, however, previously conducted systematic reviews have been restricted by setting (eg, hospital),9 disease state (eg, HIV/AIDS),10 healthcare group (eg, nursing sensitive QIs) or indicator name (eg, clinical indicators),11 and have only classified QIs based on Donabedian’s framework or implicit frameworks such as quality dimensions defined by the Institute of Medicine.12 Hence, the main purpose of this systematic review was to identify existing content validated QIs for RUM independent of disease category and care settings, and then classify them using multiple frameworks in order to identify gaps in current quality measurements.


Data sources

This systematic review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (see online supplementary table S1).13 Two approaches were used to identify relevant QIs.

First, CINAHL, Embase, Global Health, International Pharmaceutical Abstracts, MEDLINE, PubMed and Web of Science databases were searched to identify relevant articles published up to 5 April 2018. No restriction on year of study was applied. Search strategies comprised keywords and, when available, controlled vocabulary such as Medical Subject Headings/EMTREE based on three main terms: ‘quality indicators’, ‘development’ and ‘consensus’. Since ‘quality indicators’ are referred to by a wide variety of terms, such as clinical indicators or performance measures, the finalised search strategies were developed using an iterative process during which citations identified by various search terms were screened for relevance. We chose ‘consensus’ as a main term because QIs are recommended to be developed using expert panels based on rigorous evidence in order to ensure high face validity and content validity.14 Exact search dates for each database with the search strategies are included in online supplementary table S2.

Second, an internet search was conducted using Google (search terms: quality indicators, clinical indicators, performance indicator or performance measures) to capture additional QIs listed on the websites of relevant organisations responsible for quality improvement. Websites of potentially relevant organisations, found in the process of the literature review,9 12 15–17 were also searched (see online supplementary table S3).

Study selection

Inclusion criteria

Articles were included if they fulfilled the following criteria: (A) the article was peer reviewed and published in English, (B) numerators and denominators were defined for the QIs, or they could be directly deduced from the descriptions of the QIs, (C) the publication contained at least one medication-related QI, (D) the development of QIs was one of the objectives and (E) QIs were developed using consensus methods in order to confirm content validity. Furthermore, relevant organisations’ QIs found from websites were included if the organisation was a government agency for ensuring quality in healthcare, and at least one QI for RUM was reported with a clear description, as detailed above (B).

Given the concept of QIs and RUM mentioned above, we regarded a measurement tool as a QI for RUM when the definition of the QI referred to a medication. In addition, if publications concerned the same project/QIs set, the descriptions of the QIs in the most recent publication were used for data extraction.

Exclusion criteria

Articles were excluded if the consensus results for QI development were unclear, if QI lists were obtainable only by purchase or if QIs were for monitoring the effectiveness of national policies.

This study selection process was performed using a purpose-designed screening proforma (see online supplementary table S4). The retrieved articles were transferred into EndNote to remove duplicates, then initial screening of journal names, titles and abstracts was conducted to remove irrelevant articles.

Data extraction

One researcher (KF) extracted the following data from the full text of included articles or websites: publication year, country or other targeted location in which QIs were intended to be used, name of measurement tools, total number of QIs, the number of relevant QIs for RUM, scope of the QIs and definition of QIs (numerator and denominator, if available). A data extraction proforma was designed, pilot-tested on five included studies, then refined accordingly.


Descriptive statistics were computed for the results of the present review based on counts and proportions where relevant. Since the components of RUM are multidimensional, multiple frameworks were used to understand the breadth of coverage by QIs. That is, we used four types of classification: (1) problem type; (2) Donabedian’s framework; (3) the Anatomical Therapeutic Chemical (ATC) classification system; and (4) causes of drug-related problems (c-DRPs) classification system.

Problem type

The first step of a structured QI development process is to identify the problem for which measurement is needed.18 Classifying QIs according to problem type can highlight prioritised problems for QI development. Therefore, QI sets described in each source were classified into the following six problem types proposed by Evans et al 18:

  1. Disease based: problems relevant to diseases, illnesses, conditions, injuries or procedures for which the quality of care needs to be measured.

  2. Patient based: problems related to patient groups, such as vulnerable elders and paediatric patients.

  3. Treatment modality based: problems relevant to service providing areas, such as intensive care units or palliative care settings.

  4. Organisation based: problems relevant to organisational issues, such as whether organisations have effective structures in place at an organisational level to support quality and safety.

  5. Generic problems: problems relevant to issues that are multidisciplinary in nature and relevant to any form of healthcare delivery in multiple physical settings, such as falls prevention, or pain management.

  6. Profession based: problems unique to the different healthcare professions, including availability and competence of healthcare personnel.

If a QI set related to more than one problem type, it was classified under each relevant type (eg, an article about QIs for nursing practice in the operating room fell into both treatment modality-based and profession-based problems).

Donabedian’s framework

QIs were classified according to the widely used Donabedian’s framework of structure (referring to the factors that designate the conditions under which care is provided, such as material or human resources), process (referring to the actions of healthcare professionals, such as prescribing or monitoring) or outcome (referring to the changes in individuals that can be attributed to care provided), irrespective of the category defined in the original source.19 Online supplementary table S5 lists examples of QIs classified into these three categories.

The ATC classification system

QIs were first classified into medicine class specific indicators or general medication indicators, depending on whether the definition of the QI described a specific class of medicines. For example, a QI ‘numerator: patients with acute myocardial infarction (AMI) received aspirin within 3 hours of hospital arrival/denominator: AMI patients without aspirin contraindications’20 was classified as a medicine class specific indicator, while a QI ‘numerator: number of patients aged 65 years and older whose current medications are documented and reconciled at admission/denominator: number of patients aged 65 years and older in sample’21 was classified as a general medication indicator. After this process, medicine class specific indicators were classified using the first and second levels of the ATC code.22 A single QI was sometimes allocated into more than one ATC code. For example, a QI, ‘percentage of patients using opioids with concomitant laxatives’,23 represented A06 (drugs for constipation) and N02 (analgesics).
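The two-level ATC mapping described above can be sketched in Python as follows. The first-level ATC group is the leading letter of the code and the second level adds two digits; the dictionary layout and function name are hypothetical, and counting each first-level group once per QI is an assumption, since the counting rule for QIs with several codes in the same group is not stated in the text.

```python
from collections import Counter

# Second-level ATC codes assigned to each QI (example taken from the text above).
qi_atc_codes = {
    "percentage of patients using opioids with concomitant laxatives": ["A06", "N02"],
}


def first_level(atc_code: str) -> str:
    """Return the first-level ATC group (anatomical main group letter)."""
    return atc_code[0].upper()


# Assumption: a QI counts once per distinct first-level group it touches.
first_level_counts = Counter(
    group
    for codes in qi_atc_codes.values()
    for group in {first_level(code) for code in codes}
)

print(dict(first_level_counts))
```

Under this assumption a single QI can contribute to several first-level groups, which is why the number of first-level classifications can exceed the number of QIs.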

c-DRPs classification system

Since minimising the factors that contribute to drug-related problems (ie, causes of DRPs) is closely linked to achieving RUM, the extracted QIs were classified using a comprehensive taxonomy of the causes of DRPs.24 This taxonomy divides c-DRPs into the following nine categories.

  1. Drug selection, for example, whether appropriate drugs are selected by healthcare professionals.

  2. Drug form, for example, whether appropriate drug forms are selected by healthcare professionals.

  3. Dose selection, for example, whether appropriate drug dosages are selected by healthcare professionals.

  4. Treatment duration, for example, whether drugs are being prescribed or dispensed for an appropriate duration by healthcare professionals.

  5. Drug use process, for example, whether drugs are taken properly by patients.

  6. Logistics, for example, whether necessary drugs are properly delivered to the patients.

  7. Monitoring, for example, monitoring for the effect/adverse effects of drugs.

  8. Adverse drug reactions, for example, the occurrence of adverse drug reactions.

  9. Other.

Note that a single QI was sometimes allocated into more than one c-DRP category. Online supplementary table S6 illustrates how QIs were classified using the c-DRP taxonomy.

All processes were conducted independently by one author (KF), and 5% of these processes were verified by TFC and RJM. Any issues that arose during the process were resolved by discussion within the research team (KF, RJM and TFC). Meta-analysis was not applicable due to heterogeneity in interventions, methods and reported outcomes. We considered it unnecessary to assess further quality attributes of the included content validated QIs, such as their feasibility and reliability, because problems affecting QIs (eg, feasibility of data collection, reliability of calculating QI scores and opportunities for gaming) vary depending on the healthcare infrastructure and healthcare remuneration system in each country.

Patient and public involvement

As this was a literature review, there was no patient and public involvement in this study.


Study selection

Initially, a total of 39 430 articles were obtained. The sample included 17 822 duplicate records, which were removed. After the initial screening, 973 full texts were assessed for eligibility with 842 excluded based on the inclusion and exclusion criteria. Eventually 131 articles met all inclusion criteria and were included in our review. Additionally, through the internet search, five relevant websites were identified and included in our review (figure 1).
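The selection flow above can be checked with simple arithmetic. The sketch below uses only the counts reported in this paragraph; the two intermediate values (records screened and records excluded at initial screening) are derived here and are not stated in the text, and the variable names are illustrative.

```python
records_identified = 39_430
duplicates_removed = 17_822
full_texts_assessed = 973
full_texts_excluded = 842

# Derived values (not reported explicitly in the text).
records_screened = records_identified - duplicates_removed
excluded_at_initial_screening = records_screened - full_texts_assessed

# Reported value: articles meeting all inclusion criteria.
articles_included = full_texts_assessed - full_texts_excluded

print(records_screened, excluded_at_initial_screening, articles_included)
```

The final figure agrees with the 131 included articles reported above.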

Figure 1

Study flow diagram. QI, quality indicator; RUM, responsible use of medicines.

Study characteristics

Of the 131 articles, 78 articles (60%) developed QIs for use in three countries: USA (n=36),25–60 Canada (n=26)61–86 and Netherlands (n=16).23 87–101 The remaining 53 articles developed QIs for use in 16 other countries20 21 102–145 and 4 other targeted locations (such as the Organisation for Economic Co-operation and Development (OECD) countries)146–152 (figure 2). Of the five relevant websites, three belonged to Australian organisations,153–155 one to a UK organisation156 and one to a US organisation.157 The Australian and UK organisations developed QIs at the organisation level, while the American website, National Quality Measures Clearinghouse, sponsored by the Agency for Healthcare Research and Quality, stored QIs developed by various countries. Of 7750 QIs listed in the 131 articles and 5 websites, we identified 2431 QIs for RUM: 1947 QIs from journal articles and 484 QIs from the web.

Figure 2

The number of publications by country and other target location.

While there were 21 different ways of labelling the measurement tools, ‘quality indicators’ (n=80, 59%) was the most commonly used term in our included articles and websites, followed by ‘quality measures’ (n=11, 8%), ‘quality of care indicators’ (n=8, 6%) and ‘indicators’ (n=7, 5%).

In terms of the problem type, 43% of QI sets pertained to disease-based problems (n=89, eg, knee osteoarthritis), followed by 27% for treatment modality-based problems (n=55, eg, primary care), 21% for patient-based problems (n=44, eg, geriatric care), 5% for profession-based problems (n=11, eg, community pharmacists), 3% for generic problems (n=6, eg, long-term prescribing) and 1% for organisation-based problems (n=2, eg, centralised intake systems). The majority of QIs (n=2289, 94%) were process indicators, while structure (n=80) and outcome (n=62) indicators accounted for only 3% each (table 1).
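The Donabedian shares reported above follow directly from the counts. A minimal Python sketch, using the counts from this paragraph and illustrative variable names, reproduces the rounded percentages:

```python
# Counts of QIs by Donabedian category, as reported in the text.
donabedian_counts = {"structure": 80, "process": 2289, "outcome": 62}

total_qis = sum(donabedian_counts.values())

# Percentage share of each category, rounded to the nearest whole percent.
shares = {
    category: round(100 * count / total_qis)
    for category, count in donabedian_counts.items()
}

print(total_qis, shares)
```

The three counts sum to the 2431 QIs identified, and structure and outcome indicators each round to 3%.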

Table 1

Characteristics of studies and quality indicator sets

Of 2431 QIs, 247 QIs (10%) were general medication indicators, and 2184 QIs (90%) were medicine class specific indicators. Some of the 2184 QIs represented more than one ATC code, resulting in 2613 first-level ATC classifications. Of these, the largest number of QIs covered medicines for the nervous system (N, n=407, 16%), followed by anti-infectives for systemic use (J, n=397, 15%), the cardiovascular system (C, n=364, 14%) and blood and blood forming organs (B, n=345, 13%) (figure 3). Dermatological medicines (D) were covered by the smallest number of QIs (n=19, 0.7%) aside from antiparasitic products, insecticides and repellents (P, n=7, 0.3%).

Figure 3

The number of QIs by first-level ATC code. ATC, Anatomical Therapeutic Chemical; QIs, quality indicators.

The distribution of the QIs across the second level of the ATC code and the c-DRPs classification system is presented in table 2. General medication indicators were classified using the c-DRPs categories only. Because some QIs represented more than one ATC code and/or c-DRPs category, the total number of QIs contained within the cells of the matrix was 3666. Within each c-DRPs category, the largest number of QIs for ‘drug selection’ pertained to antibacterials for systemic use (J01, 176 of 2117, 8%), followed by antithrombotic agents (B01, 172 of 2117, 8%). Antithrombotic agents (B01) also contributed the largest number of QIs for ‘dose selection’ (20 of 142, 14%), the ‘drug use process’ (52 of 439, 12%) and ‘monitoring’ (52 of 574, 9%). Likewise, the largest number of QIs for ‘treatment duration’ (13 of 85, 15%) pertained to psychoanaleptics (N06).
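To illustrate why the matrix total can exceed the number of QIs, the sketch below tallies a small set of hypothetical QIs into an ATC-by-c-DRP matrix. The three example records are invented for illustration, and counting a QI once in every (ATC code, c-DRP category) pair it belongs to is an assumption; the exact counting rule behind table 2 is not spelled out in the text.

```python
from collections import Counter
from itertools import product

# Hypothetical QIs, each with one or more ATC codes and c-DRP categories.
example_qis = [
    {"atc": ["B01"], "cdrp": ["drug selection"]},
    {"atc": ["A06", "N02"], "cdrp": ["drug selection"]},
    {"atc": ["B01"], "cdrp": ["dose selection", "monitoring"]},
]

# Assumption: one count per (ATC code, c-DRP category) pair for each QI.
matrix = Counter()
for qi in example_qis:
    for cell in product(qi["atc"], qi["cdrp"]):
        matrix[cell] += 1

matrix_total = sum(matrix.values())
print(len(example_qis), matrix_total)
```

Here three QIs yield five matrix entries, mirroring how multi-label classification inflates cell totals relative to the number of distinct QIs.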

Table 2

Distribution of QIs for RUM by the ATC code (rows) and the c-DRPs category (columns)*

With regard to the c-DRPs classification system, the most common c-DRPs pertained to ‘drug selection’ (n=2117, 58%), followed by ‘monitoring’ (n=574, 16%) and the ‘drug use process’ (n=439, 12%). The remaining six c-DRPs categories accounted for only 14% of the QIs. Interestingly, only QIs for analgesics (N02) covered all nine c-DRPs categories. Among general medication indicators, the largest number of QIs fell into the ‘logistics’ category (n=73, 29%); these mainly focused on medication reconciliation problems during transitions of care, such as hospital admission and discharge.

A complete list of 2431 QIs is available in online supplementary table S7.


RUM is important for almost every healthcare setting in every country across the globe. Knowledge of whether medicines are being used in an optimal manner therefore presents a significant international challenge. In this systematic review, we identified 2431 QIs evaluating RUM and classified them using multiple frameworks. The large number of QIs reflects the multidimensional components of RUM and the different perspectives of the multidisciplinary stakeholders involved in RUM. The QI list presented in this review can be used as a comprehensive database and reference for existing content validated QIs pertaining to RUM. All stakeholders involved in quality assurance for RUM, for example, healthcare professionals, researchers and decision makers, can select QIs from the multicategorised QI list for their own purpose. Since healthcare systems and medication guidelines may vary between countries, when using the QIs in a local setting it is important for users to critically review them for acceptability, feasibility of acquiring necessary data, reliability, sensitivity to change, workload and validity.8 14

The vast majority of the QIs for RUM identified were intended to be used in only a few high-income countries. Low-income and middle-income countries, however, are estimated to have similar rates of medication-related adverse events, and the impact has been reported to be about twice as much in terms of the number of years of healthy life lost.4 Since the feasibility of data collection for calculating QI scores in low-income settings remains a concern,151 further efforts to improve data collection methods might be needed. We found that even though all measurement tools (ie, QIs) relevant to RUM share the goal of quality improvement, the terminology used to describe QIs varied significantly. About 20 name variations were found, which reflects the absence of a universally accepted definition for such tools. For example, Campbell et al 8 distinguished QIs from performance indicators, arguing that QIs infer a judgement about the quality of care provided, while performance indicators are statistical devices for monitoring care provided to populations without any necessary inference about quality. However, we found that these terms, ‘quality’ and ‘performance’, were used interchangeably. Hence, further research to standardise the definitions that distinguish these measurement tools is warranted.

We also found a significant gap in terms of the problem type (eg, ‘disease-based problems’ (43%), ‘treatment modality-based problems’ (27%) and ‘profession-based problems’ (5%)). Since RUM is facilitated by collaboration in multidisciplinary teams, all healthcare professionals involved in medication treatment should take responsibility for quality assurance, regardless of diseases, care settings and professions. When using Donabedian’s framework, about 94% of the identified QIs related to processes of care. This could be because processes of care are easier to measure, and because process indicators can provide interpretable feedback about care provided.158 In contrast, there was a paucity of outcome indicators. This may be because multiple factors influence health outcomes, many of which are outside the control of individual healthcare professionals. In addition, the difficulty of obtaining sufficient information for assessing outcomes, which requires the linkage of multiple data sources, could be another reason for the limited number of outcome indicators. For outcome indicators to become more useful, multiple confounders, such as patient demographic characteristics and severity of illness, may need to be considered.159 Similarly, there was a low proportion of structural indicators. This may be because they are not sufficiently sensitive for monitoring ongoing performance and they have traditionally been used to monitor standards of healthcare facilities, not RUM.160 It is noteworthy that there is no set requirement for equal proportions of structural, process and outcome indicators in quality measurement. Instead, it is important to recognise the interconnectedness of these measures. For example, high structure indicator scores increase the likelihood of good process indicator scores, which in turn may lead to higher outcome indicator scores.161 Further research is needed to investigate the associations between the identified QIs in each framework within healthcare settings.

We found large differences in the degree to which c-DRPs categories were covered by the identified QIs. Not surprisingly, ‘drug selection’ accounted for more than half of the QIs, as choosing an inappropriate drug is the main cause of DRPs.3 162 Since focusing on limited c-DRPs categories may divert attention and resources away from other factors contributing to DRPs,163 164 users of QIs should be aware of which c-DRPs categories are not being measured. As with Donabedian’s framework, we do not expect that QIs should be evenly distributed across each of the c-DRPs categories or ATC groups. We do, however, expect that there will be more QIs in areas of greatest need. These clinical areas may include common areas of practice suspected to be associated with inappropriate use of medicines and significant economic burden (eg, overuse of antibiotics for upper respiratory tract infection and overuse of opioid analgesics). Use of QIs in these areas may fill the evidence–practice gaps and minimise subsequent DRPs.165 166

QIs for antithrombotic agents (B01) accounted for the largest or second-largest proportion of QIs targeting ‘drug selection’, ‘dose selection’, ‘drug use process’ and ‘monitoring’ in the c-DRPs categories. This may be explained by the fact that the majority of preventable drug-related admissions have been attributed to antiplatelets and anticoagulants, which have narrow therapeutic indices and a high risk of overdose or toxicity,3 and also by the fact that medication adherence to long-term antithrombotic therapy remains challenging.167 Likewise, QIs for psychoanaleptics (N06) accounted for the largest share of QIs targeting ‘treatment duration’. Since medication adherence is an ongoing challenge for consumers being treated for depression with antidepressant therapy, it seems appropriate that a relatively large number of QIs have been developed in these categories. In contrast, there were few QIs for some ATC groups, such as dermatological medicines. This has previously been reported in the literature for QIs as a whole, when comparing the scope of dermatology QIs to other medical specialty areas (eg, internal medicine, paediatrics or cardiology).168 This may be because dermatological medicines, especially topical agents, are relatively less harmful and less expensive. Since irrational use of topical dermatological medication can occur because of drug selection errors, patients’ misunderstanding, and prescribing, dispensing and administration errors,169 more QIs targeting the wide range of c-DRPs categories may need to be developed to ensure RUM. Furthermore, general medication indicators largely focused on ‘logistics’ issues in the c-DRPs categories, such as medication reconciliation at transition points and unavailability of medicines. This differed from medicine class specific QIs, which mainly focused on ‘drug selection’ issues. These differences underscore the importance of the combined use of general medication QIs and medicine class specific QIs for the comprehensive evaluation of RUM.

In terms of the direction of QI scores, we found two different approaches to scoring: QIs that evaluate whether necessary or appropriate care was provided and QIs that evaluate whether unnecessary or inappropriate care was provided. Care is therefore recommended when interpreting QI scores, as a high score has a different meaning for positively and negatively worded indicators. We also found there were many similar QIs, with only minor differences in wording or definition. These slight differences may be attributed to the feasibility of acquiring the data or to differences in national guidelines, targeted populations or healthcare systems between locations or countries. However, these minor differences could adversely affect comparability of QI scores and could decrease the motivation of healthcare professionals to participate in initiatives if they feel they are being asked the same indicator questions repeatedly. This may be overcome by undertaking a mapping exercise of the QIs identified in our review, with the potential of aggregating some of the QIs. QIs are one of the measurement tools used to evaluate quality of care at the healthcare facility or group level. QI scores do not directly represent the quality of individual patient care but are used as ‘flags’ or ‘alerts’ to potential problems that require further analysis.170 In addition, the actions required for quality improvement vary across the levels of individual patients, healthcare providers, facilities and healthcare systems. Therefore, a multidisciplinary, multilevel quality improvement initiative is needed for comprehensive quality assurance.

Strengths and limitations

Our review has some notable strengths. This is the first comprehensive review of QIs pertaining to RUM without restriction of disease categories and care settings. In order to do this, a comprehensive literature search was undertaken across multiple databases and websites. Moreover, the classification of QIs was based on multiple frameworks (eg, Donabedian and c-DRPs) for maximum understanding and profiling of the included QIs. The rich dataset of identified QIs can be used as a starting point for healthcare professionals, researchers, decision makers and others, for identifying and selecting existing QIs for the evaluation of RUM. We also identified significant gaps in current quality measurements in each framework, underscoring the need for further QI development in some areas. We do, however, acknowledge that our approach has some limitations. First, we only included QIs that were developed using consensus methods and excluded QIs if consensus results for QI development were unclear. Therefore, we might have excluded valid indicators during the screening process. Second, although 5% of the review processes were verified by multiple authors, our mapping exercise into the classification systems may be viewed as subjective. Third, we identified QIs developed using consensus methods to ensure content validity; however, the methodological rigour of each study was not assessed. Therefore, the quality of the content validity of identified QIs was not reported.


Overall, by using multiple frameworks, we were able to identify and classify 2431 QIs covering different constructs of RUM. However, this review also pointed to some significant gaps in current quality measurements, making it difficult for healthcare systems to fully assess whether RUM has been achieved or not. The list of the identified QIs can be used as a database for evaluating the achievement of RUM. All stakeholders involved in quality assurance for RUM can select QIs from the multicategorised QI list for their own purpose. In order to more effectively evaluate the extent to which RUM has been achieved, further development and validation of QIs may be required.




  • Contributors KF developed the review protocol, designed the review questions, and carried out the database search, article screening, data extraction and classification, and manuscript write-up. RJM participated in protocol development, database search, article screening, data extraction and classification, and manuscript review. TFC participated in protocol development, conceptualising the review, designing the review questions, database search, article screening, data extraction and classification, and manuscript review.

  • Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Further details on studies included in this review can be retrieved by contacting the corresponding author.