
Potentially inappropriate prescribing (PIP) in long-term care (LTC) patients: validation of the 2014 STOPP-START and 2012 Beers criteria in a LTC population—a protocol for a cross-sectional comparison of clinical and health administrative data
  1. Lise M Bjerre1,2,3,4,
  2. Roland Halil1,5,
  3. Christina Catley4,
  4. Barbara Farrell1,2,6,
  5. Matthew Hogel1,2,
  6. Cody D Black1,2,
  7. Margo Williams2,
  8. Cristín Ryan7,
  9. Douglas G Manuel1,2,3,4,8
  1. Department of Family Medicine, University of Ottawa, Ottawa, Ontario, Canada
  2. C.T. Lamont Primary Health Care Research Centre, Bruyère Research Institute, Ottawa, Ontario, Canada
  3. School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, Ontario, Canada
  4. ICES@uOttawa, Ottawa, Ontario, Canada
  5. Bruyère Academic Family Health Team, Ottawa, Ontario, Canada
  6. School of Pharmacy, University of Waterloo, Waterloo, Ontario, Canada
  7. School of Pharmacy, Royal College of Surgeons in Ireland, Dublin, Ireland
  8. Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
  1. Correspondence to Dr Lise M Bjerre; lbjerre{at}


Introduction Potentially inappropriate prescribing (PIP) is frequent and problematic in older patients. Identifying PIP is necessary to improve prescribing quality; ideally, this should be performed at the population level. Screening Tool of Older Persons’ potentially inappropriate Prescriptions/Screening Tool to Alert doctors to Right Treatment (STOPP/START) and Beers criteria were developed to identify PIP in clinical settings and are useful at the individual patient level; however, they are time-consuming and costly to apply. Only a subset of these criteria is applicable to routinely collected population-level health administrative data (HAD) because the clinical information necessary to implement these tools is often missing from databases. The performance of subsets of STOPP/START and Beers criteria in HAD compared with clinical data from the same patients is unknown; furthermore, the performance of the updated 2014 STOPP-START and 2012 Beers criteria compared with one another is also unknown.

Methods and analysis A cross-sectional study of linked HAD and clinical data will be conducted to validate the subsets of STOPP/START and Beers criteria applicable to HAD by comparing their performance when applied to clinical and HAD for the same patients. Eligible patients will be 66 years and over and recently admitted to 1 of 6 long-term care facilities in Ottawa, Ontario. The target sample size is 275, but may be less if statistical significance can be achieved sooner. Medication, diagnostic and clinical data will be collected by a consultant pharmacist. The main outcome measure is the proportion of PIP missed by the subset of STOPP/START and Beers criteria applied to HAD when compared with clinical data.

Ethics and dissemination The study was approved by the Ottawa Health Science Network Research Ethics Board, the Bruyère Continuing Care Research Ethics Board and the ethics board of the City of Ottawa Long Term Care Homes. Dissemination will occur via publication, national and international conference presentations, and exchanges with regional, provincial and national stakeholders.

Trial registration number NCT02523482.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:




Potentially inappropriate prescribing (PIP), defined as the use of medicines whose potential harms may outweigh the benefits,1 or the omission of potentially beneficial medication, is frequent and associated with significant morbidity and mortality, particularly in long-term care (LTC) residents and the frail elderly. The elderly (aged 65 years and over) are vulnerable to adverse drug events (ADEs) due to the changes in physiology that occur with increasing age and disease,2 ,3 and ADEs contribute significantly to emergency department (ED) visits, unplanned hospitalisations,4 and in-hospital morbidity and mortality.5 A recent study showed that, of 600 elderly residents admitted to hospital for an acute illness, 25% had experienced one or more ADEs prior to hospitalisation, two-thirds of which had contributed to the hospitalisation.6 Of these events, 69% were deemed avoidable.

The likelihood of PIP increases as people are prescribed more drugs (referred to as polypharmacy) and it is often associated with increased costs.7–9 A number of tools have been developed to identify PIP in clinical settings, including the STOPP/START (Screening Tool of Older Persons’ potentially inappropriate Prescriptions/Screening Tool to Alert doctors to Right Treatment) and the Beers criteria.10 ,11 According to an overview of clinical medication assessment tools,12 an earlier version of the STOPP/START criteria11 ,13 was considered ‘most promising’ compared with a number of existing tools to identify PIP; however, like other clinical tools, they are fairly time-consuming and therefore expensive to use.

The STOPP/START criteria

The STOPP/START criteria were developed through consensus by a team of geriatricians, pharmacists, pharmacologists and primary care physicians, published in 200813 and updated in 2014.11 The STOPP criteria include drugs to avoid in the elderly, those that increase risk of falls, drug–drug and drug–disease interactions, and duplicate class prescriptions. The START criteria include drugs that should be used in the elderly based on evidence for impact.11 ,14–16 The 80 STOPP and 34 START criteria are grouped by physiological system (cardiovascular system, central nervous system, etc), and each criterion is accompanied by a brief rationale as to why a particular medication or combination of medications is considered potentially inappropriate or appropriate.11

The STOPP/START criteria were successfully applied to patient profiles in a number of settings, including primary care clinics, hospitals and nursing homes.14 ,17 ,18 They have been shown to significantly reduce PIP compared with ‘usual care’ in a randomised controlled trial of elderly hospitalised residents in whom the quality of prescribing was assessed both at baseline and 6 months after applying STOPP/START as the intervention to improve prescribing practices.14 In a validating study, screening medication with STOPP/START was associated with less polypharmacy, fewer incorrect doses, and lower potential drug–drug and drug–disease interactions.19

The STOPP/START criteria applied to health administrative data

STOPP/START (2008 version) have also been applied to prescription health administrative data (HAD), both with and without diagnostic information, in different countries.20–22 Cahir et al20 used a subset of STOPP criteria (n=30) without diagnostic codes (which were not available in the database used), and identified PIP in HAD with a frequency comparable to other clinical studies using the STOPP/START criteria. It has not yet been shown how well the STOPP/START criteria applicable to HAD can identify PIP in HAD compared with clinical data for the same patients.

The Beers criteria

Another long-standing tool to detect PIP is the Beers criteria.1 ,10 ,23 ,24 The Beers criteria were the first explicit criteria to be published23 and have become widely used, particularly in the USA, where they originated.15 Originally developed for use in nursing home residents, the criteria were modified three times, in 1997,24 20031 and 201210 and are now intended for use in all patients above 65 years of age. Despite their popularity, the Beers criteria have been criticised for including obsolete medications, as well as medications no longer available outside the USA, particularly in Europe,6 ,8 though some of these issues were addressed in the 2012 revision.10 ,25–27 The Beers criteria have also been criticised for not being sufficiently inclusive of a number of common instances of PIP.6 ,15 In particular, the Beers criteria list only drugs to avoid and do not include other categories of PIP, such as drug–drug and drug–disease interactions, drug duplications or underuse and overuse of medications.15 Finally, higher scores on the Beers criteria have not been shown to be associated with ADEs, discharge to a higher level of care or increased in-hospital mortality.28 The 2012 update has made the Beers criteria more similar to the STOPP/START criteria than prior iterations. To our knowledge, the performance of the 2012 Beers criteria has not yet been compared with that of the STOPP/START criteria, which were themselves updated and expanded in 2014.11

Evidence gaps to be filled

Detecting PIP using HAD at the level of whole populations could offer an advantage over the time-consuming and expensive use of clinical tools at the patient level for the same purpose. This is because these data have already been collected and stored in a standardised way, and it is possible to develop computer programs to apply the medication appropriateness criteria to large numbers of people, retrospectively looking at PIPs at the population level to improve quality of prescribing. One could also envision mobile point of care applications that would caution against PIPs in real time. Furthermore, because it transcends the level of the patient, HAD may be used to identify which PIPs were most common at the population level; this could be a first step towards the development of targeted measures for the improvement of prescribing quality.

Because some clinical information necessary to determine the appropriateness of prescriptions is missing from HAD, HAD-based tools for the detection of PIP are likely to underestimate the true prevalence of PIP in a population, though there may be exceptions (see figure 1 for a graphical illustration of the relationship between medication assessment tools and data sources). In order to overcome such shortcomings and apply HAD-based tools with confidence, it is necessary to validate them by comparing their performance with that of clinical tools in the same patients, using contemporaneous clinical data and HAD. To our knowledge, no such study of the most promising clinical and HAD-based tools has been undertaken to date. The present study will endeavour to do so; this represents a unique opportunity to validate these medication assessment tools in a robust manner.

Figure 1

Relation between medication assessment tools, data sources and study objectives. PIP, potentially inappropriate prescribing; STOPP/START, Screening Tool of Older Persons’ potentially inappropriate Prescriptions/Screening Tool to Alert doctors to Right Treatment.


The overall aim of this study is to validate medication appropriateness criteria applicable to HAD by comparing their performance when applied to HAD with their performance when applied to clinical data for the same patients. The present study has two main and three secondary objectives (see figure 1 for details):

  • A. Main objectives:

Objective 1: To validate subsets of the STOPP/START and Beers criteria defined by their applicability to HAD, by comparing their performance in detecting PIP when applied to HAD with that of the full set of criteria applied to clinical data for the same residents (with the clinical data providing the ‘gold standard’).

Objective 2: To compare the detection rates of the full STOPP/START and full Beers criteria with one another when applied to clinical data.

  • B. Secondary objectives:

Objective 3: To assess the number and proportion of unidentified PIP when using the subset of STOPP/START and Beers criteria with HAD when compared with the full set of criteria, when applied to clinical data.

Objective 4: To compare the performance of the subset of the STOPP/START and Beers criteria applied to the HAD in detecting PIP when compared with clinical data for the same residents.

Objective 5: To compare the performance of the subset of STOPP/START and Beers criteria applicable to HAD with one another when applied to HAD.

The resulting performance estimates for the subset of criteria applicable to HAD will enable better estimates of PIP prevalence at the population level; these may in turn help guide policy makers and clinicians in the development of interventions and other measures to monitor and improve the quality of prescribing, particularly in the elderly.
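As a rough illustration of the quantity described in Objective 3, the number and proportion of PIP missed by the HAD-applicable subset can be expressed as a set difference for each resident. The sketch below is illustrative only: the function name, the criterion labels and the example data are hypothetical and are not taken from the study.

```python
def missed_pip(clinical_pip, had_pip):
    """Return (number missed, proportion missed) for one resident.

    clinical_pip: set of PIP instances found with the full criteria on
                  clinical data (treated here as the reference standard).
    had_pip:      set of PIP instances found with the HAD-applicable subset.
    """
    missed = clinical_pip - had_pip
    if not clinical_pip:
        # No PIP in the reference standard, so nothing can be missed.
        return 0, 0.0
    return len(missed), len(missed) / len(clinical_pip)


# Hypothetical resident with four PIP instances on clinical review,
# three of which are also detected from HAD.
clinical = {"criterion_A", "criterion_B", "criterion_C", "criterion_D"}
had = {"criterion_A", "criterion_B", "criterion_D"}
n_missed, prop_missed = missed_pip(clinical, had)
```

Summing the per-resident counts over the sample would give the overall number and proportion of unidentified PIP.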

Methods and analysis

Study design

We will conduct a comparative cross-sectional study by linking prospectively collected clinical data and HAD for the same patients. More specifically, we will assess patients’ medication and diagnoses immediately prior to admission to an LTC facility. We will use HAD housed at the Institute for Clinical Evaluative Sciences (ICES), an independent, non-profit organisation funded by Ontario's Ministry of Health and Long-Term Care. ICES databases contain information on hospital and outpatient use of health services, demographic data, and socioeconomic data for over 13 million Ontarians. For patients above 65 years of age who live in Ontario and have a valid health card number, ICES data also contain information about all medication dispensed under the Ontario Drug Benefit (ODB) programme.29 Clinical data will be collected from patients newly admitted to one of six long-term care (LTC) facilities in Ottawa, Ontario, Canada.


Inclusion criteria. All Ontario Health Insurance Plan (OHIP)-eligible patients who are newly admitted to LTC, convalescent or respite care at the participating care facilities and who are aged 66 years and over at the time of admission will be eligible for participation in the study. Informed consent will be obtained from the LTC resident or their substitute decision maker. We will attempt to recruit all patients admitted to the participating LTC facilities during the accrual period; however, convalescent care and respite residents may not all be captured based on the particular admissions processes of one of the facilities.

Accrual period. The accrual period began in June 2014 and will continue until a sample size of 275 residents is accrued, or statistical significance is reached in interim analyses (see ‘Statistical analyses’ section below for details of sample size calculation). Based on available estimates of admission rates, this should take approximately 15–18 months. An overview of the study participant selection process is shown in figure 2.

Figure 2

Study participant selection process (LTC, long-term care).

Exclusion criteria. Residents will be excluded if they decline to participate, or if their substitute decision maker declines participation on their behalf. Residents will be excluded if they do not have a valid OHIP number. Residents residing in Ontario whose healthcare is covered through other plans, and are therefore not captured through ICES data, such as First Nations people living on reserve, members of the Armed Forces and refugee claimants, will also be excluded. Residents will also be excluded if they are not prescribed any medications.
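The inclusion and exclusion criteria above can be summarised as a single eligibility check. The following is a minimal sketch under our own assumptions: the `Admission` record and its field names are hypothetical and do not correspond to any actual study data structure.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    """Hypothetical record for one newly admitted resident."""
    age_at_admission: int
    has_valid_ohip: bool
    covered_by_other_plan: bool  # coverage not captured in ICES data
    consented: bool              # resident or substitute decision maker
    n_medications: int


def is_eligible(a: Admission) -> bool:
    """Apply the stated inclusion and exclusion criteria."""
    return (
        a.age_at_admission >= 66
        and a.has_valid_ohip
        and not a.covered_by_other_plan
        and a.consented
        and a.n_medications > 0
    )


example = Admission(age_at_admission=82, has_valid_ohip=True,
                    covered_by_other_plan=False, consented=True,
                    n_medications=7)
eligible = is_eligible(example)
```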

Data sets

This study will be conducted using Ontario administrative health databases housed at the ICES, which will be accessed from the ICES@uOttawa site. ICES is an independent, non-profit organisation whose infrastructure funding and access to Ontario’s large administrative databases is provided by the Ontario Ministry of Health and Long-Term Care. ICES links de-identified population-based health information at the resident level in a way that ensures privacy and confidentiality of residents. ICES is named as a section 45(1) Prescribed Entity in Ontario’s Personal Health Information Protection Act (PHIPA). Review, audit and approval of ICES’ policies, practices and procedures related to data privacy and security are performed triennially by the Information and Privacy Commissioner of Ontario (IPC). This approval/review document is available online. All of these data sets are linked using a resident-specific encrypted identifier. This linkage is deterministic and does not require any probabilistic methods.
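To make the idea of deterministic linkage concrete, the sketch below joins records on an exact match of a shared key, with no similarity scoring. It is an illustration only: the field name `ikn`, the record layout and the example values are assumptions made for this example, and the sketch presumes that the clinical records already carry the same encrypted identifier as the HAD records.

```python
def link_deterministic(clinical_rows, had_rows, key="ikn"):
    """Pair each clinical record with all HAD records sharing the same key.

    Matching is by exact equality of the key only (deterministic linkage).
    """
    had_by_key = {}
    for row in had_rows:
        had_by_key.setdefault(row[key], []).append(row)
    return [(c, had_by_key.get(c[key], [])) for c in clinical_rows]


# Hypothetical records; the identifiers are placeholders.
clinical_rows = [{"ikn": "A1"}, {"ikn": "B2"}]
had_rows = [
    {"ikn": "A1", "din": "din_1"},
    {"ikn": "A1", "din": "din_2"},
]
linked = link_deterministic(clinical_rows, had_rows)
```

A resident with no matching HAD records is kept with an empty list, which makes unlinked records easy to count.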

This study will use the following five data sets30 ,31 with which the clinical data will be linked.

Registered Persons Database: The Registered Persons Database (RPDB) records the birth date and death date (if applicable) of every person eligible for Ontario health services.31

Ontario Drug Benefits Claims Database: The ODB programme provides drug benefits for all adults aged 65 years and over living in Ontario with a valid health card number, as well as those receiving social assistance in Ontario. The dispensing pharmacist submits a claim for each prescribed drug that is covered under the ODB formulary. There are over 50 million drug claims per year. The main data elements of the Ontario Drug Benefits Database include: ICES Key Number (IKN, anonymously linkable to other individual-level data holdings), Drug Identification Number (DIN), drug quantity, number of days supplied (can be used to compute daily dose), cost (split into its elements), LTC indicator, the plan that the prescription falls under (such as Seniors, Trillium, Ontario Works, etc), dispensing date, and resident and prescriber identifiers (encrypted). ICES also maintains a list of drug identification numbers and the associated drug and product names, subclass information, Pharmacologic-therapeutic Classification Group (PCG) codes, drug strength, route of administration, and first and last dispensing dates from ODB.31

Discharge Abstract Database: The Discharge Abstract Database (DAD) captures all acute care hospitalisations in Ontario back to 1988. Each row in the DAD records demographic, diagnostic, procedural, and treatment information for a single hospitalisation.31

National Ambulatory Care Reporting System: National Ambulatory Care Reporting System (NACRS) captures all hospital ED visits back to 2002. Similar to DAD, each row records demographic, diagnostic, procedural and treatment information for a single emergency room visit.31

OHIP database: The OHIP database captures most claims paid by the OHIP. Each row in the OHIP database records the resident, physician and diagnosis/procedure being claimed for remuneration.31
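The ODB description above notes that drug quantity and days supplied can be used to compute a daily dose. A minimal sketch of that arithmetic follows, assuming the strength per unit is available from the DIN list; the function name, parameter names and example values are hypothetical.

```python
def daily_dose(quantity, days_supplied, strength_per_unit):
    """Estimate the daily dose from a single dispensing claim.

    quantity:          number of units dispensed (e.g. tablets)
    days_supplied:     number of days the supply is intended to cover
    strength_per_unit: strength of one unit (e.g. mg per tablet)
    """
    if days_supplied <= 0:
        raise ValueError("days_supplied must be positive")
    return quantity * strength_per_unit / days_supplied


# Hypothetical claim: 60 tablets of 5 mg dispensed for 30 days.
dose = daily_dose(quantity=60, days_supplied=30, strength_per_unit=5)
```

In this example the estimate is 10 mg per day, that is, two 5 mg tablets daily.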

The RPDB will be used to create the cohort of eligible Ontarians; for these patients, we will use the ODB to identify all prescriptions during the study period. The DAD, OHIP and NACRS data sets will be used for HAD disease ascertainment in order to implement the drug–disease interactions specified in the STOPP/START and Beers criteria.
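One way to picture how a drug–disease criterion might be applied to HAD is to combine dispensing claims with diagnosis codes drawn from the hospital, emergency and physician claims data. The sketch below is not the coding developed by the authors; the rule format, field names and placeholder codes are all hypothetical.

```python
def flag_drug_disease(claims, diagnoses, rule):
    """Return the sorted identifiers of residents who have both a claim for
    the rule's drug class and a diagnosis code listed in the rule."""
    with_dx = {d["ikn"] for d in diagnoses
               if d["code"] in rule["diagnosis_codes"]}
    return sorted({c["ikn"] for c in claims
                   if c["drug_class"] == rule["drug_class"]
                   and c["ikn"] in with_dx})


# Hypothetical rule and records; class and code labels are placeholders.
rule = {"drug_class": "DRUG_CLASS_X", "diagnosis_codes": {"DX_1", "DX_2"}}
claims = [
    {"ikn": "A1", "drug_class": "DRUG_CLASS_X"},
    {"ikn": "B2", "drug_class": "DRUG_CLASS_X"},
    {"ikn": "C3", "drug_class": "DRUG_CLASS_Y"},
]
diagnoses = [
    {"ikn": "A1", "code": "DX_2"},
    {"ikn": "C3", "code": "DX_1"},
]
flagged = flag_drug_disease(claims, diagnoses, rule)
```

Only resident A1 is flagged here: B2 has the drug but not the diagnosis, and C3 has the diagnosis but a different drug class.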

Clinical data collection

Clinical data collected for each study participant, via resident admission records at the LTC facilities, will include results of laboratory and imaging tests performed at or shortly before or after admission (eg, glycated haemoglobin, estimated glomerular filtration rate, serum creatinine, lipid profile (low-density lipoprotein, high-density lipoprotein (HDL), total cholesterol (TC), triglycerides, HDL:TC ratio), bone mineral density); a list of diagnoses; a list of the resident's current medications at the time of admission; and recent blood pressure, heart rate, weight and height measurements. Table 1 shows the patient characteristics of the final study group. Personal information, including first and last name, date of birth, Ontario health card number and sex will be obtained in order to enable linkage to HAD housed at ICES. This clinical and personal information will be collected by a consultant pharmacist, who is not otherwise involved in the care of the residents. This pharmacist will also assess the residents’ medication for PIP using the STOPP/START criteria, Beers 2012 criteria and their subsets applicable to HAD (see figure 1). The data collection spreadsheet used for assessment of residents’ clinical data is provided as online supplementary appendix B (downloadable Excel spreadsheet). We estimate that the consultant pharmacist will require approximately 1.5 h per resident to complete the data collection spreadsheet and to apply the medication assessment tools.

Table 1

Patient characteristics of final study group

Coding of STOPP/START and Beers criteria for use with HAD

The subset of STOPP/START and Beers criteria amenable to use with HAD were coded into a format applicable to ICES data by three of the co-authors (LMB, RH, CC), and a manuscript describing the coding process and outcome in detail is currently in preparation for publication.

Statistical analysis

We will conduct a series of pairwise comparisons for the difference of two proportions (McNemar test32) in order to address each of our objectives (see figure 1 and table 3). We will also compare the absolute number of PIP identified using the different sets of criteria for each patient (table 2). We will look at the percentage of residents where one or more PIP was identified, as well as the frequency distribution of the number of PIP detected per resident using the different sets of criteria (figure 3). The ‘skeleton’ tables and figures of expected results are shown in tables 2 and 3, and figure 3.
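For readers unfamiliar with the McNemar test, the sketch below shows how a paired comparison of detection (at least one PIP detected or not, per resident) reduces to the two discordant counts. It uses the exact binomial form of the test, which is a substitution made here for simplicity and not necessarily the variant the study will use; the function names and example data are hypothetical.

```python
from math import comb


def discordant_counts(detected_a, detected_b):
    """Count discordant pairs from two equal-length lists of booleans,
    one entry per resident (True if at least one PIP was detected)."""
    if len(detected_a) != len(detected_b):
        raise ValueError("both lists must cover the same residents")
    b = sum(1 for x, y in zip(detected_a, detected_b) if x and not y)
    c = sum(1 for x, y in zip(detected_a, detected_b) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical data: 15 residents, 10 flagged on clinical data only.
clinical = [True] * 10 + [False] * 5
had = [False] * 15
b, c = discordant_counts(clinical, had)
p_value = mcnemar_exact(b, c)
```

Residents on whom both data sources agree do not contribute to the test; only the discordant pairs do.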

Table 2

Number of PIP detected by different medication assessment tools as a function of data type (number of patients (n) is the same for all tools)

Table 3

Comparative performance of PIP assessment tools relative to each other

Figure 3

Frequency distribution of PIP by medication assessment tool. PIP, potentially inappropriate prescribing; STOPP/START, Screening Tool of Older Persons’ potentially inappropriate Prescriptions/Screening Tool to Alert doctors to Right Treatment.

Sample size and resultant accrual period

For purposes of sample size calculation, we assumed that the subset of STOPP/START criteria applied to HAD will fail to identify a resident with one or more PIP 10% of the time when compared with the subset of STOPP/START criteria applied to clinical data, our gold standard. This is intended to be a conservative estimate, based on indirect evidence indicating that the subset of STOPP/START criteria (2008 version) applied to HAD yields rates of PIP similar to those from clinical studies.20 We need to recruit 256 residents to identify a significant difference in PIP identification rate (this sample size was calculated based on the McNemar χ2 calculation for matched proportions).32 ,33 We will oversample to 275 residents, to take into account missing data that may cause a resident's record to be unusable. We will perform interim analyses at 50, 100, 150, 200 and 250 residents. Should the interim analyses demonstrate statistically significant differences in our main outcome, the study will be stopped early.

Approximately 27 LTC residents, plus an additional 21 convalescent care and respite residents (who may be difficult to capture), are admitted to the participating LTC facilities each month. It is assumed that some residents will choose not to participate in the study. Additionally, we are not including convalescent and respite residents in our accrual time estimate due to potential administrative recruiting difficulties. Therefore, based on a monthly estimate of 27 admissions and an estimate that roughly 67% of residents will consent, we expect that it will take approximately 15 months to accrue the necessary number of residents. If we succeed in recruiting convalescent and respite residents, the accrual process may be completed in less time than conservatively estimated.
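The accrual estimate follows from simple arithmetic on the figures given above. The sketch below reproduces it; the function name is ours, and the second calculation assumes, purely for illustration, that convalescent and respite residents would consent at the same 67% rate.

```python
def accrual_months(target_n, admissions_per_month, consent_rate):
    """Months needed to recruit target_n residents at a given
    monthly admission count and consent rate."""
    return target_n / (admissions_per_month * consent_rate)


# LTC admissions only: 27 per month, roughly 67% consenting.
months_ltc_only = accrual_months(275, 27, 0.67)

# Illustrative assumption: all 48 monthly admissions recruited
# at the same consent rate.
months_all_admissions = accrual_months(275, 27 + 21, 0.67)
```

With LTC admissions only, about 18 residents are expected to consent each month, which gives a little over 15 months to reach 275.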

Ethics and dissemination

We have obtained ethical approval from the Bruyère Research Ethics Board, the Ottawa Health Science Network Research Ethics Board, and the ethics board of the City of Ottawa Long Term Care Homes (see online supplementary appendix A).

Consent to participate in the study will be obtained in a two-step process. On admission to one of the participating LTC facilities, residents or their substitute decision makers will be asked by a research assistant whether they agree to be contacted concerning possible participation in a research study. If they respond in the affirmative by way of a completed ‘Willingness to be Contacted’ form, residents or their substitute decision maker will be contacted by a research assistant, who will offer to tell them about the study either over the phone or in person. The resident or their substitute decision maker will be provided with information about the study, including its funder and its goals, in the language of their choice (French or English), in order to be able to make an informed decision about participation. If the discussion takes place in person, a consent form requiring the resident's signature or that of their substitute decision maker will also be provided; otherwise, verbal consent will be obtained over the phone.

Once consent has been obtained, the consultant pharmacist will be contacted, will travel to the LTC facility, and will extract the data from the resident's admission chart. In rare cases, the pharmacist may need to contact the resident or their substitute decision maker to clarify certain points of information, but in general, no such contact should be necessary.

The consultant pharmacist will collect the information onsite at the resident's LTC facility, and will enter the information into a data collection spreadsheet on an encrypted laptop. Hard copies (ie, paper) of documents containing resident clinical and/or personal data will not leave the facility, and will either be returned to the staff providing the documents to the consultant pharmacist, or if they are copies made specifically for the consultant pharmacist, they will be destroyed (shredded) confidentially after use at the facility.

Residents or their substitute decision maker may elect to discontinue the resident's involvement in the study at any time.

Dissemination plan

We expect that the results of this study, which will help establish which set of criteria is most applicable for the Canadian setting, will be of interest to a wide range of individuals and groups, including healthcare policy makers, national and provincial professional associations and licensing bodies, clinicians, healthcare consumer organisations, and members of the public. We expect the study to have some international appeal due to the inclusion of the Beers 2012 update, as to the best of our knowledge, this version of the Beers criteria has not been compared with the 2014 version of the STOPP/START criteria before. Several avenues of dissemination will be followed. Throughout the study, we will engage with LTC policy makers at Ontario's Ministry of Health and Long-Term Care, through the Bruyère Centre for Learning Research and Innovation in Long-Term Care (CLRI). We will be working with a knowledge broker from CLRI; she will meet with our team to help us with our dissemination plan. In addition, CLRI will be hosting a large conference in the Fall of 2015, where we intend to provide preliminary results. For end-of-study knowledge dissemination, we intend to publish in medical, health services research and/or public health journals. More importantly, we plan to present and discuss the results of our study with relevant stakeholders, such as the Ontario Ministry of Health and Long-Term Care, the Ontario Long Term Care Physicians Association, Ontario Pharmacists Association (who have a Long-Term Care Working Group), the Ontario College of Physicians and Surgeons, Canadian Pharmacists Association and Local Health Integration Networks (LHIN). We will also tap into our own professional networks, among others by disseminating our results via the Ottawa Rational Therapeutics and Medication Policy Research Group's website (currently under preparation).
We will encourage presentation of our work at healthcare and discipline-based conferences, particularly those focusing on LTC, drug safety and primary care. We will also use our informal networks to disseminate our findings regionally, nationally and internationally.


The authors would like to acknowledge the participating Long-Term Care homes, and the helpful collaboration of their Admissions Staff. They thank MH for formatting and editorial suggestions, and Chandra Landry for proofreading and editorial suggestions; they also thank Chandra Landry and CDB for invaluable help with formatting references, tables and figures. Parts of this material are based on data and information compiled and provided by CIHI.



  • Contributors LMB conceived of the idea for the study, wrote the original grant application as well as the ethics application (including a description of the proposed study—see online supplementary appendix A) and adapted it to fit the requirements of the present protocol; she also generated all the tables in the present manuscript, as well as in the online appendices. DGM, BF, CR, RH, CC, MW and CDB all reviewed the manuscript for important intellectual content and made suggestions for improvement as appropriate; furthermore, they all approved the final version of the manuscript for submission.

  • Funding Supported with funding from the Government of Ontario through the Bruyère Centre for Learning, Research and Innovation in LTC. This study is supported by the Institute for Clinical Evaluative Sciences (ICES), which is funded by an annual grant from the Ontario Ministry of Health and Long-Term Care (MOHLTC). The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources.

  • Competing interests CR is a coauthor of the STOPP/START criteria.13

  • Ethics approval Ottawa Health Science Network Research Ethics Board.

  • Provenance and peer review Not commissioned; peer reviewed for ethical and funding approval prior to submission.