Original research
Factors associated with undertriage and overtriage in telephone triage in Danish out-of-hours primary care: a natural quasi-experimental cross-sectional study of randomly selected and high-risk calls
  1. Dennis Schou Graversen (1, 2)
  2. Anette Fischer Pedersen (1, 3)
  3. Morten Bondo Christensen (1, 2)
  4. Fredrik Folke (4, 5)
  5. L Huibers (6)

  1. Research Unit General Practice, Aarhus University, Aarhus, Denmark
  2. Department of Public Health, Aarhus University, Aarhus, Denmark
  3. Department of Clinical Medicine, Aarhus University, Aarhus, Midtjylland, Denmark
  4. Emergency Medical Services Copenhagen, Copenhagen University Hospital, Ballerup, Denmark
  5. Department of Cardiology, Herlev and Gentofte, Copenhagen University Hospital, Copenhagen, Denmark
  6. Research Unit for General Practice, Aarhus University Research Unit General Practice, Aarhus, Midtjylland, Denmark

Correspondence to Dennis Schou Graversen; d.graversen{at}

Abstract


Objectives We aim to explore undertriage and overtriage in a high-risk patient population and explore patient characteristics and call characteristics associated with undertriage and overtriage in both randomly selected and in high-risk telephone calls to out-of-hours primary care (OOH-PC).

Design Natural quasi-experimental cross-sectional study.

Setting Two Danish OOH-PC services using different telephone triage models: a general practitioner cooperative with GP-led triage and the medical helpline 1813 with computerised decision support system-guided nurse-led triage.

Participants We included audio-recorded telephone triage calls from 2016: 806 random calls and 405 high-risk calls (defined as patients ≥30 years calling with abdominal pain).

Main outcome measures Twenty-four experienced physicians used a validated assessment tool to assess the accuracy of triage. We calculated the relative risk (RR) for clinically relevant undertriage and overtriage for a range of patient characteristics and call characteristics.

Results We included 806 randomly selected calls (44 clinically relevant undertriaged and 54 clinically relevant overtriaged) and 405 high-risk calls (32 undertriaged and 24 overtriaged). In high-risk calls, nurse-led triage was associated with significantly less undertriage (RR: 0.47, 95% CI 0.23 to 0.97) and more overtriage (RR: 3.93, 95% CI 1.50 to 10.33) compared with GP-led triage. In high-risk calls, the risk of undertriage was significantly higher for calls during nighttime (RR: 2.1, 95% CI 1.05 to 4.07). Undertriage tended to be more likely for calls concerning patients ≥60 years compared with 30–59 years (11.3% vs 6.3%) in high-risk calls. However, this result was not significant.

Conclusion Nurse-led triage was associated with less undertriage and more overtriage compared with GP-led triage in high-risk calls. This study may suggest that, to minimise undertriage, triage professionals should pay extra attention when a call occurs during nighttime or concerns an elderly patient. However, this needs confirmation in future studies.

  • clinical audit
  • health & safety
  • quality in health care
  • primary care

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • A strength of our study is the natural quasi-experimental design using real-life calls as opposed to a constructed setup.

  • To our knowledge, this is the first study to explore factors associated with undertriage and overtriage in real audio-recorded calls.

  • A limitation is the use of only one assessor per call, but acceptable interrater agreement was previously found.

Introduction


In out-of-hours primary care (OOH-PC), telephone triage plays a pivotal role in managing patient flows and workload.1–3 Telephone triage aims to ensure a safe and efficient delivery of healthcare, avoiding undertriage and minimising overtriage.4 However, accurate telephone triage is difficult due to the lack of visual cues of the patient, challenges in telecommunication and time pressure.5–8

Safety and efficiency of telephone triage in OOH-PC have been explored in a range of studies that used varying outcome measures.4 9–13 Studies have identified a range of risk factors for potentially unsafe telephone triage in out-of-hours (OOH) care: calls concerning abdominal pain,14–20 chest pain16 17 and shortness of breath,16 18 calls for patients with increasing age14 17 21 and calls during nighttime.17 22 Thus, these calls can be seen as potential high-risk calls. A few studies have explored factors associated with overtriage as a measure of inefficient telephone triage. Nurse-led triage1 23 and triage aided by computerised decision support system (CDSS)23–26 have been associated with overtriage. However, studies that explored factors associated with undertriage and overtriage used a range of study designs. None of these studies used real audio-recorded telephone calls to assess the risk of undertriage and overtriage and related risk factors.

Previously, we found that telephone triage was associated with significantly less undertriage and significantly more overtriage for nurse-led triage compared with GP-led triage in a sample of random calls.23 However, it remains unclear whether this association also exists for high-risk calls. Therefore, we aim to investigate the risk of undertriage and overtriage in high-risk telephone calls to OOH-PC in Denmark. In addition, we aim to explore patient characteristics and call characteristics associated with undertriage and overtriage in random calls and in high-risk calls to OOH-PC.

Methods


Design and setting

This paper presents secondary analyses of a natural quasi-experimental study in two regional OOH-PC services in Denmark.23 27 In one prior paper,23 we explored safety and efficiency in randomly selected calls, using undertriage and overtriage. In the present paper, we included potentially high-risk calls and studied undertriage and overtriage of these calls. Subsequently, we explored factors associated with undertriage and overtriage in randomly selected calls and in high-risk patient calls.

OOH-PC has been provided by the general practitioner cooperative (GPC) in the Central Denmark Region since 1992 and by the medical helpline 1813 (MH-1813) in the Capital Region of Denmark since 2014. Both services are open outside office hours (ie, on weekdays from 16:00 to 8:00, weekends and national holidays), offering telephone consultations, clinic consultations and home visits. The GPC and MH-1813 use different triage models. The GPC uses GP-led telephone triage without CDSS, whereas the MH-1813 uses nurse-led telephone triage (see table 1). At the MH-1813, the majority of incoming unselected calls (ie, 74%) are answered by registered nurses.28 Nurses are obliged to use a locally developed CDSS and they can redirect calls to a physician on call (ie, 11% of all incoming unselected calls answered by a nurse).28 These physicians, who have various medical specialties, also conduct telephone triage. In the present paper, we excluded calls triaged by these physicians from our analyses.

Table 1

Description of the organisation and telephone triage of the two included models for out-of-hours primary care

Definition of groups of calls

We defined two groups of calls: (1) a group of randomly selected calls that was representative of all calls to OOH-PC and (2) a group of potentially high-risk patient calls (referred to as ‘high risk’). To define the high-risk patient calls, we conducted a systematic literature search in 2016 to identify factors associated with unsafe or inaccurate telephone triage. Seven identified studies described factors associated with a risk of unsafe telephone triage: infants,22 increasing patient age,14 17 21 calls during nighttime,17 22 abdominal pain,14–18 chest pain16 17 and breathing difficulties.16 18 In a consensus meeting, the authors (DSG, medical doctor; AFP, psychologist; MBC, general practitioner; LH, medical doctor) defined high-risk calls as calls concerning patients aged 30 years or older who suffered from abdominal pain. We added the age criterion because we aimed to include potentially dangerous conditions that present with vague, indistinct or greatly differing symptomatology (such as dissecting aortic aneurysm, myocardial infarction, cholecystitis, pyelonephritis, acute pancreatitis, gastrointestinal ulcer and gynaecological causes). For pragmatic reasons, we focused on one reason for encounter, as selecting high-risk calls required manual development of an algorithm (see below).

Selection of calls

All calls to the GPC and MH-1813 outside office hours during the inclusion period (GPC: 23 November to 7 December 2016; MH-1813: 23 November to 8 December 2016) were potentially eligible. For calls redirected by a nurse to a physician at MH-1813, only the part conducted by the nurse was eligible and later assessed as described below. Randomly selected calls were drawn from all eligible calls; these calls were the same as those studied in a prior paper.23 The aim was to include 435 calls by GPs and 435 by nurses, based on a power calculation to detect a 5% difference in undertriage between nurse-led and GP-led triage, in line with the aim of the main study.23 Based on expected exclusion, we randomly selected 525 calls from the GPC and 500 calls from MH-1813.

To identify potentially eligible high-risk calls, we received electronic patient journal records from the GPC and MH-1813. We defined a list of wordings and abbreviations associated with abdominal pain, which was used to search the patient records and identify all eligible calls concerning abdominal pain. DSG (medical doctor) assessed these marked patient records to check whether inclusion for abdominal pain was correct. Calls were excluded when the triage professional dismissed the presence of abdominal pain in the patient record (ie, triagist noted no abdominal pain) or when the complaint was clearly outside the thoracic, abdominal or pelvic region. Thus, we identified 846 (GPC) and 884 (MH-1813) eligible high-risk calls for potential inclusion. The power calculation was designed to detect a significant difference in undertriage between nurse-led and GP-led triage for high-risk calls; it was not designed to detect associations between risk factors and overtriage or undertriage. As this calculation revealed the need for 206 calls per triage model, we randomly selected 252 calls from the GPC and 240 calls from MH-1813. Both randomly selected and high-risk calls were selected to match the overall distribution of day of week (ie, weekend, not weekend) and time of day (ie, day, evening, night).
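The matched selection described above can be illustrated with proportional stratified sampling. This is a minimal sketch and not the authors' procedure: the field names (`weekend`, `time_of_day`), the largest-remainder allocation and the simulated list of eligible calls are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(calls, n_total, seed=0):
    """Draw n_total calls so that stratum shares mirror those of the eligible calls."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for call in calls:
        strata[(call["weekend"], call["time_of_day"])].append(call)

    # Proportional allocation; leftover places go to the largest remainders
    exact = {key: n_total * len(group) / len(calls) for key, group in strata.items()}
    allocation = {key: int(value) for key, value in exact.items()}
    leftover = n_total - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda key: exact[key] - allocation[key], reverse=True)
    for key in by_remainder[:leftover]:
        allocation[key] += 1

    sample = []
    for key, group in strata.items():
        sample.extend(rng.sample(group, allocation[key]))
    return sample

# Hypothetical eligible calls (simulated, not study data)
_rng = random.Random(1)
eligible = [
    {"id": i,
     "weekend": _rng.random() < 0.4,
     "time_of_day": _rng.choice(["day", "evening", "night"])}
    for i in range(846)
]
selected = stratified_sample(eligible, n_total=252)
print(len(selected))
```

Sampling within each combination of day of week and time of day keeps the selected calls close to the overall distribution, which a simple random draw would only achieve on average.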

Each selected call had a unique identification number that was used to identify the corresponding audio recording. Three master's students of medicine masked the audio recordings using beep tones to cover information revealing triage profession, OOH organisation and patient identification information. In addition, the students screened all calls against the exclusion criteria shown in online supplemental appendix 1. DSG reviewed all calls that fulfilled exclusion criteria and those that were unclear. In case of doubt, DSG and AFP reached consensus regarding exclusion.

Assessment of accuracy of triage

All included calls have been assessed as described in prior papers,23 29 using the tool ‘Assessment of Quality in Telephone Triage’ (AQTT). The AQTT comprises 24 items assessing the health-related quality and the quality of communication.29 AQTT has a rating manual, describing when to assign the different ratings for each item.29 In the present study, the outcome accuracy of triage was measured by one item that used a 7-point scale to differentiate between levels of undertriage (ratings 1 to 3), optimal triage (rating 4) and overtriage (ratings 5 to 7). The AQTT showed inter-rater disagreement when using the entire 7-point scale for assessment of the accuracy of triage but revealed satisfactory inter-rater agreement when using dichotomous scales for clinically relevant undertriage (ratings ‘1’ and ‘2’ vs ratings ‘3’ to ‘7’) and clinically relevant overtriage (ratings ‘6’ and ‘7’ vs ratings ‘1’ to ‘5’).29
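The dichotomisation of the 7-point accuracy item can be written out explicitly. The sketch below only encodes the cut-offs stated above; the function name and the returned keys are illustrative assumptions and are not part of the AQTT.

```python
def classify_accuracy(rating: int) -> dict:
    """Map a 7-point accuracy rating to the two dichotomous outcomes."""
    if rating not in range(1, 8):
        raise ValueError("rating must be an integer from 1 to 7")
    return {
        "relevant_undertriage": rating <= 2,  # ratings 1 and 2 vs ratings 3 to 7
        "relevant_overtriage": rating >= 6,   # ratings 6 and 7 vs ratings 1 to 5
    }

for r in range(1, 8):
    print(r, classify_accuracy(r))
```

Ratings 3 and 5 count as neither clinically relevant undertriage nor clinically relevant overtriage, even though they deviate from optimal triage (rating 4).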

For calls that triage nurses redirected to a physician, only the part conducted by the nurse was available for assessment. These calls could be assessed as optimal if the decision to redirect the call was what would be expected of a nurse.

For the assessment panel, we recruited 24 physicians among triage professionals from the GPC and MH-1813 using two inclusion criteria: (1) >1 year experience and (2) active in telephone triage in OOH-PC at time of study. We randomly selected 16 GPs from the 56 interested GPs from the GPC. At MH-1813, we included all eight physicians fulfilling our inclusion criteria from the 10 interested physicians. All assessors followed a 2-day training course providing knowledge on telephone triage and communication, introducing the AQTT and rating manual, and assessing triage calls individually and in plenary, focusing on achieving consistency. We randomly distributed calls to all assessors, so each member of the assessment panel assessed random and high-risk calls triaged with both the GP-led model and the nurse-led model. Assessors were blinded to the type of call and triage model. Information on age and sex of the patient, day of week and the time of each call was available to the assessor, extracted from the registration systems of the GPC and MH-1813.

Statistical analyses

Accuracy of triage decision was categorised into clinically relevant undertriage (rated ‘1’ or ‘2’) and clinically relevant overtriage (rated ‘6’ or ‘7’). We conducted separate analyses for randomly selected calls and for high-risk calls. We used descriptive analyses to describe patient characteristics and call characteristics as well as the risk of clinically relevant undertriage and clinically relevant overtriage for both types of calls. We explored the association of individual patient characteristics (ie, age, patient sex) and call characteristics (ie, weekend, time of day, triage model) with inaccurate telephone triage by calculating the risk ratio (RR) of having clinically relevant undertriage (vs no clinically relevant undertriage) and clinically relevant overtriage (vs no clinically relevant overtriage), using binomial regression. 95% CIs were calculated. All analyses were performed in Stata V.14.2 (StataCorp. 2015. Stata Statistical Software: Release 14.2. College Station, Texas: StataCorp LP).
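For a single binary characteristic, an RR and its 95% CI can also be obtained directly from a 2×2 table. The sketch below is not the authors' Stata analysis: it swaps in the crude ratio of risks with a Wald interval on the log scale, and the function name and example counts are hypothetical.

```python
import math

def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a Wald confidence interval computed on the log scale."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts, not taken from the study tables
rr, lower, upper = risk_ratio(events_exposed=10, n_exposed=100,
                              events_unexposed=5, n_unexposed=100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An association is conventionally called significant at the 5% level when the interval excludes 1, which is how the RRs in this paper are read.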

Public and patient involvement statement

We explored patients’ perspective in a focus group interview concerning the development of the AQTT and incorporated received input in the rating manual of the AQTT.29

Results


Description of calls

Figure 1 shows the flowchart of selection and exclusion of calls. We excluded 47 randomly selected calls and 30 high-risk calls assigned ‘not applicable’, as assessing accuracy of triage was not possible (eg, insufficient information was available) or not relevant. Thus, our final study population included 806 randomly selected calls and 405 high-risk calls (table 2). In high-risk calls, the risk of clinically relevant undertriage was 7.9%, whereas the risk of clinically relevant overtriage was 5.9%.
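The risks quoted for high-risk calls follow from the counts reported for each group of calls. A short check of the arithmetic is sketched below; the helper name is an assumption, and the percentages for randomly selected calls are derived here from the reported counts rather than quoted from the text.

```python
# Counts of clinically relevant undertriage and overtriage reported for each group
counts = {
    "randomly selected": {"n": 806, "undertriage": 44, "overtriage": 54},
    "high-risk": {"n": 405, "undertriage": 32, "overtriage": 24},
}

def risk_percent(events: int, total: int) -> float:
    """Proportion of calls with the outcome, as a percentage rounded to one decimal."""
    return round(100 * events / total, 1)

for group, c in counts.items():
    under = risk_percent(c["undertriage"], c["n"])
    over = risk_percent(c["overtriage"], c["n"])
    print(f"{group}: undertriage {under}%, overtriage {over}%")
```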

Figure 1

Flowchart of inclusion and exclusion of randomly selected and high-risk calls from the GPC and MH-1813. Not applicable: accuracy of triage could not be assessed due to insufficient information in the call or assessment of accuracy of triage was deemed not relevant. GPC, general practitioner cooperative.

Table 2

Baseline description of patient and call characteristics in randomly selected calls and high-risk calls

Risk factors for inaccurate triage in randomly selected calls

For randomly selected calls, we found no significant association between patient characteristics (ie, age and sex), call characteristics (ie, weekend and time of call) and the risk of clinically relevant undertriage or overtriage (table 3).

Table 3

Patient characteristics and call characteristics associated with clinically relevant undertriage and overtriage in randomly selected calls*

Risk factors for inaccurate triage in high-risk calls

Nurse-led triage was associated with significantly less clinically relevant undertriage than GP-led triage (RR=0.47, 95% CI 0.23 to 0.97) in high-risk calls (table 4). Nurse-led triage had significantly more clinically relevant overtriage (RR=3.93, 95% CI 1.50 to 10.33) compared with GP-led triage. High-risk calls conducted during nighttime had a significantly higher risk of clinically relevant undertriage (13.2%) compared with calls conducted during day or evening (6.4%) (RR=2.1, 95% CI 1.05 to 4.07). We found no significant association for the other patient characteristics and call characteristics. However, a trend was seen for patient age, as the risk of clinically relevant undertriage was lower in patients aged 30–59 years (6.3%) in comparison with elderly patients ≥60 years (11.3%) (RR=0.55, 95% CI 0.29 to 1.07).
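As a rough plausibility check, the ratios of the reported percentages can be compared with the reported RRs. This sketch uses only the rounded percentages quoted above, so small deviations from the published RRs are expected; the function name is an assumption.

```python
def crude_rr(risk_a_percent: float, risk_b_percent: float) -> float:
    """Ratio of two risks given as percentages."""
    return risk_a_percent / risk_b_percent

# Nighttime (13.2%) vs day or evening (6.4%); reported RR = 2.1
night_vs_day_evening = crude_rr(13.2, 6.4)

# Age 30-59 years (6.3%) vs 60 years or older (11.3%); reported RR = 0.55
younger_vs_older = crude_rr(6.3, 11.3)

print(f"{night_vs_day_evening:.2f} {younger_vs_older:.2f}")
```

Both ratios agree with the reported estimates to within the rounding of the input percentages.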

Table 4

Patient characteristics and call characteristics associated with clinically relevant undertriage and overtriage in high-risk calls

Discussion


Principal findings

In high-risk calls, nurse-led triage was associated with a significantly lower risk of clinically relevant undertriage and a significantly higher risk of clinically relevant overtriage compared with GP-led triage. For high-risk calls, the risk of clinically relevant undertriage was significantly higher if the call was conducted during nighttime compared with day and evening. For randomly selected calls, we found no significant association between defined risk factors and the risk of clinically relevant undertriage or overtriage.

Interpretation and comparison of results

Our study revealed that the triage model and time of call had an effect on accuracy of triage. In a prior study, we found that nurse-led triage had less clinically relevant undertriage and more clinically relevant overtriage than GP-led triage in randomly selected calls.23 In this study, we found the same tendencies in a selection of potential high-risk calls. Our study could not elicit which factors of the triage models influence the difference between nurse-led and GP-led triage in high-risk calls, such as the role of CDSS, professional background, working conditions and organisational conditions. Nurses at the MH-1813 were obligated to use CDSS, whereas GPs did not use CDSS. CDSSs aim to ensure consistency30 31 and to have a high degree of safety (ie, low level of undertriage), which consequently leads to a higher level of overtriage. Furthermore, telephone triage in OOH-PC serves as a form of gatekeeping to acute healthcare in Denmark. A qualitative study described that nurses did not consider themselves as ‘gatekeepers’, but as ‘service providers’.32 Hence, perceptions of the task at hand may differ between nurses and GPs, thereby affecting overtriage, which could be seen as a service to the callers.

We found that high-risk calls during nighttime were significantly more likely to be undertriaged than calls during day or evening. This corresponds to a study by Hayward et al, which found that patients calling during low call volume (eg, nighttime) had a higher risk of requiring secondary care within 3 days after the OOH-PC contact.17 Our study cannot elicit the mechanisms behind the increased risk of undertriage during nighttime. Fatigue of the triage professional could play a role. Moreover, triage and gatekeeping may be stricter during nighttime because of a different organisational setup with less staff and consultation capacity.

Our study may suggest that being elderly could influence the risk of undertriage in high-risk calls. This non-significant trend corresponds to prior studies, which found that increasing age was associated with unsafe triage.14 17 21 One could hypothesise that elderly patients tend to wait longer before contacting OOH-PC, which could make their calls more likely to be urgent.

Implications for future research and clinical practice

Although the risk of inaccurate triage differed between nurses and GPs, knowledge of the mechanisms behind this difference is lacking and needs further exploration. Calling during nighttime was associated with a higher risk of undertriage for high-risk patient calls, which may be related to a change in available resources and an urge to increase gatekeeping. Future studies should explore the effect of using a CDSS, of different working conditions and organisational conditions, and of different professional backgrounds on the level of undertriage and overtriage. Moreover, the influence of other patient characteristics (eg, socioeconomic factors and age) and of health complaints presented in the call are relevant themes to study in relation to the accuracy of triage. From a clinical perspective, this study suggests that triage professionals should pay extra attention when making a triage decision in calls concerning abdominal pain during nighttime, and possibly also when the call concerns an elderly patient.

Strengths and limitations of the study

A major strength was the quasi-experimental design using real-life calls as opposed to the constructed setup used in previous studies.18–20 30 31 An additional strength was the meticulous assessment process using the validated AQTT tool combined with a comprehensive rating manual.29

Our study had the following limitations. Due to the thorough assessment process, each call was assessed by one assessor. Consequently, bias due to misclassification cannot be ruled out. However, we took several precautions to ensure consistency of assessments. The assessors followed a comprehensive training course and assessments followed the meticulously developed and validated AQTT.29 Furthermore, we attempted to mask the audio recordings for information about organisation and triage model. Also, assessors were not aware of the design with both randomly selected and high-risk calls. Moreover, we dichotomised accuracy of triage into clinically relevant undertriage and overtriage, for which the AQTT showed satisfactory inter-rater agreement.29 We carefully decided on our definition of high-risk calls (ie, calls for adults ≥30 years with abdominal pain). The cut-off point for age was reached through meticulous discussions among the authors, but could be too low, thereby including a larger group of low-risk calls. Another limitation was our small sample size. In line with the main study, our power calculation was made to identify a significant difference between nurse-led and GP-led undertriage and overtriage. As only a selection of these calls was assessed as clinically relevant undertriage or overtriage, the present study lacked power to identify risk factors for undertriage and overtriage. Therefore, we explored clinically relevant associations of patient characteristics and call characteristics with inaccurate triage. Furthermore, we chose physicians as assessors of the accuracy of triage, as they were most frequently used in other studies. The decision to include only physicians in the assessment panel may have induced similar-to-me cognitive bias when assessing nurse-led triage, leading to underassessment of the accuracy of nurses’ triage decisions.

Additionally, knowledge about the time of the triage call could have resulted in bias, as the assessors may have had a different threshold for assessing a decision as accurate during nighttime versus daytime. However, as we used experienced triage physicians as assessors, who assessed a call using their clinical experience, we expect this bias to have had minor influence. Also, our study was underpowered to perform multivariate analyses, so we cannot rule out potential confounding of the associations found. Finally, we needed to exclude calls for which the level of accuracy was assessed as not applicable, as this rating could either reflect a correct performance (ie, this item correctly found not relevant) or cover a poor performance (ie, available information is insufficient for assessment).


Conclusion

This study found that high-risk calls triaged by nurses were less likely to be undertriaged and more likely to be overtriaged compared with calls triaged by GPs in OOH-PC. High-risk calls conducted during nighttime were significantly more likely to be undertriaged than those during day and evening.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants but The National Committee on Health Research Ethics in the Central Denmark Region was consulted and found that no approval was required for this study [personal communication—ref.number:125/2015]. The study was approved by the Danish Patient Safety Authority (reference number: 3-3013-1274/1). Consent was not obtained as approved by the Danish Patient Safety Authority (reference number: 3-3013-1274/1). The project (ID: 200) has been approved and is registered in the Record of Processing Activities at the Research Unit for General Practice in Aarhus in accordance with the provisions of the General Data Protection Regulation (GDPR).


The authors would like to thank the 24 assessors who participated in the assessment process and the patients who gave valuable feedback during the development of the AQTT. The authors thank the MH-1813 and the GPC organisations. Special thanks go to statistician Claus H Vestergaard for providing valuable feedback on the analyses.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Contributors All authors contributed to the development of the study protocol and design. DSG (guarantor) collected the calls, conducted the statistical analyses, and produced the first draft of the manuscript. DSG, AFP, LH, MBC, and FF (medical doctor) contributed to the interpretation of data and critically revised the manuscript. All authors contributed with proofreading of the manuscript.

  • Funding This study was supported by the Danish foundation TrygFonden (104893), Primary Health Care Research Foundation of the Central Denmark Region (Praksisforskningsfonden—1-15-1-72-13-09), the Committee for Quality Improvement and Continuing Medical Education in general practice in the Central Denmark Region (Kvalitets- og Efteruddannelses­udvalget - 1-30-72-227-15) and the Committee of Multipractice Studies in General Practice (Multipraksisudvalget - 15/1880).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.