- hospital medicine
- medical decision-making
- patient-physician communication
- physician behaviour
- shared decision-making
Strengths and limitations of this study
The study comprises a large material of video-recorded patient–physician encounters including 17 different clinical specialties and three practice settings (outpatients, inpatients on the ward and emergency room).
Statistical analyses of decisions within various categories were performed by estimating linear mixed models accounting for random and fixed effects to ensure that observed differences were not attributable to significant clustering at doctor level.
The study was conducted by applying a novel taxonomy that identifies and classifies clinically relevant decisions in a substantially broader way than previous studies describing the number of decisions in medical encounters.
The encounters were recorded at a single hospital over a limited time period, and the taxonomy has not been tested in general practice or psychiatry.
Decision making is a key activity—perhaps the key activity—in healthcare.1 Alvan Feinstein’s 1967 harbinger ‘Clinical Judgment’2 spawned a body of research and theory that has advanced the field of decision making in healthcare.1 3–7 Feinstein later concluded8 that the field’s emphasis on quantitative models derived from non-clinical sources had left central challenges on how decisions are made at the bedside or in the clinic open for pursuit.
In the context of patient–physician encounters, decision-making processes result in diagnoses, choice of treatment, selection of tests, provision of relevant information and scheduling of follow-up—or the decision to do nothing. Traditionally, these decisions have been made by the physician. In recent decades, these decisions—that govern how resources and time are invested in the care of patients—are all under increasing pressure to live up to normative standards like evidence-based medicine (EBM), patient-centred care, patient safety culture and provider professionalism.
In both research and clinical practice, the focus has often been on single decisions related to a specific context. In EBM, the aim is to formulate an answerable question, search the literature, critically appraise the information and build the decision-making process around best available evidence together with patient values and preferences.9 Patient safety programmes select key triggers identifiable as the cause of adverse events, with the aim of flagging them for prescriptive measures.10 11 In the context of patient-centred care, decisions are increasingly framed within a shared decision making (SDM) paradigm. Research and implementation of SDM often target single decisions related to a specified, predetermined topic, focusing on difficult decisions with two or more options that patients may weigh differently.12–14
Only a handful of studies have attempted to describe the frequency and types of decisions that are made in medical encounters.15–19 These studies all aimed to assess the level of patient involvement in decision making. In two of the studies, Braddock et al 15 defined a medical decision as ‘a verbal statement committing to a particular course of action’. This definition is broad, including actions leading to diagnostic tests, prescriptions, referrals and instructions regarding diet and physical activity. However, it does not capture decisions that govern the subsequent ‘courses of action’, such as evaluations of findings and tests, and interpretations concerning diagnosis, prognosis and aetiology.
Decision scientists20 21 describe ‘problem solving’ and ‘decision-making’ as two separate cognitive processes, and in theory this is a sensible distinction. However, ‘problem-solving’ in medicine often involves ‘decision-making’, best illustrated by the fact that diagnostic conclusions seldom reveal themselves; they have to be produced by someone.22 Often, the path to diagnostic judgements and therapeutic actions presents options that require decision making and, due to both medical and contextual complexity, leaves room for interpretation.23
Our starting point was that normative and prescriptive approaches to clinical decision making need a descriptive framework for identification and classification of clinical decisions that is precise, detailed and exhaustive. In other words, before one can assess the quality of a clinical decision, one must know what the decision is and what it is based on. In a previous study, we developed a taxonomy for identifying and classifying all clinically relevant decisions, both judgements and actions.24 25 Building on the work by Braddock et al, we defined a clinically relevant decision as ‘a verbal statement committing to a particular course of clinically relevant action and/or statement concerning the patient’s health that carries meaning and weight because it is said by a medical expert’.25 We applied this definition and the taxonomy to 372 videotaped hospital encounters in order to identify and classify all clinical decisions that emerged in hospital-based patient–physician encounters and to compare different categories of decisions across clinical settings and personal characteristics.
The process of establishing a sensitive definition of a decision in a clinical context, the identification of decisions and the development of a novel taxonomy has been described in detail elsewhere.24 25 The analytic process was informed by the three prototypical strategies for qualitative research, as described by Crabtree and Miller.26 The two fundamental questions describing the core process of the first of the three methods coincide with our initial research questions (in brackets):
What are the content and constituent elements (of clinically relevant decisions)?
When does it (a clinically relevant decision) begin?
Our choice to broaden the definition of clinical decisions was based on three criteria: all decisions (1) must require some element of medical judgement; (2) must relate to the actual patient’s concrete situation (ie, are distinct from general medical information); and therefore (3) represent important conclusions relevant for the patient to understand and remember, even if not presented as decisions as such. We chose these criteria with the clear aim of describing the medical decisional landscape as it is presented to patients in face-to-face interactions with physicians.
We built a taxonomy with two dimensions: a topical dimension with 10 categories and a temporal dimension with three categories (see table 1). The taxonomy was named DICTUM, or the Decision Identification and Classification Taxonomy for Use in Medicine (a full and updated version of the codebook is available at www.ocher.no/resources/dictum).
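The two dimensions can be pictured as a small data structure. The following is a minimal sketch, not the DICTUM codebook itself: the category labels are the ones named in this article, while the class names (`TopicalCategory`, `TemporalCategory`, `CodedDecision`), the ordering of the categories and the example statement are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TopicalCategory(Enum):
    """The 10 topical categories, labelled as in this article (order is arbitrary here)."""
    GATHERING_ADDITIONAL_INFORMATION = "gathering additional information"
    EVALUATING_TEST_RESULT = "evaluating test result"
    DEFINING_PROBLEM = "defining problem"
    DRUG_RELATED = "drug-related"
    THERAPEUTIC_PROCEDURE_RELATED = "therapeutic procedure-related"
    LEGAL_AND_INSURANCE_RELATED = "legal and insurance-related"
    CONTACT_RELATED = "contact-related"
    ADVICE_AND_PRECAUTION = "advice and precaution"
    TREATMENT_GOAL = "treatment goal"
    DEFERMENT = "deferment"


class TemporalCategory(Enum):
    """The three temporal categories."""
    PREFORMED = "preformed"        # made before the encounter started
    HERE_AND_NOW = "here-and-now"  # made during the encounter
    CONDITIONAL = "conditional"    # depends on a future course of events


@dataclass(frozen=True)
class CodedDecision:
    """One coded statement: every decision gets exactly one label per dimension."""
    statement: str
    topical: TopicalCategory
    temporal: TemporalCategory


# Invented example statement, not taken from the study material.
example = CodedDecision(
    statement="We will start a blood pressure tablet today.",
    topical=TopicalCategory.DRUG_RELATED,
    temporal=TemporalCategory.HERE_AND_NOW,
)
print(example.topical.value, "/", example.temporal.value)
```

Modelling each dimension as a closed enumeration mirrors the idea that a coder must pick one topical and one temporal label per decision.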
Available for our study by broad consent were 380 video-recorded patient–physician encounters collected during 2007–2008 as a part of a randomised controlled trial (RCT) to evaluate the effect of a 20-hour communication skills course.27 The original RCT comprised 497 encounters, and for 380 of these, both patient and physician provided written consent for the video to be available for other communication studies until 2020. In the remaining 117 encounters, either the patient, the physician or both limited the written consent to the RCT only. The physicians were randomly drawn from all physicians under 60 years of age working in non-psychiatric clinical departments. Patients were recruited consecutively on the days the participating physicians were available. While the patients and physicians gave broad consent to further studies of communication, they were unaware of our subsequent focus on identification and classification of decisions.
Analysis of the encounters was done through direct observation of the videotapes. Before formal coding began, we evaluated how consistently we were able to use the taxonomy as a team. Using a maximum variation approach,28 we selected sets of five videos from different clinical settings and specialties, with variation in gender and age in both patients and physicians. The four researcher/physicians coded independently, and this process was repeated three times, resulting in minor adjustments to taxonomy categories the first two times and reaching satisfactory consistency on a final version the third time. We tested reliability using Krippendorff’s alpha agreement for content coding with multiple coders29 and coded a final set of five new videos resulting in a Krippendorff’s alpha of 0.79. For coded variables to be reliable, cut-off value for Krippendorff’s alpha has been set at 0.80.29 Using the categories of the taxonomy, we created a coding scheme in the observation software ‘Observer XT’ (Noldus Information Technology, Wageningen, The Netherlands). All 372 videos were coded by EHO. Every 20th video was coded independently by PG to check for drift. Two-coder inter-rater reliability was good (Cohen’s kappa of 0.61). Intra-rater reliability for EHO, who coded five videos sampled with maximum variation 1 year after the initial coding, was good (Cohen’s kappa 0.77).
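For readers unfamiliar with the two-coder statistic reported above, the following is a minimal sketch of unweighted Cohen's kappa in plain Python. The function name `cohens_kappa` and the toy labels are assumptions for illustration; they are not the study's data, and the article does not state which software computed the reported values.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels (invented): the coders disagree on one of four statements.
coder_1 = ["drug-related", "drug-related", "deferment", "deferment"]
coder_2 = ["drug-related", "deferment", "deferment", "deferment"]
print(cohens_kappa(coder_1, coder_2))  # 0.5
```

Kappa corrects raw agreement (here 3 of 4 statements) for the agreement expected by chance given how often each coder uses each label.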
Once coding was completed, we calculated simple descriptive statistics30 using IBM SPSS Statistics V.34. In the analysis, patients and physicians were stratified according to gender, relevant age groups, specialty of physician and type of encounter. The data exhibit a hierarchical structure, with decisions nested within the doctor and the doctor nested within the specialty. The number of decisions within various categories was thus compared by estimating linear mixed models with random effects for doctors nested within specialty or for doctors only. Akaike’s information criterion (AIC)31 was applied to choose the best model with respect to random effects. The distribution of the number of decisions across three temporal categories in three different settings was compared by estimating a linear mixed model with fixed effects for temporal category, setting and the interaction between the two. The model assessing the number of decisions within each topical category contained fixed effects for settings. The differences in the average number of decisions between various categories of characteristics of patients and doctors were assessed by first estimating a bivariate linear mixed model for the number of decisions with a fixed effect for the relevant characteristic. Next, a multiple model was estimated. As judged by AIC, a model with random intercepts for doctors only fitted the data best; hence, specialty was included in the model as a fixed effect instead. All linear mixed models were estimated by the SAS MIXED procedure using SAS V.9.4.
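One way to write down the random-intercept model described above is sketched below. The notation is an assumption introduced here for illustration (the article gives no formulas): y_{ij} is the number of decisions in encounter i of doctor j, x_{ij} collects the fixed-effect covariates and u_j is the doctor-level random intercept.

```latex
% Random intercept for doctors only (the structure preferred by AIC in the text)
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)

% Alternative with doctors j nested within specialty k
y_{ijk} = \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + v_k + u_{jk} + \varepsilon_{ijk},
\qquad v_k \sim \mathcal{N}(0, \sigma_v^2)

% Akaike's information criterion for a fitted model with p estimated
% parameters and maximised likelihood \hat{L}; the lower value is preferred
\mathrm{AIC} = 2p - 2\ln\hat{L}
```

Under this reading, dropping the specialty-level random effect and entering specialty among the covariates in x_{ij} corresponds to the model choice reported in the text.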
Of 103 invited physicians, 71 (69%) consented to participate in the original trial and 59 (57%) provided broad consent. Of 553 patients approached, 519 (94%) agreed to have their encounter videotaped for the original study and 445 (80%) provided broad consent.32 In 65 of the encounters where patients had provided broad consent, the physicians had not, leaving a total corpus of 380 videotaped encounters available for analysis. Of these, eight were excluded from the final analysis: one encounter was incompletely captured (showing only six of 53 min), and one physician whose seven encounters all exceeded 90 min was excluded, as this practitioner represented an extreme outlier. We further analysed 372 videotapes, which contained 4976 decisions. The average number of decisions per encounter was 13.4, min–max 2–40, SD 6.8.
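The participant flow and the headline average can be re-derived from the counts given above. This is a small arithmetic check using only figures stated in the text; the variable names are introduced here for illustration.

```python
# Counts as reported in the text.
patients_with_broad_consent = 445
encounters_without_physician_consent = 65
excluded_encounters = 1 + 7  # one incomplete recording + seven from one outlier physician
total_decisions = 4976

available = patients_with_broad_consent - encounters_without_physician_consent
analysed = available - excluded_encounters
mean_decisions = total_decisions / analysed

print(available, analysed, round(mean_decisions, 1))  # 380 372 13.4
```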
Characteristics of participants and encounters
The characteristics of physicians and patients are shown in table 2. The average duration of the 372 encounters was 22 min (min–max 3–66). In 87 (23%) of the 372 encounters, communication was observed as challenging because the patient was either a child or an immigrant with limited Norwegian fluency. In three encounters, the patient was a child with immigrant parents with limited Norwegian fluency.
The online appendix table shows that categories 1–19 and 21 of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision33 were present in the material, with diseases of the circulatory system (13%) and neoplasms (10%) being most frequent. Of the 372 encounters, 81 (22%) contained a clinical procedure covered by the Norwegian classification of surgical and medical procedures, the most frequent being obstetrical or gynaecological ultrasound (27%) and echocardiography (21%).
Characteristics of clinical decisions
Table 3 shows the distribution of decisions across the taxonomy’s 10 topical categories. The two categories identifying clinical judgements, namely ‘defining problem’ and ‘evaluating test result’, together accounted for 47% of decisions, and were also the two categories present in the largest proportion of encounters (95% and 78%, respectively). Decisions categorised as ‘drug-related’, ‘contact-related’, ‘gathering additional information’ or ‘advice and precaution’ were frequently present in a majority of the encounters. The less frequent categories, ‘therapeutic procedure-related’, ‘deferment’, ‘legal and insurance-related’ and ‘treatment goal’, together accounted for 12% of the decisions but were present in 38%, 35%, 18% and 15% of encounters, respectively.
Table 4 presents the distribution of topical and temporal categories by clinical setting. Decisions made here-and-now were the most frequent in all settings, but as many as 39.3% of the decisions conveyed on ward rounds (WRs) had been made before the encounter started. The proportion of preformed decisions was significantly higher in these encounters than in the other two settings (P<0.001). Emergency room (ER) encounters contained a significantly larger proportion of decisions in the category ‘gathering additional information’ compared with outpatient (OP) and WR encounters (P<0.001) and a significantly smaller proportion of ‘defining problem’ statements compared with WR encounters (P=0.028). WR encounters comprised a significantly larger proportion of ‘drug-related’ decisions than OP encounters (P=0.031). OP encounters contained a significantly larger proportion of ‘advice and precaution’ statements than ER encounters (P=0.035). There were no significant differences in proportions between the three settings in the other topical categories. With regard to temporality, the topical categories ‘evaluating test result’, ‘defining problem’ and ‘drug-related’ accounted for 78% of the preformed decisions, while ‘drug-related’, ‘contact-related’, ‘advice and precaution’ and ‘therapeutic procedure-related’ statements made up 77% of the conditional decisions.
Table 5 shows the average number of decisions per encounter distributed across gender, age, setting and specialty with corresponding 95% CI. According to the multiple linear mixed model, there were no significant differences for patient or physician gender, age or setting. Female physicians communicated 14.7 decisions per encounter, while male physicians communicated 12.7 (P=0.053). Compared with internists, who had on average 15.7 decisions per encounter, ear–nose–throat (ENT) physicians and obstetrics and gynaecology physicians communicated significantly fewer decisions: 7.1 (P=0.006) and 11.0 (P=0.023), respectively. Compared with ENT physicians, neurologists and paediatric physicians communicated significantly more decisions: 13.6 (P=0.029) and 13.4 (P=0.041), respectively. Besides internists and ENT physicians, the remaining six groups of hospital specialists had on average between 11.1 and 13.6 decisions. Of the 628 ‘drug-related’ decisions, 299 were found in the 121 internal medicine encounters, meaning an average of 2.5 (SD=2.3) ‘drug-related’ decisions per encounter, compared with an average of 1.3 (SD=1.9) in the other specialties combined (P=0.002).
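The two ‘drug-related’ averages follow directly from the counts in the paragraph above. The short check below uses only those reported counts; the variable names are assumptions introduced for illustration, and the SDs and P value cannot be reproduced without the encounter-level data.

```python
# Counts as reported in the text.
drug_related_total = 628
drug_related_internal_medicine = 299
encounters_total = 372
encounters_internal_medicine = 121

mean_internal_medicine = drug_related_internal_medicine / encounters_internal_medicine
mean_other_specialties = (drug_related_total - drug_related_internal_medicine) / (
    encounters_total - encounters_internal_medicine
)

print(round(mean_internal_medicine, 1), round(mean_other_specialties, 1))  # 2.5 1.3
```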
Figure 1 illustrates the average number of decisions communicated by each physician in their encounters (2–8 encounters per physician). The three physicians who averaged the highest (29.5, 23.5 and 23.3, respectively) were women. The remaining physicians averaged between 6.7 and 20.5 decisions. The range of decisions per encounter varied substantially from physician to physician; the smallest range was 5 (9–14) and the largest was 29 (11–40).
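The per-physician averages and ranges described above can be derived from encounter-level counts along the following lines. This is a sketch under stated assumptions: the function name `summarise_by_physician`, the physician identifiers and the counts are all invented for illustration (only the two ranges, 9–14 and 11–40, echo values mentioned in the text); they are not the study data.

```python
from statistics import mean


def summarise_by_physician(counts_by_physician):
    """Return mean, min, max and range of decisions per encounter for each physician."""
    summary = {}
    for physician, counts in counts_by_physician.items():
        lowest, highest = min(counts), max(counts)
        summary[physician] = {
            "encounters": len(counts),
            "mean": round(mean(counts), 1),
            "min": lowest,
            "max": highest,
            "range": highest - lowest,
        }
    return summary


# Invented decision counts per encounter for two hypothetical physicians.
toy_counts = {
    "physician_A": [9, 11, 14],
    "physician_B": [11, 25, 40],
}
for physician, stats in summarise_by_physician(toy_counts).items():
    print(physician, stats)
```

The mean across a physician's encounters corresponds to the interphysician comparison, while the range within one physician's encounters corresponds to the intraphysician variability.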
We set out to identify and classify all clinically relevant decisions communicated in 372 hospital encounters using the novel taxonomy DICTUM.24 We found that patients, on average, were exposed to more than 13 medically relevant decisions per patient–physician encounter. The encounters in this study were representative of everyday activity in non-psychiatric clinical departments in a large Norwegian hospital. Across topical categories, decisions were diverse; although diagnostic decisions predominated, almost half were of other kinds. Across temporal categories, the majority of decisions were made in the present, but a substantial number were brought into the encounter as new information, or presented as conditional, depending on future trajectories. With the exception of internal medicine and ENT encounters, we found only minor differences among disciplines. Also, decision frequencies were not associated with patient or physician characteristics. Could this resemblance between specialties and physicians indicate that DICTUM captures a general structure of how decisions are communicated in medical encounters?
Observed differences, for example, a higher frequency of preformed decisions in ward rounds, a lower total frequency in ENT encounters, more ‘gathering information’ decisions in ER encounters and more ‘drug-related’ decisions in internal medicine encounters, are all findings that could be expected from these different clinical contexts. WR encounters are commonly preceded by chart review, huddles or formal meetings where providers, either alone or as a team, make judgements and decisions without the patient present. ENT encounters commonly deal with only one concern. In ER encounters, the diagnostic process is at its earliest and gathering additional information through tests or consulting with a colleague or a next of kin is what drives the process forward. Internists deal with more drug-related decisions, partly because their patients often have several previous medications in need of review and partly because diseases cared for by internists frequently have the potential for improvement or prevention through pharmaceutical therapy.
The difference between male and female physicians represents two decisions per encounter; however, this difference was not statistically significant, and we are not convinced that the difference is of clinical significance. On the individual level, however, the averages and ranges of decisions varied greatly, also within disciplines. Illustrated by averages and ranges, respectively, figure 1 shows large interphysician and intraphysician variability: the first possibly reflecting each physician’s communication style, and the latter possibly associated with the patient’s communication style and the relevant clinical context.
One may challenge our definition of decisions. Previous studies of decisions in patient–physician encounters have reported substantially lower frequencies, varying between on average three and seven decisions per encounter in five different studies.15–19 Each of these studies identified decisions with the aim of describing patient involvement in decisions. These studies did not include diagnostic decisions (covered by our first three categories); if diagnostic decisions are subtracted from our material, our findings align with the findings from previous studies. The inherent elements of medical encounters that we have defined as diagnostic decisions have, in previous studies, been framed as clinical questions that physicians attempt to answer. Ely et al developed a taxonomy of clinical questions to assess how physicians deal with the challenges of treatment, choice of tests and also diagnosis, prognosis and aetiology, by building their framework around clinical questions instead of the judgements and decisions that produce the answers.34 35 DICTUM may help studies on how physicians and patients deal with and answer these clinical questions in dialogue.
A detailed and exhaustive description of clinical decisions, as they appear to patients in medical encounters, could aid clinical studies and assessments of real-life practice with normative or prescriptive aims. DICTUM offers the possibility of assessing all points in time where decisions are communicated. The basis of diagnoses, aetiology, prognoses, care plans, follow-up, use of time and resources can all be scrutinised with a normative approach on provider or system level. Additional relevant data would be necessary to distinguish between desired standard and substandard medicine. Such data, for example, patient or physician surveys or interviews, patient chart reviews or peer review of encounters, could be collected at the time of decision making and also followed up at a later stage. For inpatient care, an observation framework exceeding the duration of the patient–physician encounter could shed light on which and how decisions are made when the patient is not present—decisions that we in this study observe are presented to patients as information (‘preformed decisions’).
Introducing physicians and patients to the DICTUM taxonomy before a clinical encounter might affect how decisions are made and communicated. Discussing the observed decisions with physicians and patients after the encounter could provide insight into the lapses in comprehension, meaning and implications of the information shared during the encounter. Providers and institutions strive to deliver high-quality care, increasingly focusing on evidence, patient preferences, safety, efficiency and use of resources. Raising awareness around which decisions need to be made, how they are made and who should make them may not have causal effect on performance, but it would put the punctuation marks of care out in the open.
There are several limitations to our study. The study was conducted applying a novel taxonomy that identifies and classifies clinically relevant decisions in a substantially broader way than previous studies describing the number of decisions in medical encounters.
The taxonomy has not been tested in general practice or psychiatric practice, nor in hospitals other than the one in our study. From an observer perspective, we could not always determine with certainty whether a decision had been made before the encounter or was made there and then. In cases where we were in doubt, we coded the decisions as being made in the present. We have studied videotaped material collected over a limited period of time. Although there may be cultural differences that vary over time, between hospitals, regions and countries, and with how healthcare is financed and legislated, we argue that the taxonomy captures a universal structure of how decisions are communicated in meetings between patients and physicians. Use in other settings is needed to further evaluate the taxonomy’s applicability, reliability and validity.
Patient–physician encounters contain a larger number of clinical decisions than described in previous studies. Comprehensive descriptions of how decisions both as judgements and actions are communicated in encounters may serve as a first step in assessing clinical practice with respect to efficiency and quality on a provider or system level.
We would like to thank Bård Fossli Jensen for recording the majority of the videotaped encounters, and Jennifer Gerwing for her contributions to the final version of the manuscript.
EHO and PG contributed equally.
Contributors EHO and PG contributed equally to this study. PG conceived the study and put together the study group. EHO analysed the first 30 videos and selected statements to be discussed in the study group. EHO, JCF, ES and PG took part in all seven group meetings, and all four authors independently analysed the 20 videos for inter-rater reliability measurements. Because of a language barrier, RMF did not take part in the analysis of the videos, but transcribed and translated statements were presented to RMF during the analytic phase. EHO analysed 372 videos. PG analysed every 20th of these videos to check for inter-rater drift. EHO and PG analysed the data with simple descriptive statistics. JSB performed multilevel statistical analyses. EHO, JCF, ES, RMF, JSB and PG analysed the data and reviewed the manuscript for its intellectual content. All authors had full access to all the data and take responsibility for the integrity of the data and accuracy of the analysis. EHO is guarantor.
Funding This project is funded by South Eastern Norway Regional Health Authority (grant number 2010003).
Disclaimer The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Competing interests None declared.
Ethics approval The study was approved by the Regional Ethics Committee for Medical Research of South-East Norway (1.2009/1415).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data available.