To assess minimal medical statistical literacy in medical students and senior educators using the 10-item Quick Risk Test; to assess whether deficits in statistical literacy are stable or can be reduced by training.

Prospective observational study on the students, observational study on the university lecturers.

Charité University Medicine medical curriculum for students and a continuing medical education (CME) course at a German University for senior educators.

169 students taking part in compulsory final-year curricular training in medical statistical literacy (63% female, median age 25 years). Sixteen professors of medicine and other senior educators attending a CME course on medical statistical literacy (44% female, age range=30–65 years).

Students completed a 90 min training session in medical statistical literacy. No intervention for the senior educators.

Primary outcome measure was the number of correct answers out of four multiple-choice alternatives per item on the Quick Risk Test.

Final-year students answered on average half (median=50%) of the questions correctly while senior educators answered three-quarters correctly (median=75%). For comparison, chance performance is 25%. A 90 min training session for students increased the median percentage correct from 50% to 90%. 82% of participants improved their performance.

Medical students and educators do not master all basic concepts in medical statistics. This can be quickly assessed with the Quick Risk Test. The fact that a 90 min training session on medical statistical literacy improves students’ understanding from 50% to 90% indicates that the problem is not a hard-wired inability to understand statistical concepts. This gap in physicians’ education has long-lasting effects; even senior medical educators could answer only 75% of the questions correctly on average. Hence, medical students and professionals should receive enhanced training in how to interpret risk-related medical statistics.

The Quick Risk Test is the first test to measure minimal medical statistical literacy in physicians across disciplines.

Only a single site was included in each study.

A large student population was tested (N=169; ~60% of a cohort).

Only a small population of senior educators was tested.

No parallel instruments for convergent validity were tested at the same time.

For healthcare to be effective, medical professionals require literacy in health, the healthcare system and medical statistics. Health literacy entails basic knowledge about diseases and the ability to identify trustworthy medical and health information. Similarly, health system literacy entails basic knowledge of the healthcare system, the incentives that different players face and the effect that these can have on care (eg, defensive medicine). Finally, medical statistical literacy entails the ability to critically assess the numbers that are communicated in health information as well as basic statistical knowledge (eg, understanding of false-negative rates and false-positive rates).

Recent efforts to improve healthcare delivery have focused on decisional aspects rather than on health and medical statistical literacy. For example, physicians are urged to ensure that their care is in line with patients’ values and to transfer control over their patients’ lives to the patients themselves.

These are all crucial points that need to be addressed, yet they overlook one critical issue. Discussions about patient values require that physicians understand medical statistics, including the nature and likelihood of benefits and harms of diagnostic, intervention or treatment options, as well as the rates at which tests produce false results and the subsequent interpretation of positive and negative test results. More broadly, a healthcare system in which decisions are based on scientific evidence needs medical students and physicians who are literate in medical statistics. Physicians may well have high levels of health literacy and health system literacy yet an insufficient level of statistical literacy.

However, although frugal instruments exist to measure numeracy

The Quick Risk Test (

Quick Risk Test

Question | Possible answers |

1. A test’s sensitivity is a central criterion for its quality as a diagnostic tool. The sensitivity is | A) the proportion of people with a positive test result among those who are sick. *** |

2. A test’s specificity is a central criterion for its quality as a diagnostic tool. The specificity is | A) the proportion of people with a positive test result among those who are sick. |

3. Which test characteristic quantifies the probability that a person with a positive test result actually has the disease? | A) Positive predictive value *** |

4. Which test characteristic quantifies the probability that a person with a negative test result does not have the disease? | A) Sensitivity |

5. A medical test’s manufacturer tells you the sensitivity and the specificity of its test. You would like to tell your patient the probability that they are sick if they have a positive test result. Which measurement do you need for your calculation? | A) Mortality |

6. Mammography is often used as a screening test to detect breast cancer early. The probability that a woman has breast cancer is 1%. When a woman has breast cancer, her probability of receiving a positive mammogram is 90%. When a woman does not have breast cancer, her probability of nevertheless receiving a positive mammogram is 9%. What is the best estimate for the number of women with a positive screening mammogram who actually have breast cancer? | A) 9 in 10 |

7. In a medical publication you read that screening with mammography lowers the probability of dying from breast cancer by 20%. This number is | A) a relative risk reduction. *** |

8. A patient asks you about the benefits of cancer screening. Which criterion should you consider here? | A) 5-year survival rate |

9. Imagine two groups of people who all die of cancer at age 70. In group A, cancer is detected via screening at the age of 60. In this group, the 5-year survival rate is 100%. Group B is not screened. In this group, cancer is detected at age 68. Everyone dies at age 70. Thus, the 5-year survival rate is 0%. Which bias explains why both groups have different 5-year survival rates? | A) Selection bias |

10. A higher screening rate results in more positive diagnoses. In screening, if anomalies are discovered, which because of their extremely slow growth would never cause symptoms or an early death, this is called | A) selection bias. |

Questions and multiple-choice answers of the 10-item Quick Risk Test (*** denotes the correct answer).

The test was also administered to 16 university professors, senior physicians and lecturers in medicine, all with a special interest in medical education (referred to as senior educators below) in a continuing medical education (CME) workshop at a German Faculty of Medicine held in October 2017. This group was tested only at the beginning of the workshop and participants were therefore not specifically trained on the topic by us. Participation was also voluntary in this group. In both groups, participation in the test was not required in order to receive the university credits or CME points that could be earned by participating in the courses. All students and educators were asked whether they would like to participate, meaning that both the student and the senior educator group were convenience samples. Both groups gave informed consent before participation.

Neither patients nor the public were involved in these studies, since they concern medical students and medical professionals.

The data were mainly descriptively analysed using percentages, medians, ranges and IQRs. The item discrimination index (point-biserial correlation) was calculated to test whether the items discriminated between students of different performance levels. Finally, inferential statistics were used in the form of χ^{2} tests to test for group differences.
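The item discrimination index mentioned above can be illustrated with a minimal sketch. It assumes the usual definition of the point-biserial correlation as the Pearson correlation between a 0/1 item score and a total test score. The function name `point_biserial` and the six-respondent data set are hypothetical and are not taken from the study; the text also does not state whether the total score excluded the item being analysed, so this sketch simply uses whatever totals are passed in.

```python
from math import sqrt


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and a total test score."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 entries")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical data (not from the study): 1 = item answered correctly.
item = [1, 1, 1, 0, 0, 0]
totals = [9, 8, 7, 5, 4, 3]
r = point_biserial(item, totals)
print(f"item discrimination (point-biserial r) = {r:.2f}")
```

A value close to 1 means that respondents who answered the item correctly also tended to score higher overall; a value near 0 means the item does not separate stronger from weaker respondents.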

Among the students, 62.5% were female with a median age of 25 years (IQR=24–26) and 61.5% (n=104) completed both pretest and post-test. Among the senior educators, 44% were female with an age range of <30–65 years. We asked the senior educators to give only age ranges in order to preserve anonymity in this rather small sample. Neither group had any missing data. Final-year students answered on average half (median=50%) of the questions correctly. For comparison, chance performance is 25%. The data of students who dropped out were analysed only in the first round of the test. For the student population, the pretest median percentage (n=169) of correct responses across all 10 questions was 53.8% (IQR=44.4%–68.5%). Questions 6 and 8 (Bayes rule/mortality rate as measure of screening success) obtained the fewest correct answers (22.5% and 17.2% correct), even below chance performance (25% with four multiple-choice answers). By contrast, questions 1 and 7 (sensitivity/relative risk reduction) obtained the highest number of correct answers (79.3% and 85.2% correct) (χ^{2}=0.8, df=1, p=0.4).

The proportion of correct answers to each of the 10 questions in the Quick Risk Test, for final-year medical students as well as professors, senior physicians and university lecturers. The test measures minimal medical statistical literacy, as defined by understanding 10 basic concepts. PPV, positive predictive value; NPV, negative predictive value.

Senior educators answered on average three-quarters of the questions correctly: the median percentage correct across all 10 questions was 75% (IQR=62.5%–81.2%).

The students (n=104), but not the senior educators, then completed a 90 min training session on medical statistical literacy as part of the medical curriculum of the Charité University Medicine. The training session increased the median percentage correct from 50% to 90%. Eighty-two per cent of participants improved their performance. After the 90 min session (and an unrelated task of another 90 min), their performance improved to a median of 92.3% (IQR=83.2%–94.2%) correct answers per question (χ^{2}=300, df=1, p<2e-16). Additionally, each question obtained more correct answers after training, even the question with the smallest pre-post difference in the proportion of correct answers, namely question 7 on relative risk (χ^{2}=7, df=1, p=0.004); 81.7% of the students performed better after the training than beforehand. Whereas question 6 on estimating the PPV for mammography screening (using Bayes rule) showed substantial improvement, from 22.5% to 87.5% correct answers, question 8 (46.2% correct) on the appropriate measure of screening success (mortality rate, not 5-year survival rate) still proved to be the most difficult one. The lead-time bias and the overdiagnosis bias were also among the more difficult concepts to understand.
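Why the 5-year survival rate can mislead while the mortality rate does not can be made concrete with the scenario of question 9. The following is a minimal sketch; the function name `five_year_survival` and the choice of three patients per group are illustrative assumptions, whereas the ages at diagnosis and death follow the wording of the question.

```python
def five_year_survival(ages_at_diagnosis, ages_at_death):
    """Share of patients still alive 5 years after their diagnosis."""
    alive = [death - diagnosis >= 5
             for diagnosis, death in zip(ages_at_diagnosis, ages_at_death)]
    return sum(alive) / len(alive)


# Scenario of question 9: everyone dies of cancer at age 70.
# Group A is screened (diagnosis at 60); group B is not (diagnosis at 68).
# The group size of three is an arbitrary illustrative choice.
deaths = [70, 70, 70]
group_a_diagnosis = [60, 60, 60]
group_b_diagnosis = [68, 68, 68]

survival_a = five_year_survival(group_a_diagnosis, deaths)
survival_b = five_year_survival(group_b_diagnosis, deaths)
mean_age_at_death = sum(deaths) / len(deaths)

print(f"5-year survival, screened group A:   {survival_a:.0%}")
print(f"5-year survival, unscreened group B: {survival_b:.0%}")
print(f"mean age at death in both groups:    {mean_age_at_death:.0f}")
```

The survival rates differ (100% versus 0%) only because the clock starts earlier in the screened group; nobody lives a day longer, which is why the mortality rate is the more informative criterion.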

Both students and senior educators struggled with applying Bayes rule to identify the PPV of a diagnostic test and with concepts relevant to screening, including the lead-time bias, overdiagnosis and identifying mortality rates as the most informative criterion to quantify the benefits of screening programmes. The training session for students included teaching how to use natural frequencies instead of conditional probabilities (such as sensitivity), an effective method for understanding how to calculate the PPV.
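The natural-frequency approach can be sketched with the numbers given in question 6 (prevalence 1%, sensitivity 90%, false-positive rate 9%). The function name `ppv_natural_frequencies` and the reference population of 1000 women are illustrative assumptions and do not reproduce the actual training material.

```python
def ppv_natural_frequencies(prevalence, sensitivity, false_positive_rate,
                            population=1000):
    """Translate conditional probabilities into counts of people, then
    read the positive predictive value (PPV) off those counts."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity               # sick and test positive
    false_positives = healthy * false_positive_rate   # healthy but test positive
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Numbers from question 6 of the Quick Risk Test.
tp, fp, ppv = ppv_natural_frequencies(prevalence=0.01,
                                      sensitivity=0.90,
                                      false_positive_rate=0.09)
print(f"Of 1000 women, about {tp:.0f} have cancer and test positive.")
print(f"About {fp:.0f} do not have cancer but test positive anyway.")
print(f"PPV = {tp:.0f} / ({tp:.0f} + {fp:.0f}) = {ppv:.1%}")
```

Expressed as counts, roughly 9 of the 98 women with a positive mammogram actually have breast cancer, that is, about 1 in 10 rather than 9 in 10.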

The proportion of correct answers to each of the 10 questions in the Quick Risk Test, for final-year medical students before and after a 90 min training session in risk literacy and diagnostic risk assessment. NPV, negative predictive value; PPV, positive predictive value.

The Quick Risk Test presented here measures minimal medical statistical literacy as defined by the 10 elementary concepts. It can also be used to track performance improvement in risk literacy training. In contrast to claims that lack of statistical literacy is something we must live with, the present study shows the encouraging result that final-year medical students can greatly improve their understanding of medical statistics in as little as 90 min. Note that the training took place a week prior to the students’ final-year exams without being relevant to these exams. Student engagement was increased by using real tests selected from areas of medicine taught in the final semester (eg, gynaecology), and by dedicating the majority of the session to practice and discussion of the implications.

Although most questions and the test as a whole are able to discriminate between different levels of proficiency, this is not the main goal. Students and professionals should be able to answer all of the questions correctly and thereby demonstrate understanding of the 10 basic concepts that comprise

These results come from single-site studies with voluntary participation and thus carry a risk of selection bias. The student study did, however, assess performance on over 50% of that year’s student cohort in the final year of studies. Nevertheless, it is an empirical question whether our results generalise to other student cohorts, which will depend on students’ statistical training in individual medical schools. In German-speaking Europe, statistical literacy is very rarely taught in medical school. We therefore expect similar results at other sites, including a similarly promising learning rate among students. Further validation samples in different educational systems are planned for future studies.

One limitation of our study on the student population is that it looked solely at a retention interval of 90 min. However, the fact that students practised the use of the tools (natural frequency trees and PPV/NPV curves) on actual tests using their real statistical properties supports long-term retention of these tools. With regard to natural frequency trees, studies showed that high application accuracy is maintained in a non-medical population at follow-ups of up to 3 months.

Medical statistical literacy is insufficient among medical students

The authors would like to thank Clara Schirren for testing the senior educators and Jana Hinneburg, Felix Rebitschek and Anna Held for their input concerning the wording of the items in the Quick Risk Test.

MAJ and GG developed the Quick Risk Test. MAJ analysed the data and wrote the manuscript. NK developed and ran the intervention study with the students and revised the manuscript. GG ran the study with the senior educators and revised the manuscript.

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

None declared.

Not required.

The study protocol for the students was approved by the Charité University Medicine’s ethics committee (ID: EA4/067/15) and the study protocol for the senior educators was approved by the ethics committee at the Max Planck Institute for Human Development (ID 19102017).

Not commissioned; externally peer reviewed.

No additional unpublished data are available from the study.