Objectives Most UK medical programmes use aptitude tests during student selection, but large-scale studies of predictive validity are rare. This study assesses the UK Clinical Aptitude Test (UKCAT: http://www.ukcat.ac.uk), and 4 of its subscales, along with individual and contextual socioeconomic background factors, as predictors of performance during, and on exit from, medical school.
Methods This was an observational study of 6294 medical students from 30 UK medical programmes who took the UKCAT from 2006 to 2008, for whom selection data from the UK Foundation Programme (UKFPO), the next stage of UK medical education training, were available in 2013. We included candidate demographics, UKCAT (cognitive domains; total scores), UKFPO Educational Performance Measure (EPM) and national exit situational judgement test (SJT). Multilevel modelling was used to assess relationships between variables, adjusting for confounders.
Results The UKCAT, as a total score and in terms of the subtest scores, has significant predictive validity for performance on the UKFPO EPM and SJT. UKFPO performance was also affected positively by female gender, maturity, white ethnicity and coming from a higher social class area at the time of application to medical school. An inverse pattern was seen for a contextual measure of school, with those attending fee-paying schools performing significantly more weakly on the EPM decile, the EPM total and the total UKFPO score, but not the SJT, than those attending other types of school.
Conclusions This large-scale study, the first to link 2 national databases (UKCAT and UKFPO), has shown that UKCAT is a predictor of medical school outcome. The data provide modest supportive evidence for the UKCAT's role in student selection. The conflicting relationships of socioeconomic contextual measures (area and school) with outcome add to wider debates about the limitations of these measures, and indicate the need for further research.
Strengths and limitations of this study
This large-scale study is the first to link two national databases—the UK Clinical Aptitude Test (UKCAT) and the UK Foundation Programme (UKFPO).
This is the first study examining the predictive validity of the UKCAT and the impact of participants' sociodemographic characteristics against national performance indicators on exit from medical school.
Only medical students who were accepted into medical school and were in final year due to graduate in 2013 are included in this study.
The analysis considers only simple predictor–outcome correlations.
The reliability of the Educational Performance Measure (EPM) component of the UKFPO is unreported due to its diverse components, so we are cautious in interpreting our findings.
Assessing the predictive validity and reliability of any medical school selection tool, to ensure it measures what it claims to measure, and does so robustly and fairly, is an important issue in terms of ensuring the best applicants are selected into medicine.1–4 However, the holy grail of an objective yet comprehensive assessment for medical student selection remains elusive. Prior academic attainment, the traditional basis for medical school selection, is now recognised as insufficient given grade inflation5 and clear evidence of the impact of sociodemographic factors on educational attainment and academic progression.6 There is also a growing realisation that there is more to being a capable medical student or doctor than academic performance.7–9 These factors have led to the development of a range of innovative selection methods such as structured multiple mini interviews (MMIs),10 ,11 selection centres,12 situational judgement tests (SJTs)13 ,14 and aptitude tests.15
In respect of one of these methods, aptitude tests, a recent systematic review of the effectiveness of selection methods in medical education and training concludes that there is a lack of consensus among researchers on their usefulness due to mixed evidence supporting their predictive validity and fairness.2 This conclusion reflects the current situation with the UK Clinical Aptitude Test (UKCAT), the aptitude test used in the selection process at the majority of UK medical schools.
Predictive validity studies have focused primarily on the relationship between UKCAT scores and performance on individual medical school examinations.16–20 These studies have reached varying conclusions: from suggesting that the test provides no significant prediction19 to claims of significant predictive ability.20 These contradictory results may be due to relatively low numbers, and hence limited statistical power, and/or be associated with the outcome measures used, rather than the UKCAT itself (the psychometric properties of local assessment are rarely presented by authors, so there is no way of knowing if these are themselves robust). Moreover, the generalisability of findings from small-scale studies is limited as medical schools in the UK differ in terms of their curricula, the length of programmes and the timing and nature of student assessments. A recent study, which aimed to overcome some of these limitations by examining the relationship between UKCAT scores and first year performance across 12 UK medical schools, found that UKCAT scores were significant predictors of outcome.21 These findings are reassuring but clearly indicate the need for even larger scale studies to see the national picture given the differences across schools, as well as follow-up into later undergraduate years and postgraduate training.
The small number of studies which focus on the fairness of the UKCAT also provide conflicting results. James et al22 found that men, white students, those from higher socioeconomic status (SES) backgrounds and those with independent or grammar schooling gained higher marks in the UKCAT (in the first cohort from 2006), and Tiffin et al23 found that male sex was more important in predicting the UKCAT scores. Turner and Nicholson24 concluded that the UKCAT can facilitate the independent selection of appropriate candidates for interview (in that students with lower UKCAT scores are independently less likely to be selected for interview) but that it is not predictive of success at interview. Tiffin et al25 found that the UKCAT may lead to more equitable provision of offers to those applying to medical school from lower sociodemographic groups. However, Lambe et al26 found that students from non-independent schools are disadvantaged in terms of UKCAT results and preparation for applying to medical school. One possible reason for this is the differing ways in which the UKCAT is used within the medical school selection process.27 However, notwithstanding this variation in usage, whether or not the UKCAT may disadvantage some candidate groups requires further investigation.
In short, notwithstanding research efforts that are considerable in comparison to those devoted to other aptitude tests and most other selection instruments, the predictive validity and fairness of the UKCAT remain unclear. Investigating these issues robustly is now possible because of three factors. First, given that the UKCAT was introduced in 2006, students for whom the UKCAT was part of the selection process are now graduating. Second, the introduction of a standardised national process for selection into the stage of medical training in the UK which immediately follows medical school, the Foundation Programme (UKFPO) (http://www.foundationprogramme.nhs.uk), enables studies examining the predictive validity of the UKCAT in relation to performance on exit from medical school. Finally, there is enthusiasm within the UK to work across agencies to integrate large-scale databases, which will enable a range of large-scale longitudinal studies addressing questions of predictive validity and fairness in medical selection and training in the UK.
The aim of this paper is twofold. First, to evaluate the predictive validity of the UKCAT in relation to the UKFPO selection process measures across a large number of diverse medical schools and programmes. Second, to examine the relationships between a broad range of sociodemographic factors and performance on the UKFPO selection process measures.
Our sample was the cohort of UK medical students due to graduate in 2013 from 30 medical programmes that used the UKCAT in their 2006, 2007 and 2008 selection processes. Students may have sat the UKCAT in different years as they may have deferred entry, been on a 4-year, 5-year or 6-year course or repeated one or more years of medical school. This was the first cohort for whom both UKCAT and UKFPO indicators were available. All students applying to medical schools in the UK had to sit the UKCAT in these years (including international students) unless it was impractical for them to travel to their nearest test centre. The ways in which medical schools used the UKCAT can be found in Adam et al.27
Working within a data safe haven (to ensure adherence to the highest standards of security, governance and confidentiality when storing, handling and analysing identifiable data), routine data held by UKCAT and UKFPO were matched and linked.
The following demographic and pre-entry scores were collected: age on admission to medical school; gender; ethnicity; type of secondary school attended (ie, whether fee paying or non-fee paying); indicators of SES or socioeconomic classification (SEC), including Index of Multiple Deprivation (IMD) which is based on postcode and where lower quintile scores represent more affluent areas, and the National Statistics Socioeconomic Classification (NSSEC) which is based on parental occupation; domicile (UK, European Union (EU) or overseas); and academic achievement prior to admission, out of a maximum of 360 points (the Universities and Colleges Admissions Service (UCAS) tariff, which is a UK-wide means of differentiating students based on grades from various qualifications, in use since 1992 for all British university course applications, see http://web.ucas.com/). Finally, the UKCAT scores were collected for entry into data analysis. The UKCAT cognitive scores are used in the medical selection process (a separate UKCAT non-cognitive domain was included for research purposes only9). The UKCAT total (cognitive) score comprises four cognitive domains: abstract reasoning (AR), decision analysis (DA), quantitative reasoning and verbal reasoning (VR). Each has a maximum score of 900 points, and the four are summed to give the total UKCAT score out of 3600 points.
The outcome measures were UKFPO selection scores. All medical students studying in the UK who wish to enter the UKFPO obtain two indicators of performance: an Educational Performance Measure (EPM) and the score they achieve for a SJT.
The EPM is a decile ranking (within each school) of an individual student's academic performance across all years of medical school except the final year, plus additional points for extra degrees, publications, etc. The total EPM score is based on three components, with a combined score of up to 50 points:
Medical school performance by decile (scored as 34–43 points);
Additional degrees, including intercalation (up to 5 points);
Publications (up to 2 points).
The UKFPO SJT is also scored out of 50 points. The SJT30–33 is a type of aptitude test that assesses judgement required for solving problems in work-related situations. The SJT focuses on key non-academic criteria deemed important for junior doctors on the basis of a detailed job analysis (commitment to professionalism, coping with pressure, communication, patient focus and effective teamwork14 ,30). It presents candidates with hypothetical and challenging situations that they might encounter at work, and may involve working with others as part of a team, interacting with others and dealing with workplace problems. In response to each situation, candidates are presented with several possible actions (in multiple choice format) that could be taken when dealing with the problem described. It is administered to all final year medical students in the UK as part of the foundation programme application process, is taken under examination conditions, and consists of 70 questions to be answered in 2 hours 20 min (http://www.foundationprogramme.nhs.uk/pages/medical-students/SJT-EPM).
The EPM and SJT are summed to give the UKFPO score out of 100. No reliability score is available for the composite UKFPO.
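The scoring scheme just described can be written out as a short calculation. The Python sketch below is illustrative only: the function names and the range checks are assumptions introduced here, not part of any UKFPO tooling, and the point ranges are simply those stated in the text.

```python
def epm_total(decile_points: int, degree_points: int, publication_points: int) -> int:
    """Combine the three EPM components described in the text (maximum 50)."""
    if not 34 <= decile_points <= 43:
        raise ValueError("decile points must lie between 34 and 43")
    if not 0 <= degree_points <= 5:
        raise ValueError("additional degree points must lie between 0 and 5")
    if not 0 <= publication_points <= 2:
        raise ValueError("publication points must lie between 0 and 2")
    return decile_points + degree_points + publication_points


def ukfpo_total(epm: float, sjt: float) -> float:
    """Sum the EPM (34-50) and the SJT (0-50) to give the UKFPO score out of 100."""
    if not 34 <= epm <= 50:
        raise ValueError("EPM must lie between 34 and 50")
    if not 0 <= sjt <= 50:
        raise ValueError("SJT must lie between 0 and 50")
    return epm + sjt


if __name__ == "__main__":
    # Hypothetical student: 38 decile points, 4 degree points, 1 publication point.
    epm = epm_total(38, 4, 1)
    print(epm, ukfpo_total(epm, 40.0))
```

Because the EPM cannot fall below 34, the combined UKFPO score in this sketch ranges from 34 to 100 rather than from 0 to 100.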
All data were analysed using SPSS V.22.0. Pearson's or Spearman's rank correlation coefficients (depending on distribution) were used to examine the relationship between each outcome (SJT score and EPM) and continuous factors such as UKCAT scores, preadmission academic scores and age. Owing to the large sample size, even small correlation coefficients were likely to reach statistical significance. Therefore, in terms of practical interpretation of the magnitude of a correlation coefficient, we defined, a priori, low/weak correlation as r=0.00–0.29, moderate correlation as r=0.30–0.49 and strong correlation as r≥0.50. Two-sample t-tests, analysis of variance, Kruskal-Wallis or Mann-Whitney U tests were used to compare UKFPO indices across levels of categorical factors as appropriate.
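The a priori bands for interpreting correlation magnitude can be expressed as a small helper. This is a sketch rather than the authors' procedure: the function name is hypothetical, applying the bands to the absolute value of r is an assumption (the text gives positive ranges only), and so is placing the cut-points at exactly 0.30 and 0.50 for unrounded coefficients.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation coefficient using the bands stated in the text.

    Assumption: the bands apply to |r|, so negative correlations are
    labelled by magnitude.
    """
    magnitude = abs(r)
    if magnitude > 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if magnitude >= 0.50:
        return "strong"
    if magnitude >= 0.30:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # Example coefficients; the labels follow the bands above.
    for r in (0.208, 0.30, 0.64):
        print(r, correlation_strength(r))
```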
Multilevel linear models were constructed to assess the relationship between the independent variables of interest: UKCAT cognitive total, UKCAT cognitive domains and demographic variables with each of the four outcomes (SJT, EPM decile, EPM total and UKFPO total). Fixed-effects models were fitted first and then random intercepts and slopes were introduced using maximum likelihood methods. Models were adjusted for identified confounders such as gender, age at admission, IMD quintiles, year the UKCAT examination was taken and whether or not the student attended a fee-paying school. Note that NSSEC and ethnicity were dropped from the models due to issues with non-convergence—this could occur for a variety of reasons and it is not always clear why, although one of the more common reasons is presence of tiny numbers in subgroups. What is clear though is that the models stabilised once these two variables were removed. Interactions between our primary variables and year of UKCAT examination were tested using Wald statistics and were dropped from the models if not significant at the 5% level. Nested models were compared using information criteria such as the log-likelihood statistics, Akaike's information criteria and Schwarz's Bayesian information criteria.
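For readers less familiar with multilevel models, a generic two-level specification of the kind described above is sketched below. The notation is an assumption made for illustration: the text does not name the grouping variable or say which predictors were given random slopes, so a grouping unit j (for example, medical school) is assumed and a random slope is shown on the UKCAT score only. Here i indexes students, x collects the adjustment variables, k is the number of estimated parameters, n the number of observations and L the maximised likelihood.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{UKCAT}_{ij}
          + \boldsymbol{\beta}_2^{\top}\mathbf{x}_{ij}
          + u_{0j} + u_{1j}\,\mathrm{UKCAT}_{ij} + \varepsilon_{ij}, \\
(u_{0j},\, u_{1j})^{\top} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_u),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\mathrm{AIC} &= -2\ln L + 2k,
\qquad \mathrm{BIC} = -2\ln L + k\ln n .
\end{aligned}
```

Setting the random slope to zero gives the random-intercept model, and dropping both random terms gives the fixed-effects model that was fitted first. Lower AIC or BIC values favour a model when nested specifications are compared.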
The matching exercise (between UKFPO data and UKCAT data) succeeded in linking 6804 (82%) of UKFPO records. Where candidates were graduating from a UKCAT Consortium school, the match rate was 83%; where candidates were graduating from a non-Consortium school, the rate was 79%. However, UKCAT and UKFPO results were available for only 6294 UK medical students who started their final year of medical school in 2012, representing 80.6% of the total population of that year group of UK medical school graduates.
Table 1 shows the demographic profile of the sample. Most students were from the UK (n=5648, 92.0%). A majority were female (56%) and a majority were Caucasian (67.7%). Just under a third of students had attended a fee-paying school (28.3%, compared with the UK average of 7%). A majority of the graduating medical students were from higher SEC backgrounds (59.5% from IMD 1 or 2).
Table 2 provides an overview of candidate performance on the UKCAT cognitive tests, showing the domain and total scores. These were conducted in 2006, 2007 or 2008 depending on when the student entered medical school (see earlier).
In terms of outcome measures, as would be expected in a decile system such as the EPM, the percentage of graduating students within each decile per school was relatively constant (varying between 7.8 and 10.7). The median total EPM was 41.0 (possible range 34–50; IQR 38.0–44.0). The median SJT score was 40.9 (possible range 0–50; IQR 38.7–42.9) and the median total UKFPO score was 81.9 (possible range 34–100; IQR 77.7–85.9). Approximately one-third (32.7%) of the sample had no additional EPM points, while 53.4% (n=3363) gained three or four points, which indicates they had either intercalated or entered medicine as an Honours graduate. Most (68.1%) did not gain any points for publications, while 23.3% gained one point and 8.3% gained two points.
The relationships between demographic factors, UKCAT (cognitive) scores and UKFPO (EPM and SJT) scores are presented in table 3. EPM deciles are ranked in ascending order of magnitude (10 being the highest).
There were highly significant and strong correlations between UKCAT cognitive total and the individual UKCAT domain scores (r=0.64–0.71 (not shown in tabular form)) as previously found in other studies.16 ,18 ,22 EPM decile and total score showed very little correlation with any of the UKCAT cognitive tests or total. SJT showed a weak correlation with total UKCAT (r=0.208), also showing a weak correlation with the VR domain of the UKCAT (r=0.216; table 4).
Contextually, UKCAT score is out of 3600 with a range of scores from 1630 to 3380 in our data (a spread of 1750 points). An increase of 333, 500, 333 and 167 in the total UKCAT cognitive score increases the average SJT, EPM decile, EPM total and UKFPO scores by one mark, respectively. In other words, every unit increase in UKCAT score results in a 0.006 of a unit increase in UKFPO—this represents a difference of 10.5 points on UKFPO between the highest and lowest UKCAT score, with the range of 1750. Small differences in UKCAT points translate into meaningful prediction in terms of the four outcomes of SJT, EPM decile, EPM total and UKFPO score.
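The arithmetic behind these figures can be reproduced in a few lines. The Python sketch below is a back-of-envelope check using only the rounded numbers quoted in the text; the variable and function names are hypothetical, and the calculation assumes a simple linear relationship across the whole observed UKCAT range.

```python
# Figures quoted in the text: UKCAT points needed to raise each outcome by one mark.
UKCAT_POINTS_PER_MARK = {
    "SJT": 333,
    "EPM decile": 500,
    "EPM total": 333,
    "UKFPO total": 167,
}

# Observed UKCAT range in the study data.
UKCAT_LOWEST = 1630
UKCAT_HIGHEST = 3380


def marks_per_ukcat_point(points_per_mark: float) -> float:
    """Slope implied by 'N UKCAT points per one outcome mark'."""
    return 1.0 / points_per_mark


def implied_gap(points_per_mark: float,
                lowest: int = UKCAT_LOWEST,
                highest: int = UKCAT_HIGHEST) -> float:
    """Predicted outcome difference between the highest and lowest UKCAT score."""
    return (highest - lowest) / points_per_mark


if __name__ == "__main__":
    for outcome, points in UKCAT_POINTS_PER_MARK.items():
        print(f"{outcome}: slope={marks_per_ukcat_point(points):.3f}, "
              f"gap={implied_gap(points):.1f}")
```

For the UKFPO total, this gives a slope of about 0.006 marks per UKCAT point and a gap of about 10.5 marks across the 1750-point range, in line with the figures in the text.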
Table 5 shows that the UKCAT cognitive total score, the cognitive domain scores (particularly VR, and with the exception of AR for the SJT), more affluent IMD quintile, female gender, increasing age and the year in which the UKCAT was attempted were significantly and positively associated with all four outcome measures (the EPM total, EPM decile, SJT and UKFPO total score). On the other hand, those who attended a fee-paying school did significantly worse, on average, than those who attended non-fee-paying schools. This weaker performance was particularly apparent on the total EPM and hence the UKFPO total score.
Each increase in IMD quintile (where higher quintiles indicate lower socioeconomic class) gives 0.357 fewer UKFPO points, an average difference of 1.43 UKFPO points between the highest and lowest IMD quintiles after correcting for other variables. For EPM decile, total EPM and SJT, each increase in IMD quintile gives 0.18, 0.23 and 0.13 fewer points, respectively.
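The difference between the extreme quintiles follows from the per-quintile coefficient, because quintiles 1 and 5 are four steps apart. As a worked check using the rounded figures from the text (the symbol Δ is introduced here for illustration):

```latex
\Delta_{\mathrm{UKFPO}} = 0.357 \times (5 - 1) = 1.428 \approx 1.43 \text{ points}
```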
Increasing age conferred an advantage, with an increase in UKFPO of one point for each additional 6 years of age. Performance also differed according to the year in which students sat the UKCAT, with those sitting in 2008 doing better than those in earlier years, most notably in UKFPO with a difference of 3.5 marks between 2007 and 2008.
This is the first study examining the predictive validity of the UKCAT and the impact of participants' sociodemographic characteristics against national performance indicators on exit from medical school. Furthermore, it does so in a dataset representing 82% of the total population of the 2013 cohort of UK medical school final year students.
We have shown that the UKCAT—as a total score and in terms of the four cognitive subtests—has small but significant predictive validity of performance throughout medical school (indicated by the EPM). The UKCAT total score and three of the four cognitive subtests (not AR) also have small but significant predictive validity of performance on the SJT. Both of these outcome measures are used as the basis for selection to the next stage of medical training, the UKFPO. As the UKFPO selection process includes a medical school academic performance measure, this represents the first national comparison of the relationship between UKCAT scores and medical school outcomes, and hence is akin to the numerous studies investigating the predictive validity of medical school selection tests in countries with national licensure examinations, such as those exploring the relationship between performance on the US Medical College Admission Test and US Medical Licensing Examination (USMLE) steps.15 ,34 ,35
It is crucial that any selection test is fair and adds value to the process. The SJT was designed to measure the attribute domains of commitment to professionalism, coping with pressure, effective communication, patient focus and working within teams as a first year doctor (http://www.isfp.org.uk), and so complement the academic ranking.30 ,36 Given this, should we have expected there to be a correlation between the UKCAT, an aptitude test, and the SJT, a test of the expression of job-specific personality traits in hypothetical situations? On the surface it would seem not, yet there is evidence that the ‘big five’ personality factors correlate with academic performance at medical school33 and that personal attributes such as motivation, resilience, perseverance and social and communication skills are associated with positive academic and work-related outcomes for young people (see Gutman and Schoon37 for a recent review). The relationships between these different types of selection tool require further examination.
The patterns seen in the data align with the findings from earlier studies. We found that the UKCAT correlates with academic performance, which in this study is assessed by the EPM.17–21 Moreover, older students, female students and white students all perform better at medical school, with all effects being significant after taking educational attainment and UKCAT scores into account.16 ,17 ,36 ,38–43 The reasons for this differential performance of medical students remain unclear.
Our analysis adds that those living in more deprived areas on application (indicated by higher IMD quintile) also had significantly weaker performance on the SJT. An inverse pattern was seen for the type of secondary school attended. Those attending fee-paying schools performed significantly more weakly on the EPM decile, the EPM total and the total UKFPO score, but not the SJT, than those attending other types of school. (Note that being in receipt of a UKCAT bursary was not significant, but the numbers were so small (1.8%) that this affected the statistical analysis.) SJTs have been proposed as ‘fairer’ than academic and cognitively oriented assessment tools but close investigation of the literature highlights that, to date, studies of SJT fairness have looked mostly at ethnicity and gender.42 ,44–46 With the exception of Lievens et al,47 this is the first study to include measures of social class in the analysis of SJT performance. Our findings do not indicate a clear pattern of performance or that the SJT under study is any less affected by socioeconomic variables than more traditional assessments, but this clearly needs further investigation looking across cohorts and groups.
The weaker EPM and UKFPO performance of those who attended fee-paying schools reflects the UKCAT-12 finding that those students coming from secondary schools where pupils tend to have higher preadmission grades do worse in first year at medical school.21 We are not sure why this is the case but one possibility, according to McManus et al,21 is that higher-achieving secondary schools achieve their results in part by supporting students in ways which do not generalise or transfer to university where such support is not present. Interestingly, these data confirm that secondary school and IMD indicate or measure different factors, and add to the ongoing conversation about problems with all measures of SES.48 ,49 Moreover, given that UKFPO selection is obviously a competitive process, do lower scores on the various outcome measures mean that those medical students graduating from lower socioeconomic backgrounds may be less likely to get their first choice of programme? We need to follow up this cohort to investigate what this difference means in reality.
The strengths of this study are its large scale, the integration of databases held by different agencies and its focus on the time of exit from medical school and selection to the next stage of postgraduate training in the UK. As with any study, there are limitations. We have no way of knowing how the UKCAT was used to select the students in this sample, but it is likely that it was used in different ways by different medical schools.27 The study is restricted to medical school entrants and, therefore, it cannot look more generally at how social and other factors relate to educational attainment and UKCAT performance in the entire set of medical school applicants. Only medical students who were accepted into medical school and were in final year in 2013 are included in this study, but UKCAT routine data indicate that there was no difference in UKCAT scores between the population of those applying to medicine and those who made it to final year, indicating that our sample is representative. The analysis also considers only simple predictor–outcome correlations. The outcome measures used (the UKFPO EPM and SJT) are relatively new. However, we do have some indicators of their psychometric properties: predictive validity studies of the UKFPO SJT are just emerging, with the evidence indicating that higher SJT scores are associated with higher ratings of performance in foundation training.43 While we do not know the reliability of the EPM or its predictive validity for performance after medical school, and this necessitates caution when interpreting our findings, the wider literature indicates that performance in medical school exit examinations can predict performance in examinations during training.21 ,50 ,51 This requires follow-up, which will now be possible via the advent of the UK Medical Education Database (UKMED: http://www.ukmed.ac.uk/).
In conclusion, this large-scale study, the first to link two national databases (UKCAT and UKFPO), has shown that UKCAT is a predictor of medical school outcome. The data provide ongoing supportive evidence for the UKCAT's role in student selection. The conflicting relationships of socioeconomic contextual measures (area and school) with outcome add to wider debates about the limitations of these measures, and indicate the need for further research exploring how best to widen access to medicine.
The authors thank the UKCAT Research Group for funding this independent evaluation and thank Rachel Greatrix of the UKCAT Consortium for her support throughout this project, and their feedback on the draft paper. The authors also thank the UKFPO for data provision and their support throughout. They also thank Professor Amanda Lee and Katie Wilde for their input into the application for funding, and ongoing support.
Contributors JAC and RKM wrote the funding bid. SN advised on the nature of the data. RKM managed the data and carried out the preliminary data analysis under the supervision of DA. DA advised on all the statistical analysis and carried out the multivariate analysis. JAC wrote the first draft of the Introduction and Methods sections of this paper. RKM and DA wrote the first draft of the Results section, RKM and SN wrote the first draft of the Discussion and Conclusions sections. JAC edited the drafts. All authors reviewed and agreed the final draft of the paper.
Funding United Kingdom Clinical Aptitude Test Consortium (grant number RG10984-10).
Competing interests This study addressed a research question posed and funded by UKCAT, of which SN was Chair. JAC is a member of the UKCAT Research Committee.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.