Table 4

Validation statistics for the QAdmissions prediction algorithm in the QResearch and CPRD validation cohorts using (a) the score calculated using the GP-HES-linked data and (b) the score calculated using the GP data alone

	QResearch validation cohort		CPRD validation cohort
	HES-GP-linked data	GP data alone	HES-GP-linked data	GP data alone
Women
ROC statistic	0.773 (0.771 to 0.774)	0.764 (0.762 to 0.766)	0.771 (0.770 to 0.773)	0.764 (0.763 to 0.766)
R² (%)	40.6 (40.2 to 40.9)	37.3 (37.0 to 37.8)	40.5 (40.2 to 40.8)	37.6 (37.3 to 37.9)
D statistic	1.69 (1.68 to 1.70)	1.58 (1.57 to 1.59)	1.69 (1.68 to 1.70)	1.59 (1.58 to 1.60)
Men
ROC statistic	0.776 (0.774 to 0.778)	0.769 (0.767 to 0.771)	0.772 (0.771 to 0.774)	0.767 (0.765 to 0.768)
R² (%)	42.6 (42.2 to 42.9)	39.5 (39.1 to 39.9)	41.9 (41.6 to 42.2)	39.2 (38.9 to 39.5)
D statistic	1.76 (1.75 to 1.78)	1.65 (1.64 to 1.67)	1.74 (1.73 to 1.75)	1.64 (1.63 to 1.65)

CPRD, Clinical Practice Research Datalink; HES-GP, hospital episode statistics-general practitioner; ROC, receiver operating characteristic
Notes on understanding validation statistics: Discrimination is the ability of the risk prediction model to differentiate between patients who experience an admission event during the study and those who do not. It is quantified by the area under the receiver operating characteristic (ROC) curve, where a value of 1 represents perfect discrimination and a value of 0.5 represents discrimination no better than chance.
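As an illustration only, and not the code used in the study, the ROC statistic can be read as the proportion of (event, non-event) patient pairs in which the patient who had the event received the higher predicted risk, with ties counted as one half. The sketch below computes it this way; the function name `roc_auc` and the five example patients are hypothetical, and it treats the outcome as a simple yes/no event, ignoring censoring.

```python
def roc_auc(scores, events):
    """Area under the ROC curve by pairwise comparison.

    scores: predicted risks (higher means higher predicted risk)
    events: 1 if the patient had an admission event, 0 otherwise
    Ties between an event and a non-event score count as 0.5.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    # Count the pairs in which the event patient has the higher score.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: five patients, two of whom were admitted.
scores = [0.05, 0.10, 0.20, 0.40, 0.80]
events = [0, 0, 1, 0, 1]
auc = roc_auc(scores, events)  # 5 of the 6 pairs are ordered correctly
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of patients, so for cohorts of the size validated here a rank-based calculation would be needed in practice; the definition is the same.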
The D statistic is also a measure of discrimination, one that is specific to censored survival data. As with the ROC statistic, higher values indicate better discrimination.
R² is another measure specific to censored survival data; it measures explained variation, and higher values indicate that more variation is explained.
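The D statistic and R² values in the table move together, which would be expected if R² is the Royston and Sauerbrei measure derived from D. That is an assumption: the notes do not name the R² definition. Under it, R² = (D²/κ²) / (π²/6 + D²/κ²), with κ² = 8/π. The sketch below applies this formula to some of the tabulated D values; the function name `r2_from_d` is hypothetical.

```python
import math

KAPPA_SQ = 8 / math.pi        # kappa squared, scaling constant for D
SIGMA_SQ = math.pi ** 2 / 6   # error variance term assumed for a Cox model


def r2_from_d(d):
    """Explained variation (as a proportion) implied by a D statistic.

    Assumes the Royston-Sauerbrei R2_D definition; this is not stated
    in the table notes.
    """
    scaled = d * d / KAPPA_SQ
    return scaled / (SIGMA_SQ + scaled)


# D statistics for women in the QResearch validation cohort (Table 4).
d_linked = 1.69    # HES-GP-linked data, reported R2 40.6%
d_gp_only = 1.58   # GP data alone, reported R2 37.3%

for label, d in [("linked", d_linked), ("GP alone", d_gp_only)]:
    print(f"{label}: D = {d:.2f}, implied R2 = {100 * r2_from_d(d):.1f}%")
```

Because D is reported to only two decimal places, the implied values agree with the tabulated R² to within about 0.1 percentage points rather than exactly.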