Performance of the final model* in the training and test datasets
Type of dataset† | AROC (95% CI) | HL χ² statistic | HL χ² P value | Sensitivity (%) | Specificity (%) | PPV (%) |
Training dataset (n=1408) | 0.71 (0.68 to 0.73) | 3.87 | 0.868 | 64 | 63 | 64 |
Test dataset 1 (n=478) | 0.67 (0.62 to 0.71) | 6.00 | 0.647 | 60 | 59 | 60 |
Test dataset 2 (n=467) | 0.70 (0.66 to 0.75) | 6.09 | 0.637 | 63 | 67 | 63 |
Test dataset 3 (n=474) | 0.70 (0.65 to 0.74) | 3.56 | 0.895 | 62 | 63 | 62 |
Test dataset 4 (n=481) | 0.69 (0.64 to 0.73) | 3.26 | 0.917 | 62 | 64 | 62 |
Test dataset 5 (n=507) | 0.66 (0.60 to 0.70) | 5.31 | 0.724 | 60 | 58 | 60 |
*Final model: age, self-assessed health, and history of depression or anxiety.
†The number in each dataset is reduced because observations with structural zeros, and variables significantly correlated with EOSS≥2, were removed from candidate diagnostic modelling.
AROC, area under the receiver operating characteristic curve; HL, Hosmer–Lemeshow; PPV, positive predictive value.
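For readers less familiar with the classification measures in the table, the sketch below shows how sensitivity, specificity and PPV are conventionally derived from the counts of a 2×2 confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study data, and the source does not state the classification threshold used.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV) as percentages.

    tp, fp, tn, fn are the true-positive, false-positive,
    true-negative and false-negative counts of a 2x2 confusion matrix.
    """
    if (tp + fn) == 0 or (tn + fp) == 0 or (tp + fp) == 0:
        raise ValueError("each denominator must be non-zero")
    sensitivity = 100.0 * tp / (tp + fn)  # share of true cases detected
    specificity = 100.0 * tn / (tn + fp)  # share of non-cases correctly ruled out
    ppv = 100.0 * tp / (tp + fp)          # share of positive predictions that are correct
    return sensitivity, specificity, ppv


# Hypothetical counts for illustration only (not from the study).
sens, spec, ppv = diagnostic_metrics(tp=30, fp=10, tn=40, fn=20)
print(f"Sensitivity {sens:.0f}%, specificity {spec:.0f}%, PPV {ppv:.0f}%")
```

Unlike sensitivity and specificity, PPV depends on the prevalence of the outcome in the dataset, so it is best compared across datasets with similar outcome frequencies.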