Table 2

Discrimination of three models between cases and controls

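| Classifier | AUC (codes only) | Sensitivity (codes only) | Specificity (codes only) | PPV at 7.1% prevalence (codes only) | AUC (codes and keywords) | Sensitivity (codes and keywords) | Specificity (codes and keywords) | PPV at 7.1% prevalence (codes and keywords) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Random forest | 0.90 | 0.77 | 0.92 | 0.46 | 0.94 | 0.85 | 0.92 | 0.46 |
| Logistic regression | 0.90 | 0.76 | 0.93 | 0.49 | 0.94 | 0.84 | 0.93 | 0.51 |
| Naïve Bayes | 0.87 | 0.78 | 0.91 | 0.45 | 0.90 | 0.80 | 0.91 | 0.43 |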
AUC, area under the receiver operating characteristic curve; PPV, positive predictive value.